Category: Uncategorized

Thread Synchronization With Aspects

This article was originally published at tech.blinemedical.com

Aspect-oriented programming is an interesting way to decouple common method level logic into localized methods that can be applied on build. For C#, PostSharp is a great tool that does the heavy lifting of the MSIL rewrites to inject itself in and around your methods based on method tagging with attributes. PostSharp’s offerings are split up into free aspects and pro aspects so it makes diving into aspect-oriented programming easy since you can get a lot done with their free offerings.

One of their free aspects, the method interception aspect, lets you control how a method gets invoked. Using this capability, my general idea was to expose some sort of lock and wrap the method invocation automatically in lock statement using a shared object. This way, we can manage thread synchronization using aspects.

Managing thread synchronization with aspects isn’t a new idea: the PostSharp site already has an example of thread synchronization. However, they are using a pro feature aspect that allows them to auto-implement a new interface for tagged classes. For the purposes of my example, we can do the same thing without using the pro feature and simultaneously add a little extra functionality.

There are two things I wanted to accomplish. One was to simplify local method locking (basically what the PostSharp example solves), and the second was to facilitate locking of objects across multiple files and namespace boundaries. You can imagine a situation where you have two or more singletons who work on a shared resource. These objects need some sort of shared lock reference to synchronize on, which means you need to expose the synchronized object between all the classes. Not only does this tie classes together, but it can also get messy and error-prone as your application grows.

First, I’ve defined an interface that exposes a basic lock. Implementing the interface is optional as you’ll see later.

public interface IAspectLock
{
    object Lock { get; }
}

Next we have the actual aspect we’ll be tagging methods with.

[Serializable]
public class Synchronize : MethodInterceptionAspect
{
    private static readonly object FlyweightLock = new object();

    private static readonly Dictionary<string, object> LocksByName = new Dictionary<string, object>();

    private String LockName { get; set; }

    /// <summary>
    /// Constructor when using a shared lock by name
    /// </summary>
    /// <param name="lockName"></param>
    public Synchronize(String lockName)
    {
        LockName = lockName;
    }

    /// <summary>
    /// Constructor for when an object implements IAspectLock
    /// </summary>
    public Synchronize()
    {

    }

    public override void OnInvoke(MethodInterceptionArgs args)
    {
        object locker;

        if (String.IsNullOrEmpty(LockName))
        {
            var aspectLockObject = args.Instance as IAspectLock;

            if (aspectLockObject != null)
            {
                locker = aspectLockObject.Lock;
            }
            else
            {
                throw new Exception(String.Format("Method {0} didn't define a lock name nor implement IAspectLock", args.Method.Name));
            }
        }
        else
        {
            lock (FlyweightLock)
            {
                if (!LocksByName.TryGetValue(LockName, out locker))
                {
                    locker = new object();
                    LocksByName[LockName] = locker;
                }
            }
        }

        lock (locker)
        {
            args.Proceed();
        }
    }
}

The attribute can either take a string representing the name of the global lock we want to use, or, if none is provided, we can test to see if the instance implements our special interface and use its lock. When an object implements IAspectLock the code path is simple: get the lock from the object and use it on the method.

The second code path, when you use global lock name, lets you lock across the entire application without having to tie classes together, keeping things clean and decoupled.

For the scenario where a global lock name was defined, I used a static dictionary to keep track of the locks and corresponding reference objects to lock on based on name. This way I can maximize throughput by using a flyweight container: lock first on the dictionary just to get the lock I want, then lock on the value retrieved. The locking of the dictionary will always be fast and shouldn’t be contended for that often. Uncontested locks are tested for using spinlock semantics so they are usually extremely quick. Once you have the actual lock you want to use for this function, you can call args.Proceed() which will actually invoke the tagged method.

Just to be sure this all works, I wrote a unit test to make sure the attribute worked as expected. The test spawns 10,000 threads which will each loop 100,000 times and increment the _syncTest integer. The idea is to introduce a race condition. Given enough threads and enough work, some of those threads won’t get the updated value of the integer and won’t actually increment it. For example, at some point both threads may think _syncTest is 134, and both will increment to 135. If it was synchronized, the value, after two increments, should be 136. Since race conditions are timing-dependent we want to make the unit test stressful to try and maximize the probability that this would happen. Theoretically, we could run this test and never get the race condition we’re expecting, since that’s by definition a race condition (non-deterministic results). However, on my machine, I was able to consistently reproduce the expected failure conditions.

private int _syncTest = 0;
private const int ThreadCount = 10000;
private const int IterationCount = 100000;

[Test]
public void TestSynchro()
{
    var threads = new List<Thread>();
    for (int i = 0; i < ThreadCount; i++)
    {
        threads.Add(ThreadUtil.Start("SyncTester" + i, SynchroMethod));
    }

    threads.ForEach(t=>t.Join());

    Assert.True(_syncTest == ThreadCount * IterationCount,
            String.Format("Expected synchronized value to be {0} but was {1}", ThreadCount * IterationCount, _syncTest));
}

[GlobalSynchronize("SynchroMethodTest")]
private void SynchroMethod()
{
    for (int i = 0; i < IterationCount; i++)
    {
        _syncTest++;
    }
}

When the method doesn’t have the attribute we get an NUnit failure like

  Expected synchornized value to be 1000000000 but was 630198141
  Expected: True
  But was:  False

   at NUnit.Framework.Assert.That(Object actual, IResolveConstraint expression, String message, Object[] args)
   at NUnit.Framework.Assert.True(Boolean condition, String message)
   at AspectTests.TestSynchro() in AspectTests.cs: line 35

Showing the race condition that we expected did happen (the value will change each time). When we have the method synchronized, our test passes.

IxD 2013: Rhythm, Flow, and Style

This article was originally published at tech.blinemedical.com

Today Carlo and I listened to a 45 minute presentation by Peter Stahl called “Rhythm, Flow and Style” which discussed designing flow and rhythm into applications. Peter started with the observation that the world is full of rhythms, but not just what people think of (namely music). Rhythms exist in every part of life, chopping vegetables, walking outside, involving in conversations, filling out forms, navigating a website, etc. All of them involve actions, pauses, and repetitions. According to Stahl there are two types of rhythm: iterative rhythm, which is rhythm of the user (engaging with the application) and there is also motivic rhythm, which is rhythm within the application. But to get rhythm you need flow.

Flow, according to Stahl’s presentation has several dimensions:

  • Known goals, with known progress
  • Perceived balance of challenge and skill
  • Sense of control
  • Focused concentration
  • Loss of self conciouslyness: becoming one with the activity
  • Time distortion
  • Self rewarding experience

and it’s possible to induce it by offering clear goals, achievements, progressive challenges, progress tracking, and obvious next steps. The ultimate intent is to keep the user engaged and challenged, where they can easily lose themselves in the application. You can see these kinds of theories being applied to sites like StackOverflow, LinkedIn, or OKCupid. They have percentages indicating progress, helpful hints on how to get to the next step, progressively more advanced information, etc. Flow engages the user and lets the application open up, giving it a sense of movement over time.

Flow, however, isn’t always just about content and progress. It’s also a matter of visual effects and transitions. Transition effects can influence the perception of an application: fast and jarring can give a sense of precision and automation, whereas slow, more gentle fades, induces a sense of calmness.

On top of rhythm and flow is style, which is used to direct the flow and rhythm. Stahl called it rhythmic style and it is affected by visual frequency, speed, size and distance, and special effects. There are lots of different kinds of rhythmic style choices:

  • Dazzling or engaging?
  • Zippy or comfortable?
  • Dramatic or responsive?
  • Single-use or long-term?

But the choice of style within visual components depends on content branding and audience.

Stahl compared Samsung vs Vizio (though both have changed their sites since the slides shown in presentation). Samsung, at the time, used gentle fade transitions, whereas Vizio used a paper swipe effect. He argued that Samsung’s felt more human but Vizio’s was reminiscent of a copy machine: inhuman but precise. Both have advantages and with small motivic rhythm choices you can induce different emotions in the user.

Of course though, flow rests with the user, so Stahl stressed to make sure to include the user in user interaction testing. Stahl showed a four way split image showing two images of the user using the application, as well as the screen being viewed (the last screen I think was a description of the test):

stahlPresentation

Finally, Stahl ended with a resonating quote that I think captures the design goals of any application:

It’s all about the choregraphy of people’s attention. Attention is like water. It flows. It’s liquid. You create channels to divert it, and you hope that it flows the right way”.

I really enjoyed his presentation. Here are Stahls slides https://www.box.com/s/ms8ssh4nc3zl6mx0n38w.

IxD 2013 – Production ready CSS workshop

This article was originally published at tech.blinemedical.com

Carlo and I are in Toronto for Ixd 2013 for the week hoping to pick up some interesting info on interaction and design. We were painfully reminded of the need for continual design improvement within the first hour of stepping into Canada. We rented a Hyundai Veloster and for 30 minutes couldn’t figure out how to start the car. For the uninitiated, you have to hold the brake pedal down while pushing the start button.

velocsterStarter

Today the conference started and we attended a workshop called “Sitting in the drivers seat: Designing production level CSS”. Carlo’s been doing html and css for a long time and was interested to hear what the speaker would say, whereas I am less experienced and wanted to see if I could pick up any design tips. We were pleasantly surprised when Jack Moffett started and stressed that we should be focusing on blurring the lines between designer and developer. He quoted Jared Spool who said that designers should stop asking “do I need to code” but instead should ask “will learning code help me be a better designer?“. Moffett stressed using developer oriented tools such as version control–svn or git–wikis to track requirements, diff tools, bug tracking software, some sort of IDE for live editing of html and css, and of course the standard Firebug, Chrome, and even Internet Explorer developer toolsets. Using production tools that developers also use integrates the application design and development processes. With wiki’s and version control you get historical visibility about changes and decisions. This helps designers know when developers want changes, and helps developers know when designers want changes. Conversations and issues can be tracked in bug trackers like JIRA or FogBugz and gives everyone a sense of direction.

Moffett said that a benefit to following developer practices and learning at least some basic html and css structures is that designers can design with a practical implementation in mind. Understanding the limitations of html and css also helps the designer avoid implementation pitfalls and minimizes the amount of design, implement, tweak rounds during application development. Just like an architect should know the structure and the needs of the building crew, if the design is impracticable it doesn’t matter how pretty it is.

After the tool chain introduction Moffett led everyone through several examples of creating reusable CSS stressing seperating content from skin. The content should be able to stay independent, while skin elements (such as font, padding, etc) can be toggled with css. The example that I think the workshop most enjoyed was taking this:

amazonRow.

and refactoring it to this:

amazonCol.

By removing inline css and a few minor content structure tweaks. Now we can toggle between row and column layout using the same html, just with different css. I was a little surprised that Amazon’s site was used for the example – I expected at first that Moffett would show it as an example of what to do, not what to NOT do.

Moffett also worked through an example leveraging CSS inheritance to control visible state. By doing it this way you can avoid direct css style manipulations in javascript and use css classes to define visual behaviors. For example, assume we want to control visiblity of a div that contains some photos lets set up the html and css like so.

Photos aren’t visible:

<div>
  <div class="photos">
    // some photos
  </div>
</div>

Photos are visible:

<div class="showPhotos">
  <div class="photos">
    // some photos
  </div>
</div>

The css for this example would look like this:

.photos {
   display:none;
}

.showPhotos .photos{
   display:block;
}

By default the photos div isn’t visible, however if we added the showPhotos class to the parent, the parents display property will override the photos display setting it to block. By removing the showPhotos class we can hide the element since the photos display element goes back to the default. This way you can control visibility and state by toggling items at the parent level instead of the child level. Carlo also brought up in the seminar that by doing it this way you are setting it up for using css transitions and animations since you can now control state information just with css.

While it may sound trivial to UI developers or designers, it’s always good to ground yourself in how to make something simple and reusable. Even the best developers need reminders.

K-Means Step by Step in F#

This article was originally published at tech.blinemedical.com

Recently we had a discussion on clustering techniques at one of our weekly tech talks and the k-means clustering algorithm came up. K-means is considered one of the simplest unsupervised machine learning techniques and I thought it would be cool to try my hand at an F# implementation of it.

In short, k-means is a way to put data into groups based on distance between nearest neighbors. Technically it works with any dimension but for this post I’ll stick with 1-d data to make things easy.

K-means background

Imagine you have some 1-dimensional data like

1, 2, 5, 14, 17, 19, 20

And you want to group this into two groups. Pretty easily you can see that 1, 2, and 5 go into one group, and 14, 17, 19, 20 should go in another group. But why? For this small data set we intuitively figured out that 1, 2, and 5 are close together, but at the same time far away from 14, 17, 19, and 20, which are also close together. Really what we did was we calculated the centroid of each group and assigned the values to the centroid. A centroid is a fancy way of saying center point of a group. It’s calculated by taking the average of a group.

For example, if we have the group

1, 2, 5

What is its centroid? The answer is

(1 + 2 + 5) / 3 = 2.666

So with k-means you end up with k centroids and a bunch of data grouped to that centroid. It’s grouped to that centroid because that data is closest to that centroid vs any other centroid. To calculate the centroids and grouping k-means uses an iterative process:

  • Pick k random points. So if you are doing a k=2 clustering, just pick 2 random points from your data set. These are going to be our initial centroids. It doesn’t matter if it’s right, or even if the points are the same. The algorithm is going to fine tune things for us. Technically this is called the Forgy Method of initialization. There are a bunch of other ways to initialize your centroids, but Forgy seems to be pretty common.
  • For every point, calculate its distance from the centroid. So lets say we picked points 1 and 2 as our centroids. The distances then look like this:
         Δ1      Δ2
    1    0        1
    2    1        0
    5    4        3
    14   13       12
    17   16       15
    19   18       17
    20   19       18
    
  • Now what we do is assign each data point to its nearest centroid. So, obviously, point 1 is closest to the initial arbitrary point 1 centroid. And you can see that basically everything else is closer to the arbitrary point 2 centroid. So we’re left with a grouping like this now:
    data around centroid 1: 1
    data around centroid 2: 2, 5, 14, 17, 19, 20
    
  • Now, the next step is to calculate new centroids based on these groups. For this example, we can take the average of all the points together to find the new center. So the new centroids become 1 and 12.83. Now we go back to step 2 using the new centroids. If you keep track of the current centroid and the previous centroid, then you can stop the k-means calculation when centroids don’t change between iterations. At this point everything has converged and you’ve got your k clusters

Unfortunately k-means doesn’t always converge, so you can add a convergence delta (i.e. if the centroid between the current and previous iterations hasn’t really changed much, so it’s within some acceptable range) or an iteration limit (only try to cluster 100 times) so you can set bounds on the clustering.

Implementation

The first thing to do for my implementation is to define a structure representing a single data point. I created a DataPoint class that takes a float list which represents the data points dimensions. So, a 1-dimesional data point would have a float list of one element. A 2-d data point has one of two points (x and y), etc. To give credit where credit is due, I took the dimensional array representation from another F# implementation.

type DataPoint(input:float list) =
    member this.Data = input
    member this.Dimensions = List.length input
    override x.Equals(yobj) =
        match yobj with
        | :? DataPoint as y -> (x.Data = y.Data)
        | _ -> false
    static member Distance (d1:DataPoint) (d2:DataPoint) =
                                    if List.length d1.Data <> List.length d2.Data then
                                        0.0
                                    else
                                        let sums = List.fold2(fun acc d1Item d2Item ->
                                                                     acc + (d2Item - d1Item)**2.0
                                                              ) 0.0 d1.Data d2.Data
                                        sqrt(sums)

It has its data, the dimensions, an equality overload (so I can compare data points by their data and not by reference), as well as a static helper method to calculate the euclidean distance between two n-dimensional points. Here I’m using the fold2 method which can fold two lists simultaneously.

Alias data types

For fun, I wanted to use F#’s data aliasing feature, so I created some helper types to make dealing with tuples and other data pairing easier. I’ve defined the following

type Centroid = DataPoint

type Cluster = Centroid * DataPoint List

type Distance = float

type DistanceAroundCentroid = DataPoint * Centroid * Distance

type Clusters = Cluster List

type ClustersAndOldCentroids = Clusters * Centroid list

The only unfortunate thing is you need to explicitly mention aliased types for the type inference to work properly. The reason is because F# will prefer native types to typed aliases (why choose Distance over float unless you are told to), but it would’ve been nice if it had some way to prefer local type aliases in a module.

Centroid assignment

This next major step is to take the current raw data list and what the current centroids are and cluster them. A cluster is a centroid and its associated data points. To make things clearer, I created a helper function distFromCentroid which takes a point, a centroid, and returns a tuple of point * centroid * distance.

let private distFromCentroid pt centroid : DistanceAroundCentroid = (pt, centroid, (DataPoint.Distance centroid pt))

The bulk of the k-means work is in assignCentroids. First, for every data point, we calculate its distance from the current centroid. Then we select the centroid which is closest to our data points and create a list of point * centroid tuples. Once we have a list of point * centroid tuples, we want to group this list by the second value in the tuple: the centroid. Here, I’m leveraging the pipe operator to do the grouping and mapping. First, we group the lists by the centroid. This gives us a list of centroid * (datapoint * centroid) items. But this isn’t quite right yet. We want to end up with a list of centroid * dataPoint list so I passed this temporary list through some other minor filtering and formatting.

(*
    Takes the input list and the current group of centroids.
    Calculates the distance of each point from each centroid
    then assigns the data point to the centroid that is closest.
*)
let private assignCentroids (rawDataList:DataPoint list) (currentCentroids:Clusters) =
        let dataPointsWithCentroid =
            rawDataList
                |> List.map(fun pt ->
                                    let currentPointByEachCentroid = List.map(fun (centroid, _) -> distFromCentroid pt centroid) currentCentroids
                                    let (_, nearestCentroid, _) = List.minBy(fun (_, _, distance) -> distance) currentPointByEachCentroid
                                    (pt, nearestCentroid)
                            )

        (*
            |> Group all the data points by their centroid
            |> Select centroid * dataPointList
            |> For each previous centroid, calculate a new centroid based on the aggregated list
        *)

        List.toSeq dataPointsWithCentroid
            |> Seq.groupBy (fun (_, centr) -> centr)
            |> Seq.map(fun (centr, dataPointsWithCentroid) ->
                                let dataPointList = Seq.toList (Seq.map(fun (pt, _) -> pt) dataPointsWithCentroid)
                                (centr, dataPointList)
                      )
            |> Seq.map(fun (cent, dataPointList) ->
                                let newCent = calculateCentroidForPts dataPointList
                                (newCent, dataPointList)
                      )
            |> Seq.toList

Calculating new centroids

Since we’ve grouped data points by centroid, we now need to calculate what the new centroid for that list is. This is where I employ a disclaimer. For my implementation of n-dimensional centroids I averaged each dimension together to get a new value. I’m not totally sure this is actually valid for arbitrary geometrical shapes (given by n-dimensions), but if it’s not at least this is the only method that needs to be updated.

First I create an empty n-dimension array of all zeros, initialized by the dimensions of the first point in the list (all the dimensions should be the same so it doesn’t matter which point we choose). This is the seed for list for the next fold. I’m not actually going to fold the value into a single element; I’m going to fold all indexes of the data point list together into a new array by adding each dimension together.

For example, if I have two points of two dimensions [1, 2] and [3, 4] I add 1 and 3 together to make the first dimension, then 2 and 4 together to make the second dimension. This gives me an intermediate vector of [4,6]. Then I can average each dimension by the number of data points used, so in our example we used two data points, so the new centroid would be [2, 3]. Since a centroid is the same as a data point (even though one that doesn’t exist in our original raw set), we can return the new centroid as a data point. Having everything treated as data points makes the calculations easier.

(*
    take each float list representing an n-dimesional space for every data point
    and average each dimension together.  I.e. element 1 for each point gets averaged together
    and element 2 gets averaged together.  This new float list is the new nDimesional centroid
*)

let private calculateCentroidForPts (dataPointList:DataPoint List) =
    let firstElem = List.head dataPointList
    let nDimesionalEmptyList = List.init firstElem.Dimensions (fun i-> 0.0)
    let addedDimensions = List.fold(fun acc (dataPoint:DataPoint) -> addListElements acc dataPoint.Data (+))
                                nDimesionalEmptyList
                                dataPointList

    // scale the nDimensional sums by the size
    let newCentroid = List.map(fun pt -> pt/((float)(List.length dataPointList))) addedDimensions
    new DataPoint(newCentroid)

Here is the function to aggregate list elements together

(*  goes through two lists and adds each element together
    i.e. index 1 with index 1, index 2 with index2
    and returns a new list of the items aggregated
*)

let private addListElements list1 list2 operator =
    let rec addListElements' list1 list2 accumulator operator =
        if List.length list1 = 0 then
            accumulator
        else
            let head1 = List.head list1
            let head2 = List.head list2
            addListElements' (List.tail list1) (List.tail list2) ((operator head1 head2)::accumulator) operator
    addListElements' list1 list2 [] operator

I’m using a hidden accumulator so that the function will be tail recursive.

Cluster runner

The above code only does one single cluster iteration though . What we need is a clustering runner that does the clustering iterations. This runner will be responsible for determining when to stop the clustering. There are four functions here:

  • getClusters and getOldCentroids are helper methods to pull data out of the tuples (fst and snd are built in functions)
  • cluster is a basic method that will cluster as many times as necessary until the centroids stop changing
  • clusterWithIterationLimit is the bulk method that takes in the raw data, the clustering amount (k), the max number of clustering iterations, and a convergence delta to avoid infinite loops

For each iteration create a new set of clusters, then test to see if if the centroids are the same. If they are, we’ve converged and we can end. If not, see if the clusters are “close enough” (the convergence delta). The final base case is if we’ve clustered the max iterations then return.

(*
    Start with the input source and the clustering amount
    but also take in a max iteration limit as well as a converge
    delta.

    continue to cluster until the centroids stop changing or
    the iteration limit is reached OR the converge delta is achieved

    Returns a new "Clusters" type representing the centroid
    and all the data points associated with it
*)

let private getClusters (cluster:ClustersAndOldCentroids) = fst cluster
let private getOldCentroids (cluster:ClustersAndOldCentroids) = snd cluster

let clusterWithIterationLimit data k limit delta =
    let initialClusters:Clusters = initialCentroids (List.toArray data) k
                                    |> Seq.toList
                                    |> List.map (fun i -> (i,[]))

    let rec cluster' (clustersWithOldCentroids:ClustersAndOldCentroids) count =
        if count = 0 then
            getClusters clustersWithOldCentroids
        else
            let newClusters = assignCentroids data (getClusters clustersWithOldCentroids)
            let previousCentroids = getOldCentroids clustersWithOldCentroids
            let newCentroids = extractCentroidsFromClusters newClusters

            // clusters didnt change
            if listsEqual newCentroids previousCentroids then
                newClusters
            else if centroidsDeltaMinimzed previousCentroids newCentroids delta then
                newClusters
            else
                cluster' (newClusters, newCentroids) (count - 1)

    cluster' (initialClusters, extractCentroidsFromClusters initialClusters) limit

(*
    Start with the input source and the clustering amount.
    continue to cluster until the centroids stop changing
    Returns a new "Clusters" type representing the centroid
    and all the data points associated with it
*)

let cluster data k = clusterWithIterationLimit data k Int32.MaxValue 0.0

Demo

Using the sample data from the beginning of this post, let’s run it through my k-means clusterer:

module KDataTest

open System
open KMeans

let kClusterValue = 2

let sampleData : KMeans.DataPoint list = [new DataPoint([1.0]);
                                          new DataPoint([2.0]);
                                          new DataPoint([5.0]);
                                          new DataPoint([14.0]);
                                          new DataPoint([17.0]);
                                          new DataPoint([19.0]);
                                          new DataPoint([20.0])]

KMeans.cluster sampleData kClusterValue
    |> Seq.iter(KMeans.displayClusterInfo)

Console.ReadKey() |> ignore

And the output is

Centroid [2.66666666666667], with data points:
[1], [2], [5]

Centroid [17.5], with data points:
[14], [17], [19], [20]

Full source available at our github

Tracing computation expressions

This article was originally published at tech.blinemedical.com

F# has a novel syntax feature called computation expressions, which lets you build complex monadic expressions with minimal syntax. Commonly shied away from, a monad is simple: it’s a function whose input is some state. A monad usually manipulates the state and returns a new state (which can be handed off to another monad).

Monad’s are maybe best known from Haskell, but they exist in scheme, ML, clojure, scala, and even show up in C# and other imperative languages.

While the computation expression syntax is cool, short of the maybe monad, and the baked-in async keyword, I wasn’t sure what I could do with this. Thankfully, I found an interesting post by Luis Diego Fallas who posted an F# code sample leveraging computation expressions to read binary formatted files. However, if you are like me, and trying to better understand the power of computation expressions, tracing through these samples can be difficult. It’s not because they are poorly written, but mostly because they are jumping into more complex usages.

To clarify what’s going on with computation expressions, I wanted to show what is passing through the monad. Computation expressions (also known as monads or workflows) can be tricky because you are overriding the language syntax. So what is a “let!” statement in one workflow is not the same as in another workflow. On top of that statements themselves can return a value, not just have a left hand side assignment. If you feel your brain start to hurt, that’s OK. It will get better.

An example

Let’s start with a small sample to see how information passes through the workflow. This makes it easier to understand what computation expressions are doing:

open System

type State =
    | Current of int * State
    | Terminated

type Builder() =
    member this.Bind(value:int, rest:int->State) =
        State.Current((value, rest(value)))

    member this.Return(returnValue:int) = State.Current(returnValue, State.Terminated)

let builder = new Builder()

let build _ = builder{
                        let! x = 1
                        let! y = 2
                        return 3
                }

let stateChain = build()

let rec formatState (chain:State) =
            match chain with
                | State.Terminated -> Console.WriteLine()
                | State.Current(i, next) -> Console.WriteLine("State: {0}", i)
                                            formatState next

formatState stateChain

Console.ReadKey() |> ignore

Which prints out

State: 1
State: 2
State: 3

Desugaring

Before we trace through what’s happening, lets desugar the expression. If you aren’t familiar with the bang syntax, a let! statement will compile into an execution on the builders Bind function and a return will compile into an execution on the builders Return function. Let’s take the original expression:

let build _ = builder{
                        let! x = 1
                        let! y = 2
                        return 3
                }

And show the same thing but without the syntactic sugar.

let desugared =
    builder.Bind(1, fun next ->
                        let x = next // next = 1 since we took the input value of the bind,
                                     // and passed it to the next lambda

                        let returnedState =
                            builder.Bind(2, fun next2 ->
                                                let y = next2 // next2 is the value 2 here

                                                let terminatedState = builder.Return(3)

                                                //terminated state is now
                                                //State.Current(3, State.Terminated)

                                                terminatedState
                                            )

                        // the state here is
                        // State.Current(2, State.Current(3, State.Terminated))
                        returnedState
                    )

The important thing to understand here is how the let! statement is deconstructed. The right hand side is the value input to the bind. Everything below the let! statement (and including the left hand side of the assignment), is inside of the bind lambda (passed to the rest argument of the bind function). The first thing the lambda does is actually apply the left hand side assignment with the input from the bind. In this case, the first let! statement assigns x = 1. Then it executes the remaining function, being the other let! and return statements.

Tracing it through

In the original sample, I’ve defined a discriminated union called State representing the current state. This union has two types. One, called Current, contains an integer as well as a link to the next state in the form of a tuple. The other, Terminated, is a value that we can use as a terminator for our state link. The example is really just for demonstration, since by being able to capture the state it’s easier to understand how computation expressions work; it gives us a sense of where the monad has been.

Let’s take this one step at a time, with an even simpler example based on the above code. It’s important to understand the deconstruction. The compiler will translate our computation expressions into invocations on the builder.

builderDeconstruction1

To desguar it we take the left hand side and everything after

builderDeconstruction2

And move it to a lambda. This lambda is what is going to be passed as the rest argument to the builder’s bind function

builderDeconstruction3

The right hand side is going to be applied to the value argument of the bind function

builderDeconstruction4

Go back and look at how we’ve defined the bind function:

member this.Bind(value:int, rest:int->State) =
        State.Current((value, rest(value)))

Here the value argument is 1. The second argument, rest, is the lambda. The lambda is going to have to return a State union since State.Current expects an integer, State tuple.

When we execute the lambda inside the bind we pass the value (1) to the lambda, but we’ve also captured the current value. This means that this bind is going to return:

State.Current(1, rest(1))

So here we apply the value to the function

builderDeconstruction5

This is where the left hand side statement (x) now gets assigned.

builderDeconstruction6

Now what about

builder.Return(2)

Remember we defined the return function to return

member this.Return(returnValue:int) = State.Current(returnValue, State.Terminated)

So with our simplified example this will return

State.Current(2, State.Terminated)

The previous lambda now returns that same value, since that’s the last line of the statement. So we’re back now to the original bind function:

member this.Bind(value:int, rest:int->State) =
        State.Current((1, rest(1)))

But rest(1) returns State.Current(2, State.Terminated). Our final builders return value, in this example, is

State.Current(1, State.Current(2, State.Terminated))

All the computation builder syntax is doing is just sugaring our statements up to give us these broken up functions.

Back to the original sample. We added a second let! statement in there:

let build _ = builder{
                        let! x = 1
                        let! y = 2
                        return 3
                }

Now, hopefully, you should be able to see how the final return from the computation expression is a state object representing what happened in the monad:

State.Current(1, State.Current(2, State.Current(3, State.Terminated)))

Combine and Yield

Computation expressions have more than just let! and return statements though. Once you get used to tracing through and thinking about the computation builder, it becomes easier to start writing workflows. Just for kicks, I wanted to see if I could write a computation expression to evaluate a basic arithmetic expression. Here I’m using partial functions and the Combine property of the builder to build out the expression. If you wanted to, you can even use computation expressions within computation expressions. There’s nothing keeping you from doing that.

In general, there is a bunch of reserved syntax that maps to specific builder functions. The msdn on computation syntax has these all defined.

open System

type BuildTest() =
    member this.Combine(currentStatement, value) = currentStatement(value)
    member this.Return(value) = value
    member this.Yield(item) = item
    member this.Delay(item) = item()

let builder = new BuildTest()

type math() =
    member this.add x y = x + y
    member this.mult x y = x * y

let m = new math()

let build _ = builder{
                yield m.mult 2 // 2 * (1 + (2 + (3 + 0))
                yield m.add 1  // 1 + (2 + (3 + 0)
                yield m.add 2  // 2 + (3 + 0)
                yield m.add 3  // (3 + 0)
                return 0       // 0
            }

let run = build()

let monader = printf "%s" ("got " + run.ToString())

Console.ReadKey() |> ignore

This snippet evaluates to 12.

You can even add precedence by evaluating a computation expression within the computation expressions

builder{
    yield m.mult 2 // 2 * (1 + (8 + 0))
    yield m.add 1  // 1 + (8 + 0)

    let parenth = builder{
                        yield m.mult 4 // 4 * 2
                        return 2  // 2
                    }

    yield m.add parenth // 8 + 0

    return 0       // 0
}

Which evaluates to 18.

Tracing Combine and Yield

Just like before, there’s a bunch of magic going on here, so it’s easier if you follow along with the desugared version of the original arithmetic expression below.

let desugared = builder.Delay(
                fun () -> builder.Combine(builder.Yield(m.mult 2),
                    builder.Delay(
                        fun() -> builder.Combine(builder.Yield(m.add 1),
                                builder.Delay(
                                    fun() -> builder.Combine(builder.Yield(m.add 2),
                                            builder.Delay(
                                                fun() -> builder.Combine(builder.Yield(m.add 3),
                                                    builder.Delay(
                                                        fun() -> builder.Return(0))))))))))

Each monadic function is wrapped in a Delay, which promptly executes it. Look at the builder’s delay declaration – it takes a function and executes it.

In our builder, the Yield just returns the same value. It doesn’t do much but we needed to implement it to use the computation expression syntax.

What we pass to the delay is an anonymous function that has a combine statement. Combines take two things and produce a third. Here, we are passing the current partial function as the first argument (via the yield), and the value we want to use to evaluate this partial function as the second argument. However, the second argument isn’t actually evaluated till the end. The combine will then apply the second argument (an integer) to the first argument (a partial function that takes an integer).

For the basic arithmetic example, the final delay function returns 0. You can think of this as a “seed.” If you think of it like a fold operation, this is very similar. When we finally return the seed, we bubble each evaluated expression back up the stack (starting with 0), so read the desugared version from the bottom up. In this way, we are executing the current curried statement with the previous statements evaluated value in the Combine method of the builder. Not the most practical application, but I thought it was a fun exercise.

If you are confused why this example’s desguaring contains the Delay method and the original example didn’t, it’s because the sugaring happens differently depending which builder constructs you use.

Under the hood

When you decompile a computation expression, each monadic function gets compiled into it’s own class representing a monad. In our arithmetic operation example, this is a Combine, Yield, Delay trio. It’s not easy to read since the function names have been mangled, but you can see the general pattern here (formatting added).

[Serializable]
internal class runu004089 : FSharpFunc<Unit, int>
{
    internal runu004089()
    {
    }

    public override int Invoke(Unit unitVar0)
    {
        return ExpresionsTest.builder.Combine<int, int>(
        ExpresionsTest.builder.Yield<FSharpFunc<int, int>>(
            (FSharpFunc<int, int>) new ExpresionsTest.runu004089u002D1(2, ExpresionsTest.m)),
                ExpresionsTest.builder.Delay<int>(
                    (FSharpFunc<Unit, int>) new ExpresionsTest.runu004091u002D2()));
    }
}

This decompliation represents the following sub-block.

builder.Combine(
    builder.Yield(m.add 2), builder.Delay( (*next function*) )
)

Notice in the decompiled block that the class name is runu004089 and the executable expression is compiled into an Invoke that returns an int. The decompiled assembly will actually be littered with these classes with mangled names, following a naming format of the target variable name (run) and an identifier (u004089). You can always decompile the computation expression to get a sense for how it’s been desugared.

Conclusion

I said in the beginning that monads are simple, but I’ll admit that I lied. Monads are tricky, there’s no denying that. Maybe that’s why there is no shortage of blog posts trying to explain the monad over and over again. But, in the end, once you wrap your head around it, I think computation expression syntax is a cool way of using the concept of a monad by decoupling what something is defined to do, vs how it’s actually executed.

I highly suggest running the examples and actually stepping through them bit by bit if you are having trouble following what is happening. Being able to see a desugared version of the code and using a debugger to inspect locals while stepping through examples makes it a lot clearer to see whats happening.

More reading

If you are curious here are some links to further reading explaining monads and computation expressions (such as the F# wikibook). Don Syme also has a few posts explaining things really well that are definitely worth checking out.