Category: Snippets

F# utilities in Haskell

Slowly I am getting more familiar with Haskell, but there are some things that really irk me. For example, a lot of the point-free functions compose right to left instead of left to right. Coming from an F# background this drives me nuts. I want to see what happens first, not last.

For example, if we wanted to do (x+2)+3 in F#

let chained = (+) 2 >> (+) 3

Compare to haskell:

chained :: Integer -> Integer
chained = (+3) . (+2) 

In Haskell, (+2) is applied to the argument first, and that result is then given to (+3). In F# you read left to right, which I think is more readable.

Anyways, this is an easy problem to solve by defining a custom infix operator

(>>>) :: (a -> b) -> (b -> c) -> (a -> c)
(>>>) a b = b . a

Now we can do the same combinations as in F#.

Another thing that bugs me is the pipe operator in Haskell. I want to be able to pipe using the |> operator left to right (subject followed by verb) instead of the way Haskell does it with $, which is verb followed by subject.

Again, easy fix though

(|>) :: t -> (t -> b) -> b
(|>) a b = b $ a

Now we can do 1 |> (+1) and get 2. Fun! (As it happens, base already ships both of these: Control.Arrow exports (>>>) and Data.Function exports (&), which behaves like |>.)
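For comparison outside of the ML family, here's a minimal Python sketch of the same two ideas (compose_lr and pipe are made-up names for this illustration):

```python
from functools import reduce

def compose_lr(*fns):
    """Left-to-right composition, in the spirit of F#'s >> operator."""
    return lambda x: reduce(lambda acc, f: f(acc), fns, x)

chained = compose_lr(lambda x: x + 2, lambda x: x + 3)
print(chained(1))  # 6

def pipe(value, *fns):
    """Thread a value through functions left to right, like |>."""
    return compose_lr(*fns)(value)

print(pipe(1, lambda x: x + 1))  # 2
```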

Till functions

Just wanted to share a couple of little functions that I was playing with, since they made my code terse and readable. At first I needed a way to fold over a list until a predicate matched. This way I could stop early and didn't have to continue through the whole list. Then I needed to be able to do the same kind of thing but choosing all elements up until a predicate.

Folding

First, folding. I wanted to be able to get all the characters up until white space. For example:

let (++) a b = a.ToString() + b.ToString()

let upToSpaces str = foldTill Char.IsWhiteSpace (++) "" str

Which led me to write the following fold function. Granted, it's not lazily evaluated, but for me that was OK.

let foldTill check predicate seed list= 
    let rec foldTill' acc = function
        | [] -> acc
        | (h::t) -> match check h with 
                        | false -> foldTill' (predicate acc h) t
                        | true -> acc
    foldTill' seed list

Running this gives

> upToSpaces "abcdef gh";;
val it : string = "abcdef"

Here's a more general way of doing it for sequences. Granted it has mutable state, but it's hidden in the function and never leaks. This is very similar to how fold is implemented in F# core; I just added the extra check before it calls into the fold predicate.

let foldTill check predicate seed (source:seq<'a>) =     
    let finished = ref false     
    use e = source.GetEnumerator() 
    let mutable state = seed 
    while e.MoveNext() && not !finished do
        match check e.Current with 
            | false -> state <- predicate state e.Current
            | true -> finished := true
    state       
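For comparison, the same early-exit fold is only a few lines in Python (fold_till is a made-up name for this sketch, not library code):

```python
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")
S = TypeVar("S")

def fold_till(check: Callable[[A], bool],
              folder: Callable[[S, A], S],
              seed: S,
              source: Iterable[A]) -> S:
    """Fold over source, stopping at the first element where check is true."""
    state = seed
    for item in source:
        if check(item):
            break
        state = folder(state, item)
    return state

print(fold_till(str.isspace, lambda acc, c: acc + c, "", "abcdef gh"))  # abcdef
```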

Anyways, fun!

F# class getter fun

I was playing with Neo4J (following a recent post I stumbled upon by Sergey Tihon), and had everything wired up and ready to test out, but when I tried running my code I kept getting errors saying that I hadn’t connected to the neo4j database. This puzzled me because I had clearly called connect, but every time I tried to access my connection object I got an error.

The issue was that I didn't realize that F# class members are always deferred. It made sense once I traced through it, but I couldn't spot the bug for the life of me at first.

My code looked like this:

module Connection = 
    type Connection (dbUrl) = 

        member x.client = new GraphClient(new Uri(dbUrl))

        member x.create item = x.client.Create item

        member x.connect() = 
            x.client.Connect()
            x

If I had more experience with F# I probably would have spotted this right away, but it took me a while to figure out what was going on. The issue is that the class above compiles into

  [AutoOpen]
  [CompilationMapping(SourceConstructFlags.Module)]
  public static class Connection
  {
    [CompilationMapping(SourceConstructFlags.ObjectType)]
    [Serializable]
    public class Connection
    {
      internal string dbUrl;

      public GraphClient client
      {
        get
        {
          return new GraphClient(new Uri(this.dbUrl));
        }
      }

      public Connection(string dbUrl)
      {
        Connection.Connection connection = this;
        this.dbUrl = dbUrl;
      }

      public NodeReference<a> create<a>(a item) where a : class
      {
        return GraphClientExtensions.Create<a>((IGraphClient) this.client, item, new IRelationshipAllowingParticipantNode<a>[0]);
      }

      public Connection.Connection connect()
      {
        this.client.Connect();
        return this;
      }
    }
  }

Clear as day now. Each time you access the property it returns a new instance. I had assumed that since the member wasn't a function it would be a plain property backed by a field, not an auto-wrapped getter that re-evaluates its body.
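The same trap exists in any language with computed properties. A Python analogue of the bug and the fix (illustrative only, nothing to do with GraphClient):

```python
class Buggy:
    @property
    def client(self):
        # the property body runs on every access, so each lookup
        # builds a brand-new object (like the F# member above)
        return object()

class Fixed:
    def __init__(self):
        self._client = object()  # built once, in the constructor

    @property
    def client(self):
        return self._client

b, f = Buggy(), Fixed()
print(b.client is b.client)  # False: a new instance per access
print(f.client is f.client)  # True: the same cached instance
```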

The fix was easy:

module Connection = 
    type Connection (dbUrl) = 

        let graphConnection = new GraphClient(new Uri(dbUrl))

        member x.client = graphConnection

        member x.create item = x.client.Create item

        member x.connect() = 
            x.client.Connect()
            x

Which now generates

  [AutoOpen]
  [CompilationMapping(SourceConstructFlags.Module)]
  public static class Connection
  {
    [CompilationMapping(SourceConstructFlags.ObjectType)]
    [Serializable]
    public class Connection
    {
      internal GraphClient graphConnection;

      public GraphClient client
      {
        get
        {
          return this.graphConnection;
        }
      }

      public Connection(string dbUrl)
      {
        Connection.Connection connection = this;
        this.graphConnection = new GraphClient(new Uri(dbUrl));
      }

      public NodeReference<a> create<a>(a item) where a : class
      {
        return GraphClientExtensions.Create<a>((IGraphClient) this.client, item, new IRelationshipAllowingParticipantNode<a>[0]);
      }

      public Connection.Connection connect()
      {
        this.client.Connect();
        return this;
      }
    }
  }

That’s more like it

Trees and continuation passing style

For no reason in particular I decided to revisit tree traversal as a kind of programming kata. There are two main kinds of tree traversal:

  • Depth first – This is where you go all the way down a tree’s branches first before bubbling up to do work. With a tree like below, you’d hit c before doing any work since it’s the deepest part of the tree (assuming you iterated left first then right)
         a
        / \
       b   e
     /  \
    c    d
    
  • Breadth first – This is where you hit all the nodes at the level you’re on before going further. So with the same tree, you’d hit a, then b, then e, then c and d.

Seeing as I actually hate tree traversal, and having to think about it, I decided that whatever I wrote had better be extensible and clean.

Depth first

Here is a simple DFS traversal

private List<T> DepthFirstFlatten<T>(T root, Func<T, List<T>> edgeFunction) where T : class
{
    if (root == null)
    {
        return null;
    }

    var totalNodes = new List<T> { root };

    var edges = edgeFunction(root);

    if (edges != null && edges.Any())
    {
        foreach (var edge in edges)
        {
            if (edge != null)
            {
                totalNodes.AddRange(DepthFirstFlatten(edge, edgeFunction));
            }
        }
    }

    return totalNodes;
} 

In this case I'm just flattening the tree into a list and using a function to return all the edges. This way I can re-use the same depth-first algorithm for any kind of graph, not just a tree (assuming it's acyclic). To handle cycles you would need to pass the set of already-processed nodes as an accumulator and skip the current node if it was already processed.
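A quick sketch of that cycle-safe variant, in Python rather than C# (depth_first_flatten and the N node class are hypothetical helpers for this illustration):

```python
def depth_first_flatten(root, edge_function):
    """Depth-first flatten that tolerates cycles by tracking visited nodes."""
    seen, out = set(), []

    def walk(node):
        if node is None or id(node) in seen:
            return
        seen.add(id(node))
        out.append(node)
        for edge in (edge_function(node) or []):
            walk(edge)

    walk(root)
    return out

class N:
    def __init__(self, v):
        self.v, self.edges = v, []

a, b = N("a"), N("b")
a.edges.append(b)
b.edges.append(a)  # a cycle: a <-> b
print([n.v for n in depth_first_flatten(a, lambda n: n.edges)])  # ['a', 'b']
```

Without the seen set, the mutual a &lt;-&gt; b edge would recurse forever.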

Breadth first

For the BFS, it’s very similar, except instead of using recursion it uses the standard iterative way of doing it with a queue:

private List<T> BreadthFlatten<T>(T root, Func<T, List<T>> edgeFunction) where T : class
{
    var queue = new Queue<T>();

    queue.Enqueue(root);

    var allNodes = new List<T>();

    while (queue.Any())
    {
        var head = queue.Dequeue();

        if (head == null)
        {
            continue;
        }

        allNodes.Add(head);

        edgeFunction(head).ForEach(queue.Enqueue);
    }

    return allNodes;
}

Same kind of deal here. This one is nice because it’s not limited by stack depth.

Also, for both traversals, if you wanted to you could pass in an action to do work each time a node was processed. Here is an example using the following tree

     1
    / \
   2   3
 /  \   \
4    5   6

Below is a small class representing a binary tree

class Node<T>
{
    public Node(T data, Node<T> left = null, Node<T> right = null)
    {
        Item = data;
        Left = left;
        Right = right;
    }

    public Node<T> Left { get; set; }
    public Node<T> Right { get; set; }

    public T Item { get; set; }
}

And our unit test to print out the different traversal types

[Test]
public void DepthFlatten()
{
    var tree = new Node<int>(1,
                            new Node<int>(2, new Node<int>(4), new Node<int>(5)),
                            new Node<int>(3, null, new Node<int>(6)));

    Func<Node<int>, List<Node<int>>> extractor = node => new List<Node<int>> {node.Left, node.Right};

    Console.WriteLine("Depth");
    DepthFirstFlatten(tree, extractor).ForEach(n => Console.WriteLine(n.Item));

    Console.WriteLine("Breadth");
    BreadthFlatten(tree, extractor).ForEach(n => Console.WriteLine(n.Item));
}

Which prints out:

Depth
1
2
4
5
3
6

Breadth
1
2
3
4
5
6

DFS stack agnostic

We can even change the DFS to not use recursion in this case so that it’s agnostic of how deep the tree is. In this scenario, unlike the BFS, you’d use a stack instead of a queue. This way you are pushing on the deepest nodes and then immediately processing them. This contrasts with the queue where you enqueue the deepest nodes but process the queue FIFO (first in first out), meaning you process all the nodes at the current depth first before moving to the next depth.

private List<T> DepthFirstFlattenIterative<T>(T root, Func<T, List<T>> edgeFunction) where T : class
{
    var stack = new Stack<T>();

    stack.Push(root);

    var allNodes = new List<T>();

    while (stack.Any())
    {
        var head = stack.Pop();

        if (head == null)
        {
            continue;
        }

        allNodes.Add(head);

        var edges = edgeFunction(head);

        edges.Reverse();
        
        edges.ForEach(stack.Push);
    }

    return allNodes;
} 

The reverse is only there to be consistent with the left-first descent; otherwise it goes down the right branch first. This spits out

Depth iterative
1
2
4
5
3
6

DFS with continuation passing

There is yet another way to do tree traversal that is common in functional languages. You can use what is called "continuation passing style". Doing it this way you can get tail-recursive code even while iterating over multiple tree branches.

Below is some F# code to count the number of nodes in a tree. The tree I’m using as the sample looks like this

       1
     /   \ 
   2      3
 /  \      \
4    5      6

The total nodes here is 6, which is what you get with the code below.

open System

type Tree = 
    | Leaf of int
    | Node of int * Tree Option * Tree Option


let countNodes tree = 
    let rec countNodes' treeOpt cont = 
        match treeOpt with 
            | Some tree -> 
                match tree with 
                | Leaf item -> cont 1
                | Node (currentValue, left, right) ->
                    countNodes' left (fun leftCount ->
                                          countNodes' right (fun rightCount ->
                                                                 cont(1 + leftCount + rightCount)))
            | None -> cont 0
                    
    countNodes' tree id


let leftBranch = Node(2, Some(Leaf(4)), Some(Leaf(5)))

let rightBranch = Node(3, None, Some(Leaf(6)))

let tree = Node(1, Some(leftBranch), Some(rightBranch))

let treeNodeCount = countNodes (Some(tree))

But what the hell is going on here? It’s really not apparent when you first look at it what executes what and when.

The trick here is to pass around a function to each iteration that closes over what the next work should be. To be fair, it's hard to wrap your mind around what is happening, so let's trace this out. I've highlighted each of the continuations and given them an alias so you can see how they are re-used elsewhere. Each time a continuation is called I also show its expanded form following the ->.

(Image: continuation passing trace)

You can see how each iteration captures the work to do next. Eventually the very last piece of work to be done is the first function you passed in as the seed. In this case, it's the built-in id function that returns whatever value is given to it (which turns out to be 6, the number of nodes in the tree). The ordering of the traversal is exactly the same as in the other DFS traversals earlier, except this time everything is tail recursive.
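If it helps, here is a rough Python translation of the same CPS count that you can step through in a debugger (Python doesn't eliminate tail calls, so this is purely for tracing, not for deep trees):

```python
class Leaf:
    def __init__(self, item):
        self.item = item

class Node:
    def __init__(self, item, left=None, right=None):
        self.item, self.left, self.right = item, left, right

def count_nodes(tree):
    # 'cont' closes over the work still to be done, just like the F# version
    def go(t, cont):
        if t is None:
            return cont(0)
        if isinstance(t, Leaf):
            return cont(1)
        return go(t.left, lambda lc:
               go(t.right, lambda rc:
               cont(1 + lc + rc)))
    return go(tree, lambda n: n)  # the seed is id, exactly as in the F# code

tree = Node(1, Node(2, Leaf(4), Leaf(5)), Node(3, None, Leaf(6)))
print(count_nodes(tree))  # 6
```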

SignalR on ios and a single domain

Safari on iOS has a limitation that you can only have one concurrent request to a particular domain at a time. Normally this is fine, since once a request completes the next one queued up fires off. But what if you are using a realtime persistent-connection library like SignalR? In this case your one allowed connection is held up by the SignalR request. And if you're not on a Mac or Linux, and you use Windows 7 or earlier, you can't use websockets, so you're stuck using HTTP. Most suggestions involve buying a second domain, but sometimes that's not possible, especially if your application is a distributable web app that can run on client machines. You can't expect clients to have to buy a second domain just so your realtime push works.

A nice solution, posted in a GitHub issue tracking the problem, is to configure SignalR's long-poll mechanism to work in short bursts. Instead of having a single persistent connection, you can tell the client to use the long-polling transport (using TypeScript here)

var signalRConfig: def.IHubConfiguration = commonUtils.MiscUtil.userIsIphone() ?
    {
        transport: "longPolling"
    } : {};

// we can't do signalR push with an iphone since
// iphones only allow for single connections, so having the
// forever frame interferes with other resource loading
$.connection.hub.start(signalRConfig, () => console.log("connected"));

And on the server, in Global.asax.cs, configure the long poll bursts

protected void Application_Start(object sender, EventArgs e)
{
    GlobalHost.Configuration.ConnectionTimeout = TimeSpan.FromMilliseconds(1000);
    LongPollingTransport.LongPollDelay = 5000;
    RouteTable.Routes.MapHubs();
}

But when I did this, per the tracker suggestion, I saw that I was still getting bursts that lasted 5 seconds. This was too long for me, since for 5 seconds you can’t do anything else. If you are doing a lot of dynamic calls (making AJAX requests, or other service calls) then 5 seconds really holds up the application.

When in doubt, read the source. I'm unfortunately using a really old version of SignalR because I'm tied to it by how my application is distributed (and SignalR versions aren't backwards compatible), so this advice may not apply to later versions. But I found what the issue was.

SignalR uses a heartbeat to check when things are disconnected or timed out:

_timer = new Timer(Beat,
                   null,
                   _configurationManager.HeartBeatInterval,
                   _configurationManager.HeartBeatInterval);

//....

private void Beat(object state)
{
    //...
        foreach (var metadata in _connections.Values)
        {
            if (metadata.Connection.IsAlive)
            {
                CheckTimeoutAndKeepAlive(metadata);
            }
            else
            {
                // Check if we need to disconnect this connection
                CheckDisconnect(metadata);
            }
        }
    //...
}

The issue here is that the heartbeat interval is initialized to 10 seconds

public DefaultConfigurationManager()
{
    ConnectionTimeout = TimeSpan.FromSeconds(110);
    DisconnectTimeout = TimeSpan.FromSeconds(20);
    HeartBeatInterval = TimeSpan.FromSeconds(10);
    KeepAlive = TimeSpan.FromSeconds(30);
}

So that means that no matter what you set the disconnect and connection timeouts to, they can't be any more granular than 10 seconds. If you remember the Shannon–Nyquist sampling theorem from signal processing, you know that your sampling frequency needs to be at least twice the target frequency. Fancy words for: sample more often to get more granularity.

The final fix I needed to add was

protected void Application_Start(object sender, EventArgs e)
{
    GlobalHost.Configuration.ConnectionTimeout = TimeSpan.FromMilliseconds(1000);
    GlobalHost.Configuration.HeartBeatInterval = TimeSpan.FromSeconds(GlobalHost.Configuration.ConnectionTimeout.TotalSeconds/2);
    LongPollingTransport.LongPollDelay = 5000;
    RouteTable.Routes.MapHubs();
}

Now that the heartbeat interval is twice as fast as the connection timeout, I properly get 1 second bursts followed by 5 seconds of down time:

(Screenshot from Charles)

Why \d is slower than [0-9]

I learned an interesting thing today about regular expressions via this Stack Overflow question: \d, commonly used as shorthand for digits (which we usually think of as 0-9), actually checks against all valid Unicode digits.

Given that, it makes sense why \d in a regular expression is slower, since it has to check against all possible digit types. In C# you can make the regular expression use ECMAScript-compliant behavior (RegexOptions.ECMAScript), which doesn't include the full Unicode set of digits.

While I’m neither the question asker nor answerer, I wanted to share since this is something I didn’t know about before.
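The same Unicode-digit behavior is easy to demonstrate in Python, where \d is also Unicode-aware by default (an illustration, not the C# code from the question):

```python
import re

s = "\u0663"  # ARABIC-INDIC DIGIT THREE, a perfectly valid Unicode digit

print(bool(re.search(r"\d", s)))            # True: \d matches any Unicode digit
print(bool(re.search(r"[0-9]", s)))         # False: only the ASCII digits 0-9
print(bool(re.search(r"\d", s, re.ASCII)))  # False: re.ASCII limits \d to 0-9
```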

Bad image format “Invalid access to memory location”

Wow, two bad image format posts in one day. The previous post talked about debugging 64-bit vs 32-bit assemblies, but after that was solved I ran into another issue, this time with the message:

Unhandled Exception: System.BadImageFormatException: Could not load file or assembly 'Interop.dll' or one of its dependencies. Invalid access to memory location. (Exception from HRESULT: 0x800703E6)
   at Program.Program.Run(Args args, Boolean fastStart)
   at Program.Program.Main(String[] args) in C:\Projects\Program.cs:line 36

Gah, what gives?

It turned out I had an interop DLL that was linking against pthreads. In debug mode the DLL worked fine on a 32-bit machine, but in release mode I'd get the error. The only difference I found was that in debug the DLL was being explicitly linked against the pthreads .lib file. Since pthreads was also being built as a dynamic library, its .lib file contains information about functions and their addresses, but no actual code (that's in the DLL).

After working for a while with my buddy Faisal, we decided that even though the interop DLL was being built and linked properly, when it wasn't explicitly linked to the .lib file the function addresses were somehow wrong. What the compiler was actually linking against we never figured out. But it does all make sense: if the function addresses were wrong, then when it tried to access anything in the DLL it could touch memory outside of its space and get an exception.

Once we explicitly linked the library as part of the linker settings of the interop dll everything worked fine.

Determining 64bit or 32 bit .NET assemblies

I work on a 64-bit machine but frequently deploy to 32-bit machines. The code I work on has native hooks, so I always need to deploy assembly entry points as 32-bit. This means I am usually paranoid about the build configuration. However, sometimes things slip up and a 64-bit DLL gets sent out, or an entry point is built with ANY CPU set. Usually this is caught on our continuous build server, with a unit test that should be working cryptically failing.

When this happens, what you’ll get is a message like this:

Unhandled Exception: System.BadImageFormatException: Could not load file or assembly 'Some.dll' or one of its dependencies. An attempt was made to load a program with
 an incorrect format.
   at Test.Program.Run(Args args, Boolean fastStart)
   at Test.ProgramMain(String[] args) in Program.cs:line 36

The first thing I do here is to try and figure out which of these dll’s is built at the wrong type. The easiest way I’ve found to do that is to have a simple app that reflectively loads all the assemblies in a directory and tests their image format:

class Program
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Usage: <directory to test for dlls>");
            return;
        }

        var dir = args[0];

        Console.WriteLine();
        Console.WriteLine("This machine is {0}", Is64BitOperatingSystem ? "64 bit" : "32 bit");
        Console.WriteLine();

        foreach (var file in Directory.EnumerateFiles(dir))
        {
            if (Path.GetExtension(file) == ".dll" || Path.GetExtension(file) == ".exe")
            {
                try
                {
                    Assembly assembly = Assembly.ReflectionOnlyLoadFrom(file);
                    PortableExecutableKinds kinds;
                    ImageFileMachine imgFileMachine;
                    assembly.ManifestModule.GetPEKind(out kinds, out imgFileMachine);

                    Console.WriteLine("{0,-40} - {1,-15} - {2, -10}", 
                        Path.GetFileName(file),
                        imgFileMachine,
                        kinds);
                }
                catch (Exception ex)
                {
                    var err = "error";

                    if (ex.Message.Contains("The module was expected to contain an assembly manifest."))
                    {
                        err = "native";
                    }

                    Console.WriteLine("{0,-40} - {1,-15}", Path.GetFileName(file), err);
                }
            }
        }
    }

    public static bool Is64BitOperatingSystem
    {
        get
        {
            // Clearly if this is a 64-bit process we must be on a 64-bit OS.
            if (IntPtr.Size == 8)
                return true;
            // Ok, so we are a 32-bit process, but is the OS 64-bit?
            // If we are running under Wow64 than the OS is 64-bit.
            bool isWow64;
            return ModuleContainsFunction("kernel32.dll", "IsWow64Process") && IsWow64Process(GetCurrentProcess(), out isWow64) && isWow64;
        }
    }

    static bool ModuleContainsFunction(string moduleName, string methodName)
    {
        IntPtr hModule = GetModuleHandle(moduleName);
        if (hModule != IntPtr.Zero)
            return GetProcAddress(hModule, methodName) != IntPtr.Zero;
        return false;
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    extern static bool IsWow64Process(IntPtr hProcess, [MarshalAs(UnmanagedType.Bool)] out bool isWow64);
    [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    extern static IntPtr GetCurrentProcess();
    [DllImport("kernel32.dll", CharSet = CharSet.Auto)]
    extern static IntPtr GetModuleHandle(string moduleName);
    [DllImport("kernel32.dll", CharSet = CharSet.Ansi, SetLastError = true)]
    extern static IntPtr GetProcAddress(IntPtr hModule, string methodName);
}

Running the app will print out something like this:


This machine is 64 bit

7z.dll                                   - native
7z64.dll                                 - native
antlr.runtime.dll                        - I386            - ILOnly
Local.Common.dll                         - I386            - ILOnly, Required32Bit
BCrypt.Net.dll                           - I386            - ILOnly
ICSharpCode.SharpZipLib.dll              - I386            - ILOnly
log4net.dll                              - I386            - ILOnly
Lucene.Net.dll                           - I386            - ILOnly
nunit.framework.dll                      - I386            - ILOnly
pthreadVC2.dll                           - native
SevenZipSharp.dll                        - I386            - ILOnly
LocalInterop.dll                         - I386            - Required32Bit
swscale-0.dll                            - native
App.exe                                  - I386            - ILOnly
System.Reactive.dll                      - I386            - ILOnly
XmlDiffPatch.dll                         - I386            - ILOnly

There you go: the application was built as ANY CPU. Anything marked ILOnly can run as both 64-bit and 32-bit. If it is marked Required32Bit then it'll only work from a 32-bit process. Since I'm on a 64-bit machine running an ANY CPU process, the OS attempted to load the app in the 64-bit program format. This means all DLLs it loads have to support either ANY or 64-bit. Unfortunately, the interop DLL is 32-bit only, so that's what was causing the error.

If you're wondering how I knew that exception was for native code: the native determination works because native DLLs are missing an assembly manifest. Only .NET files contain an assembly manifest, so I'm just testing for that specific error.

Streaming video to ios device with custom httphandler in asp.net

I ran into an interesting tidbit just now while trying to dynamically stream a video file using a custom HTTP handler. The idea is to bypass the static handler for a file so that I can perform authentication/preprocessing/etc. when a user requests a video resource, and I don't have to expose a static folder with potentially sensitive resources.

I had everything working fine on my desktop browser, but when I went to test on my iPhone I got the dreaded play button with a circle crossed out

(Image: the crossed-out play button)

I hate that thing.

Anyways, streaming a file from the static handler worked fine, so what was the difference? This is where I pulled out Charles and checked the response headers.

From the static handler I’d get this:

HTTP/1.1 200 OK
Content-Type	video/mp4
Last-Modified	Wed, 15 May 2013 20:59:30 GMT
Accept-Ranges	bytes
ETag	"9077fe17af51ce1:0"
Server	Microsoft-IIS/7.5
X-Powered-By	ASP.NET
Date	Wed, 15 May 2013 21:23:41 GMT
Content-Length	2509720

And for my dynamic handler I got this:

HTTP/1.1 200 OK
Cache-Control	private
Content-Type	video/mp4
Server	Microsoft-IIS/7.5
X-AspNet-Version	4.0.30319
X-Powered-By	ASP.NET
Date	Wed, 15 May 2013 21:22:55 GMT
Content-Length	2509720

So all that was missing was ETag and Accept-Ranges.

Turns out ETag is a good thing to have, since it's a fingerprint (often a hash) of the file. It tells the client whether anything has actually changed and helps with caching. Not a bad thing to have, especially if you use a fast hashing algorithm to generate your signature.

The second thing that was different was the Accept-Ranges header. This turns out to be how the server advertises that the client can make range requests in case a connection is closed or something fails. From the spec:

Hypertext Transfer Protocol (HTTP) clients often encounter interrupted data transfers as a result of canceled requests or dropped connections. When a client has stored a partial representation, it is desirable to request the remainder of that representation in a subsequent request rather than transfer the entire representation. Likewise, devices with limited local storage might benefit from being able to request only a subset of a larger representation, such as a single page of a very large document, or the dimensions of an embedded image.

This document defines HTTP/1.1 range requests, partial responses, and the multipart/byteranges media type. Range requests are an optional feature of HTTP, designed so that recipients not implementing this feature (or not supporting it for the target resource) can respond as if it is a normal GET request without impacting interoperability. Partial responses are indicated by a distinct status code to not be mistaken for full responses by caches that might not implement the feature.

Although the range request mechanism is designed to allow for extensible range types, this specification only defines requests for byte ranges.

However, it's up to the server to tell the client that it supports range requests. So, once I added this into the header

context.Response.Headers.Add("Accept-Ranges", "bytes");
context.Response.Headers.Add("ETag", HashUtil.QuickComputeHash(target));
context.Response.ContentType = "video/mp4";
context.Response.TransmitFile(target);

Everything started to work. Note that I'm not actually handling range requests yet, though. I need to test with a huge file, kill the connection, and see what happens. But even then, all that requires is inspecting the request headers and parsing out the byte range the client wants.
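For what it's worth, parsing the Range header is mostly string munging. A rough sketch of single-range parsing (in Python for brevity; parse_range is a hypothetical helper, not my handler code):

```python
import re

def parse_range(header, total):
    """Parse a simple single-range 'bytes=start-end' header.

    Returns an inclusive (start, end) byte pair, or None if the header
    is absent, malformed, or unsatisfiable. Multi-range requests are
    out of scope for this sketch.
    """
    m = re.match(r"bytes=(\d*)-(\d*)$", header or "")
    if not m:
        return None
    start, end = m.group(1), m.group(2)
    if start == "" and end == "":
        return None
    if start == "":                       # suffix range: the last N bytes
        start, end = total - int(end), total - 1
    else:
        start = int(start)
        end = int(end) if end else total - 1
    if start < 0 or start > end or start >= total:
        return None
    return start, min(end, total - 1)

print(parse_range("bytes=0-499", 2509720))  # (0, 499)
print(parse_range("bytes=500-", 1000))      # (500, 999)
print(parse_range("bytes=-500", 1000))      # (500, 999)
```

The handler would then respond with status 206, a Content-Range header, and only those bytes.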

Turns out I’m not the only one to see this happen. Check out here and here for more info.

Capturing union values with fparsec

I just started playing with FParsec, which is a parser combinator library that lets you create chainable parsers to parse DSLs. Having built my own parser, lexer, and interpreter, playing with other libraries is really fun; I like seeing how others have done it. Unlike my mutable parser written in C#, with FParsec the idea is that it encapsulates the underlying stream state and result into a parser object. Since F# is mostly immutable, this is how the modified stream state gets captured and passed as a new stream to the next parser. I actually like this kind of workflow, since you don't need to create a grammar that is parsed and generates code for you (which is what ANTLR does). There's something very appealing about it being dynamic.

As a quick example, I was following the tutorial on the fparsec site and wanted to understand how to capture a value to store in a discriminated union. For example, if I have a type

type Token = 
     | Literal of string

How do I get a Literal("foo") created?

All of the examples I saw never seemed to instantiate one. After a bit of poking around I noticed that they were using the |>> operator, which passes the parser's result value to a function. So when you do

pstring "foo" |>> Literal

You’ve invoked the constructor of the discriminated union similar to this:

let literal = "foo" |> Literal 

Which is equivalent to

let literal = Literal("foo")

This is because most of the fparsec functions and overloads give you back a Parser type

type Parser<'TResult, 'TUserState> = CharStream<'TUserState> -> Reply<'TResult>

Which is just an alias for a function that takes a UTF-16 character stream (which holds onto a user state) and returns a reply holding the value you wanted. If you look at CharStream it looks similar to my simple tokenizer. The operators |>>, >>=, and >>% all help you chain parsers and get your result back. If you are curious you can trace through their types here.
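If the types feel abstract, the core idea is small. Here's a toy combinator in Python that mimics the spirit of |>> (this is not FParsec's API, just an illustration):

```python
# A parser here is just a function: input -> (value, remaining) or None.
def pstring(s):
    """Match a literal string at the start of the input."""
    def parse(inp):
        return (s, inp[len(s):]) if inp.startswith(s) else None
    return parse

def fmap(parser, f):
    """Map f over a parser's result value, in the spirit of FParsec's |>>."""
    def parse(inp):
        r = parser(inp)
        return (f(r[0]), r[1]) if r else None
    return parse

# tag the captured text, much like pstring "foo" |>> Literal
literal = fmap(pstring("foo"), lambda v: ("Literal", v))
print(literal("foobar"))  # (('Literal', 'foo'), 'bar')
print(literal("bar"))     # None
```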

Now, if you don't need to capture a result value and just want an instance of a union case that carries no data, you can use the >>% operator, which lets you return a fixed result:

type Token = 
    | Null

let nullTest = pstring "null" >>% Null

There are a bunch of overloaded methods and custom operators in FParsec. For example

type Token = 
    | Null

let nullTest = stringReturn "null" Null

Is equivalent to the >>% example.

It’s a little overwhelming trying to figure out how all the combinators are pieced together, but that’s part of the fun of learning something new.