Tagged: parser

ParsecClone on nuget

Today I published the first version of ParsecClone to NuGet. I blogged recently about creating my own parser combinator library, and it's come along pretty well. While FParsec is more performant and better optimized, mine has other advantages, such as being able to work on arbitrary consumption streams (binary or bit level) and to work directly on strings with regex instead of character by character. That said, I wouldn't recommend ParsecClone for production string parsing on big data sets, since the string parsing isn't streamed; it works directly on an in-memory string. Streaming strings is still on the todo list, but the binary parsing does work on streams.

Things included:

  • All your favorite parsec-style operators: <|>, >>., .>>, |>>, etc. I won't list them all since there are a lot.
  • String parsing. Match on full string terms, do regular expression parsing, inverted regular expressions, etc. …
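
The library itself is F#, but the two headline features above are simple to sketch. Here is a small conceptual C# illustration of what a choice operator like <|> and regex-based matching boil down to; this is not ParsecClone's actual API, and every name in it is hypothetical.

using System.Text.RegularExpressions;

// A parser takes the remaining input and either fails (null) or returns
// the matched value plus whatever input is left over.
public delegate (string Value, string Rest)? Parser(string input);

public static class Combinators
{
    // Roughly the idea behind <|>: try the first parser, fall back to the second.
    public static Parser Or(Parser first, Parser second) =>
        input => first(input) ?? second(input);

    // Match a regex anchored at the current position instead of consuming
    // the input character by character.
    public static Parser Pattern(string pattern) =>
        input =>
        {
            var match = Regex.Match(input, "^(?:" + pattern + ")");
            if (!match.Success) return null;
            return (match.Value, input.Substring(match.Length));
        };
}

With those two pieces, Or(Pattern("[a-z]+"), Pattern("[0-9]+")) reads a word or a number, whichever matches first, which is the flavor of parsing the library gives you on strings.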
Read more
Locale parser with fparsec

Localizing an application consists of extracting user-facing text and managing it outside of the hardcoded strings in your code. This lets you tweak strings without having to recompile and, if done properly, lets you support multiple languages. Localizing is no easy task; it complicates spacing, formatting, and name/date and other cultural conventions, but that's a separate issue. The crux of localizing is text.

But who uses just bare text to display things to the user? Usually you want the text to be a little dynamic. Something like:

Hello {user}! Welcome!

Here, user will be some sort of dynamic property. To support this, your locale files need a way to handle arguments.

One way of storing contents in a locale file is like this:

ExampleText = Some Text {argName:argType} other text etc
            = This is on a separate newline
UserLoginText = ... 

This consists of an identifier, followed by an … Read more
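
The FParsec implementation is past the cut, but to make the format concrete, here is a small regex-based C# sketch (my own illustration, not the post's FParsec code) that pulls the identifier and the {argName:argType} placeholders out of a single entry line.

using System.Linq;
using System.Text.RegularExpressions;

public record LocaleEntry(string Id, string Text, (string Name, string Type)[] Args);

public static class LocaleLine
{
    // Matches "Identifier = some text {name:type} more text"
    private static readonly Regex EntryPattern = new Regex(@"^(?<id>\w+)\s*=\s*(?<text>.*)$");

    // Matches each {argName:argType} placeholder inside the text
    private static readonly Regex ArgPattern = new Regex(@"\{(?<name>\w+):(?<type>\w+)\}");

    public static LocaleEntry Parse(string line)
    {
        var entry = EntryPattern.Match(line);
        if (!entry.Success) return null;

        var text = entry.Groups["text"].Value;

        var args = ArgPattern.Matches(text)
            .Cast<Match>()
            .Select(m => (m.Groups["name"].Value, m.Groups["type"].Value))
            .ToArray();

        return new LocaleEntry(entry.Groups["id"].Value, text, args);
    }
}

Continuation lines (the bare = at the start of the next line) don't match this pattern; they would need a second pass that appends their text to the previous entry.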

Just another brainfuck interpreter

Why?

Honestly, why not?

The entry point

Not much to tell:

static void Main(string[] args)
{
    var parser = new Parser("++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.");

    var instructions = parser.Instructions;

    var interpreter = new Interpreter(instructions);

    interpreter.Interpret();
}

The container classes

Some data classes and enums:

public enum Tokens
{
    MoveFwd,
    MoveBack,
    Incr,
    Decr,
    While,
    Print,
    Input,
    WhileEnd,
    WhileStart,
    Unknown
}

public class Instruction
{
    public Tokens Token { get; set; }

    public override string ToString()
    {
        return Token.ToString();
    }
}

class While : Instruction
{
    public While()
    {
        Token = Tokens.While;
    }

    public List<Instruction> Instructions { get; set; }
}
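
The Parser class itself is not shown in this excerpt, so here is a hypothetical sketch of how a parser could turn the character stream into these objects, nesting everything between a WhileStart and its matching WhileEnd inside a While node's Instructions list. It uses the GetToken helper shown in the next section.

// Hypothetical sketch: build an instruction list, recursing on '[' so that
// everything up to the matching ']' ends up inside a While block.
private List<Instruction> ParseBlock(IEnumerator<char> source)
{
    var instructions = new List<Instruction>();

    while (source.MoveNext())
    {
        var token = GetToken(source.Current);

        switch (token)
        {
            case Tokens.WhileStart:
                // The nested block owns everything up to its WhileEnd.
                instructions.Add(new While { Instructions = ParseBlock(source) });
                break;
            case Tokens.WhileEnd:
                // Close the current block and hand control back to the caller.
                return instructions;
            case Tokens.Unknown:
                // Non-brainfuck characters (comments, whitespace) are ignored.
                break;
            default:
                instructions.Add(new Instruction { Token = token });
                break;
        }
    }

    return instructions;
}

Calling this over the program string's enumerator produces the flat list the interpreter walks, with loop bodies tucked inside While nodes.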

A helper function

A function to translate an input character into a known token:

private Tokens GetToken(char input)
{
    switch (input)
    {
        case '+':
            return Tokens.Incr;
        case '-':
            return Tokens.Decr;
        case '<':
            return Tokens.MoveBack;
        case '>':
            return Tokens.MoveFwd;
        case '.':
            return Tokens.Print;
        case ',':
            return Tokens.Input;
        case '[':
            return Tokens.WhileStart;
        case ']':
            return Tokens.WhileEnd;
        default:
            return Tokens.Unknown;
    }
}
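
The Interpreter class is also past the cut, but its execution model is easy to sketch. The following is a hypothetical version, assuming a fixed-size byte tape and a data pointer; it is not the post's actual implementation.

// Hypothetical sketch of the interpreter: a byte tape, a data pointer,
// and a recursive walk over the instruction list.
private readonly byte[] tape = new byte[30000];
private int pointer;

private void Run(List<Instruction> instructions)
{
    foreach (var instruction in instructions)
    {
        switch (instruction.Token)
        {
            case Tokens.Incr:     tape[pointer]++; break;
            case Tokens.Decr:     tape[pointer]--; break;
            case Tokens.MoveFwd:  pointer++; break;
            case Tokens.MoveBack: pointer--; break;
            case Tokens.Print:    Console.Write((char)tape[pointer]); break;
            case Tokens.Input:    tape[pointer] = (byte)Console.Read(); break;
            case Tokens.While:
                // Re-run the nested block until the current cell is zero.
                var block = (While)instruction;
                while (tape[pointer] != 0)
                {
                    Run(block.Instructions);
                }
                break;
        }
    }
}

Run against the instruction list from the entry point, this prints the "Hello World!" that the hardcoded program encodes.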
    
Read more
A handrolled language parser

In my previous post about building a custom lexer I mentioned that, for educational purposes, I created a simple toy programming language (still unnamed). There, I talked about building a tokenizer and lexer from scratch. In this post I’ll discuss building a parser that is responsible for generating an abstract syntax tree (AST) for my language. This syntax tree can then be passed to other language components such as a scope and type resolver, and finally an interpreter.

The parser I made is a recursive descent packrat parser that uses backtracking. Short of memoizing already-parsed AST fragments, there aren't any other real optimizations. The goal was to create a working parser, not a production parser to distribute or use (or reuse) in any professional sense. Like the lexer, this is an academic exercise to try and hit on some of the points covered by Terence Parr's Language Implementation Patterns book that … Read more
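
As a rough illustration of the packrat idea (cache each rule's result at each token position so backtracking never re-parses the same span), here is a small hypothetical C# helper. The Ast type, the position field, and all the names are assumptions of mine, not the post's code.

// Hypothetical memoization helper for a recursive descent packrat parser.
private readonly Dictionary<(string Rule, int Position), (Ast Node, int Next)?> memo =
    new Dictionary<(string Rule, int Position), (Ast Node, int Next)?>();

private int position; // current index into the token stream

private Ast Memoize(string rule, Func<Ast> parse)
{
    var key = (rule, position);

    if (memo.TryGetValue(key, out var cached))
    {
        if (cached == null) return null;      // this rule already failed here
        position = cached.Value.Next;         // fast-forward past the cached parse
        return cached.Value.Node;
    }

    var start = position;
    var node = parse();

    if (node == null)
    {
        memo[key] = null;
        position = start;                     // backtrack on failure
    }
    else
    {
        memo[key] = (node, position);
    }

    return node;
}

A rule method then wraps its body, e.g. Memoize("expression", ParseExpressionBody), and repeated backtracking over the same input becomes a dictionary lookup instead of a re-parse.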