Working around ghc’s lexer’s layout rule

While implementing coloring for Haskell files I noticed that lines with more closing braces (either ‘)’ or ‘}’) were not being colored.

After doing some digging around I found out the following:











So apparently they were throwing lexical errors, but why?

After contacting Mr. Simon Marlow I was told that this is the handling of GHC’s layout rule. to quote

“You’re probably encountering the lexer’s handling of the Haskell “layout” rule.  When the lexer sees a ‘}’ token, it pops the current layout stack, and if the layout stack is empty then this is a lexical error.”

This left me with 3 choices

  1. Use a custom lexer much like the original visual haskell did
  2. Replace all {, },( and ) with 1 whitespace character so that they won’t be colored, but the rest of the input will, but the positions would be preserved.
  3. left pad the input with enough opening braces to have the lexer succeed in parsing then adjust the ranges.

Option 1 was the least maintainable, since I would have to keep updating the lexer everytime the one in ghc changes. So I didn’t want to do this.

Option 2 was a possibility, one which I tried out before, But I noticed that having the braces colored really did help.

Option 3 was then chosen by process of elimination. It turned out to not be that much work at all.

private int prepareLine(ref string str)
    int round= 0, brace = 0;

    for (int i = 0; i < str.Length; i++)
        switch (str[i])
            case ‘}’:
                if(i==0 || !(str[i-1]==’-‘))
            case ‘)’:

    if(round > 0)
        str = str.PadLeft(str.Length + round, ‘(‘);

    if (brace > 0)
        str = str.PadLeft(str.Length + brace, ‘{‘);

    return round + brace;


is the full implementation. Now I know what you’re thinking, By doing this I’ll create more opening than closing braces. So a balanced line like (Int) becomes unbalanced ((Int). However this is not a problem, Since for my coloring braces carry no semantics. I don’t care what they mean (as in, when interpreted) all I care about is what they are (as in the token type).

With that in place, the only other code needed is to skip the first n number of tokens returned from the lexer, where n is the result of calling the prepareLine function.

And that’s all, Now we have perfect line coloring everywhere 🙂


GHC dll sizes

Just as a side note, for those who find the size of the shared libs generated from ghc to be very large strip and upx can do wonders.

strip -g -S –strip-unneeded -v HsLexer.dll

strips about 50% and

upx -9 -v HsLexer.dll

compresses another 80% bringing the size from HsLexer from 48mb to 4.8mb.

not bad, not bad indeed, and with virtually no overhead.

UPDATE: Corrected the numbers based on what Cippo mentioned. Thanks, Didn’t check them before posting

Changing default settings

The visual studio editor has a bunch of build in settings you can turn on and off per editor.

Since I’m writing a language service for Haskell, I would like to enable replacing tabs with spaces, set the amount of spaces to 4 and turn on line numbering

After a bit of searching I found a blog post from Noah Ric a developer at Microsoft about disabling zooming in a document window.

So using this as a basis I created the following:

internal class ViewCreationListener : IWpfTextViewCreationListener
    public void TextViewCreated(IWpfTextView textView)
        textView.Options.SetOptionValue(DefaultWpfViewOptions.EnableHighlightCurrentLineId, false);
        textView.Options.SetOptionValue(DefaultOptions.TabSizeOptionId, 4);
        textView.Options.SetOptionValue(DefaultOptions.ConvertTabsToSpacesOptionId, true);
        textView.Options.SetOptionValue(DefaultTextViewHostOptions.LineNumberMarginId, true);


(sorry no syntax highlighting as I can’t find a theme on here that won’t break it)

Notice the “PredefinedTextViewRoles.Zoomable” , I wanted to use PredefinedTextViewRoles.Document but when doing this the IDE would randomly throw Exceptions saying that the Editor wasn’t fully created yet. Which is odd since Visual Studio is doing all the initializations.

The set of values you can change this way are listed here:

For more things you change take a look at

Hello World!

I’m currently writing a language service/package for visual studio 2010 but one of the things I’ve noticed is that a lot of the stuff I need are largely undocumented.

So I decided to write this Blog, documenting things I’ve found out while writing this. Although now and then there’ll be some odd posts 🙂