Configurable Candy

Candy is a feature where you replace a selection of text with something else (usually also text), however this is done in view only and so not in the actual file. This is useful to replace things like “->” with an actual Unicode arrow while still allowing other text editors that can’t handle Unicode to display the file correctly.

Leksah implements this and allows you to configure it via a so called “Candy” file, So I “borrowed” their approach and extended it to suit my needs.

The general syntax of a Visual Haskell candy file (.vshc) is

-- "<token>" <unicode> <modifier> <enabled> <FIT|NONE>
-- the token has to be quoted
-- the supported modifiers are
-- CODE    - Apply only to regions of code
-- COMMENT - Apply only inside comments
-- STRING  - Apply only in string literals
-- ALL     - Apply to all

the modifiers are self explanatory but the FIT or NONE modifiers take some explaining.

When using the FIT modifier, the Candy engine won’t try to keep the same width as the text it’s replacing. This means that you get a layout change. The actual file might have “alpha” but the view will show only “a”.

With some things, especially with keywords we don’t want this, this is where the NONE modifier comes in. When this is used the engine will always match the width of the text it’s replacing by making the Unicode text larger and adding horizontal whitespace. This means that “alpha” would be rendered as “  a  “ and so preserving the layout.

A shot of this in action can  is:


For reference, the full default candy file that will be shipping with VSH2010 is:

— Candy file

— Format

— "<token>" <unicode> <modifier> <enabled> <FIT|NONE>

— the token has to be quoted

— the supported modifiers are

— CODE    – Apply only to regions of code

— COMMENT – Apply only inside comments

— STRING  – Apply only in string literals

— ALL     – Apply to all

— Note that the replacement block will always take up the exact same

— space as the tokens it’s replacing. e.g. "alpha" will be replaced by "  a  "

"->"         0x2192    CODE       True       NONE     –RIGHTWARDS ARROW

"<-"         0x2190    CODE       True       NONE     –LEFTWARDS ARROW

"=>"        0x21d2    CODE       True       NONE     –RIGHTWARDS DOUBLE ARROW

">="        0x2265    CODE       False      NONE     –GREATER-THAN OR EQUAL TO

"<="        0x2264    CODE       False      NONE     –LESS-THAN OR EQUAL TO

"/="          0x2260    CODE       False      NONE     –NOT EQUAL TO

"&&"        0x2227    CODE       False      NONE     –LOGICAL AND

"||"           0x2228    CODE       False      NONE     –LOGICAL OR

"++"        0x2295    CODE       False      NONE     –CIRCLED PLUS

"::"           0x2237    CODE       False      NONE     –PROPORTION

".."           0x2025    CODE       False      NONE     –TWO DOT LEADER

"^"            0x2191    COMMENT    False      NONE     –UPWARDS ARROW

"=="        0x2261    CODE       False      NONE     –IDENTICAL TO

" . "          0x2218    CODE       True       NONE     –RING OPERATOR

"\\"           0x03bb    CODE       True       NONE     –GREEK SMALL LETTER LAMBDA

"=<<"       0x291e    CODE       False      NONE     —

">>="       0x21a0    CODE       False      NONE     —

"$"           0x25ca    CODE       False      NONE     —

">>"        0x226b    CODE       False      NONE     — MUCH GREATER THEN

"forall"    0x2200    CODE       False      NONE     –FOR ALL

"exist"     0x2203    CODE       False      NONE     –THERE EXISTS

"not"       0x00ac    CODE       False      NONE     –NOT SIGN

"alpha"         0x03b1    ALL        True       FIT      –ALPHA

"beta"           0x03b2    ALL         True       FIT      –BETA

"gamma"     0x03b3    ALL        True       FIT      –GAMMA

"delta"          0x03b4    ALL        True       FIT      –DELTA

"epsilon"     0x03b5    ALL        True       FIT      –EPSILON

"zeta"           0x03b6    ALL        True       FIT      –ZETA

"eta"             0x03b7    ALL        True       FIT      –ETA

"theta"          0x03b8    ALL        True       FIT      –THETA

— Because you can configure options inside the editor itself, don’t comment out

— lines since they won’t be parsed, just change the enable flag


Visual studio has this ability to show information about symbols when you hover over them, this feature is called “QuickInfo”

This essentially means that you can hover over a symbol like “fmap” and it would tell you, fmap :: forall a b (f :: * -> *). (Functor f) => (a -> b) -> f a  -> f b and that it’s defined in GHC.Base

in ghci this would be equivalent to typing :i fmap which would result in the following output

class Functor f where
  fmap :: (a -> b) -> f a -> f b
        — Defined in GHC.Base

Whenever the user hovers over a symbol in visual studio, the IDE will call a method

public void AugmentQuickInfoSession(IQuickInfoSession session, IList<object> qiContent, out ITrackingSpan applicableToSpan)


I use the information given to me to construct two things

  • The word the user is hovering on
  • The exact location within the source file of that word

This information is used to find the correct Name value in the Haskell Renamed AST. The problem is we can’t construct name values, so we have to look them up. This is provided with the help of a typeclass

class Finder a where
    findName     :: MonadPlus m => a -> FastString -> Maybe SrcSpan -> m Name

The monad used determines how many results you receive. Use a Maybe monad and you’ll get just 1. use a List monad and you’ll get more than one, but only if you don’t specify a specific source span to look for (wildcard match on name alone).

However we should never enter the PostTcType types inside the renamed AST. These are invalid at this stage. Unfortunately SYB’s listify does not provide a way to tell it not to enter a specific type.

So we create a modified version of those SYB calls:

data Guard where
  Guard :: Typeable a => Maybe a -> Guard
type HList = [Guard]

— | Summarise all nodes in top-down, left-to-right order
everythingBut :: (r -> r -> r) -> HList -> GenericQ r -> GenericQ r
everythingBut k q f x
  = foldl k (f x) fsp
    where fsp = case isPost x q of
                  True  -> []
                  False -> gmapQ (everythingBut k q f) x

isPost :: Typeable a => a -> HList -> Bool
isPost a = or . map check
where check :: Guard -> Bool
       check x = case x of
                   Guard y -> isJust $ (cast a) `asTypeOf` y

— | Get a list of all entities that meet a predicate
listifyBut :: Typeable r => (r -> Bool) -> HList -> GenericQ [r]
listifyBut p q
  = everythingBut (++) q ([] `mkQ` (\x -> if p x then [x] else []))

Now listify takes a HList of types not to inspect. HList is a Heterogeneous list, so it’ll allow things of different types inside it. Finding the Name is now as simple as:

instance Finder (HsGroup Name) where
    findName grp a b = findName (listifyBut (isName a b) [Guard (undefined :: Maybe PostTcType)] grp) a b

once we have the names, we can just call getInfo. Nothing else is needed because remember that all API calls have a Context as argument, for instance the full type of the tooltip function is:

— @@ Export
— | Gather information about the identifier you requested
–   .
–   Context: The session for this call, Serves as a cache
–   .
–   String : The name of the identifier to lookup
–   .
–   SrcSpan: The location of the identifier in the sourcefile
–   .
–   Bool   : Whether to treat this call as a strict one. If it’s strict
–            Then the name AND span must match. If it’s not, Any match will do
–   .    
getTooltip :: Context -> String -> SrcSpan -> Bool -> IO (Maybe String)

This produces the following result


There’s a problem however, if you hover over a variable name that’s defined in the body of the function it produces a runtime panic:


If you think about it, this kind of makes sense, GHCi also won’t produce anything on local variables. In fact you can’t even refer to them. But we would at the least we would like to prevent this crash, and in the best case scenario we would like *some* information on the symbol.

After poking around some I noticed that the type of the identifiers that produce the errors are “Internal Name” values. the function nameModule then fails on these types. The plan now is, whenever we find a Internal Name, we look into the TypecheckedModule to find the Id associated with the Name value we retrieved earlier. with SYB this is again easy. However there’s a catch (thanks to nominolo for pointing this out): we should not enter any PostTcKind nor NameSet because these are blank after type checking.

findID :: Data a => a -> Name -> [Id]
findID a n = listifyBut ((n==) . getName) [Guard (undefined :: Maybe NameSet)
                                                                       ,Guard (undefined :: Maybe PostTcKind)] a

and that’s all. The end result is that this now works on local variables as well. Hovering over for instance the variable file generates


The important thing to note here is the Context , it’ll contain a cache of information. So looking up any of this stuff will be instantaneous. You just hover and directly get back information.

A last cool but *I’m not so sure how useful* function is that if you select something, then hover over it, it will type check only that expression.


so if you have an expression "fmap foo” somewhere but don’t remember what type foo or fmap is, just select them and hover over the selection. (although this is somewhat limited, all identifiers have to be top-level. It can’t return anything for local variables. sorry Sad smile )

And that’s it for this post, I’ll continue the work on Cabal now, or continue this track and fully finish intellisense.

No video for this in action, since I have a cold Confused smile

A new video

This is just an intermediate update, showing the near final UI. Cabal support is what I’m working on now and that is about 20% finished. I hope to get that done in a week or so barring any more difficulties.

If I do get cabal support I can put out a beta (without intellisense) just so I can get some feedback.

the video link is

Pardon the background noise, I’m currently in a lot of wind.


Added a screenshot for good measure, Click on the image for fullscreen.

Ghost typing

This is the preliminary version of the Ghost typing addition to visual Haskell, the idea is that whenever an explicit type signature is not given, the  IDE will display the type inferred by GHC.

You can then click on the signature to insert it, or use the smart action associated with the name of the function.

Up next is the feature that when you have given a signature that doesn’t type check, the IDE will remove that signature and retry, if it succeeds the IDE will display a suggested signature.

Below is a GIF of how the first part works.


Oh and collapsible regions has been finished as well Smile 

if the function has a type signature it will collapse at the end of that declaration, if not it’ll collapse at the end of the function name.

There is a restriction to this however since GHC allows you to declare your signatures anywhere in the file. In order for the signature to be considered part of the function by collapsible regions it has to be end on the line before the binding.

Which means it can span multiple lines, the end just has to be before the binding, that way it also supports haddock documented type signatures.

Collapsible regions.–Video Update #2

So here’s the second video update, it shows some of the current progress up till now, while it might not look significantly different but there’s a big difference under the hood. A lot has been rewritten and optimized in anticipations of new features coming soon, like intellisense and ghost typing (coming in the next video, due in a few days). Click the link below to see the video

Collapsible Regions

That’s it for today, and now… back to my thesis project.

Partial typechecking

Ok  so I just finished partial type checking, It now reports back type errors and information about a module whenever it can.

I also uploaded a video to show the most common way to navigate a file with that information. But do note a few things

  • The “look” is not final. e.g. it’s missing theming on the dropdown among others, and icons to show the type of item. These will be added next but wanted to get this out.
  • The delay at the initial check is artificially imposed. So I can debug. It usually has the information before the file loads.
  • And I have a cold, So forgive my stuffy voice 🙂

I think I’ll make my self imposed end of July deadline. Like I mentioned before, the first version aims to match features with the original visual haskell.

Context sensitive lexing

This is something I haven’t seen in other Haskell IDEs before but which to me would be useful:

Context sensitive lexing, as in the lexer wil treat certain tokens differently based on information defined globally, e.g LANGUAGE Pragmas.

But first a quick recap of how lexing is done in visual haskell 2010:

  • The IDE will ask me to color text one line at a time
  • Everytime I want to color a line I make a call to HsLexer.dll which is a binding to the GHC Api, which calls the GHC lexer directly.
  • Multiline comments are handles in a local state and are never passed to the lexer because since I’m lexing one line at a time, I won’t be able to find the boundaries of the comment blocks like that, so instead I just keep track of the comment tokens {- and –} and identify blocks using a local algorithm that mimics the matching done by GHC.
    Using that I was always able to color GHC Pragmas a different color than normal comments, the reason for this is that they have special meaning, so I’m depicting them as such.

The original code for lexing on the Haskell side was

— @@ Export
— | perform lexical analysis on the input string.
lexSourceString :: String -> IO (StatelessParseResult [Located Token])
lexSourceString source = 
   buffer <- stringToStringBuffer source
   let srcLoc  = mkSrcLoc (mkFastString "internal:string") 1 1
   let dynFlag = defaultDynFlags
   let result  = lexTokenStream buffer srcLoc dynFlag
   return $ convert result

pretty straight forward, I won’t really be explaining what everything does here, but what’s important is that we need to somehow add the LANGUAGE pragma entries into the dynFlag value above.

To that end, I created a new function

— @@ Export
— | perform lexical analysis on the input string and taking in a list of extensions to use in a newline seperated format
lexSourceStringWithExt :: String -> String -> IO (StatelessParseResult [Located Token])
lexSourceStringWithExt source exts = 
   buffer <- stringToStringBuffer source
   let srcLoc  = mkSrcLoc (mkFastString "internal:string") 1 1
   let dynFlag = defaultDynFlags
   let flagx   = flags dynFlag
   let result  = lexTokenStream buffer srcLoc (dynFlag { flags = flagx ++ configureFlags (lines exts) })
   return $ convert result

which gets the list of Pragmas to enable in a newline \n delimited format. The reason for this is that WinDll currently does not support Lists marshalling properly. It’ll be there in the final version at which point I would have rewritten these parts as well. But until then this would suffice.

the function seen above

configureFlags :: [String] -> [DynFlag]

is used to convert from the list of strings to a list of recognized DynFlag that effect lexing.

Now on to the C# side, Information I already had was the location of the multi comment sections, so all I needed to do was, on any change filter out those sections which I already know to be a Pragma (I know this because I color them differently remember)

But since the code that tracks sections is generic I did not want to hardcode this, so instead I created the following event and abstract methods

public delegate void UpdateDirtySections(object sender, Entry[] sections);
public event UpdateDirtySections DirtyChange;

/// <summary>
/// Raise the dirty section events by filtering the list with dirty spans to reflect
/// only those spans that are not the DEFAULT span
/// </summary>
protected abstract void notifyDirty();

/// <summary>
/// A redirect code for raising the internal event
/// </summary>
/// <param name="list"></param>
internal void raiseNotifyDirty(Entry[] list)
    if (DirtyChange != null)
        DirtyChange(this, list);

and the specific implementation of  notifyDirty for the CommentTracker is

protected override void notifyDirty()
    Entry[] sections = (Entry[])list.Where(x => x.isClosed && !(x.tag is CommentTag)).ToArray();

Meaning we only want those entries that are Not the normal CommentTag and that are closed, i.e. having both the start and end values filled in. (the comment tracking algorithm tracks also unclosed comment blocks, It needs to in order to do proper matching as comments get broken or introduced)

The only thing left now is to make subscribe to this event from the Tagger that produces syntax highlighting and react to it. My specific implementation does two things, It keeps track of the current collection of pragmas and the previous collection.

then it makes a call to checkNewHLE to see whether we have introduces or removed a valid syntax pragma. If this is the case, it asks for the entire file to be re-colored.

This call to checkNewHLE is important, since when the user is modifying an already existing pragma tag,

for instance adding TypeFamilies  into the pragmas {-# LANGUAGE TemplateHaskell #-} we get notified for every keypress the user makes, but untill the whole keyword TypeFamilies has been types there’s no point in re-coloring the whole file.

The result of this can be seen below and I find it very cool to be frank 😀

What it looks like with no pragmas


now look at what happens when we enable TemplateHaskell  and TypeFamilies


notice how with the extensions enabled “family” and “[|” , “|]” now behave like different keywords, this should be usefull to notify the programmer when he’s using certain features. For instance, with TypeFamilies enabled line 6 would no longer be valid because “family” is now a keyword.

Finding the current buffer’s filename

I was recently faced with the problem that in order for me to be able to send a file off to GHC for type checking and parsing (not in that order) I would need to know the full filename.

But the problem is, the only thing I have if a ITextBuffer object. Luckily, almost every object in Visual Studio 2010 has a “Properties” well, property.

So after looking around I found out that this collection contains the ITextDocument object i so desperately need. But ran into one problem. This is a dictionary so logically I would need the key of that object.

The irritation here was that the Key for this object seems to be an type, but How would I create a ITextDocument type? just using ITextDocument as a type isn’t correct, and because I just have the interface, I can’t call GetType() on it. Now I was stuck, having no idea how to construct the key.

Fortunately I realize that I would only need to look this up once, when my Tagger is initialized. So I decided to just do a linear lookup in the dictionary and select the first matching type.

It’s arguably not the way it should be done, But should be fine for my purposes, the code ended up looking like

/// <summary>
/// Finds the first value with the specified type inside the property bag.
/// This is used because I don’t know how to get the Visual Studio instantiated
/// types out of the bag. So I’m doing runtime matching. It would only be done once
/// per buffer so shouldn’t be too bad.
/// </summary>
/// <typeparam name="T">Type of the result</typeparam>
/// <param name="buffer">buffer to look in</param>
/// <returns>Object of the requested type</returns>
/// <exception cref="InvalidOperationException">Gets thrown if the type is not found inside the property dictionary</exception>
public static T getPropertyFromBuffer<T>(ITextBuffer buffer)
    foreach (var item in buffer.Properties.PropertyList)
        if (item.Value is T)
            return (T)item.Value;
    throw new InvalidOperationException("The specified type could not be found inside the property bag");

So at runtime it uses the generic type T to do lookups, a simple use of this would be

this.document = Utils.EditorUtils.getPropertyFromBuffer<ITextDocument>(this.buffer);

and that’s how I lookup my ITextDocument object 🙂

Working around ghc’s lexer’s layout rule

While implementing coloring for Haskell files I noticed that lines with more closing braces (either ‘)’ or ‘}’) were not being colored.

After doing some digging around I found out the following:











So apparently they were throwing lexical errors, but why?

After contacting Mr. Simon Marlow I was told that this is the handling of GHC’s layout rule. to quote

“You’re probably encountering the lexer’s handling of the Haskell “layout” rule.  When the lexer sees a ‘}’ token, it pops the current layout stack, and if the layout stack is empty then this is a lexical error.”

This left me with 3 choices

  1. Use a custom lexer much like the original visual haskell did
  2. Replace all {, },( and ) with 1 whitespace character so that they won’t be colored, but the rest of the input will, but the positions would be preserved.
  3. left pad the input with enough opening braces to have the lexer succeed in parsing then adjust the ranges.

Option 1 was the least maintainable, since I would have to keep updating the lexer everytime the one in ghc changes. So I didn’t want to do this.

Option 2 was a possibility, one which I tried out before, But I noticed that having the braces colored really did help.

Option 3 was then chosen by process of elimination. It turned out to not be that much work at all.

private int prepareLine(ref string str)
    int round= 0, brace = 0;

    for (int i = 0; i < str.Length; i++)
        switch (str[i])
            case ‘}’:
                if(i==0 || !(str[i-1]==’-‘))
            case ‘)’:

    if(round > 0)
        str = str.PadLeft(str.Length + round, ‘(‘);

    if (brace > 0)
        str = str.PadLeft(str.Length + brace, ‘{‘);

    return round + brace;


is the full implementation. Now I know what you’re thinking, By doing this I’ll create more opening than closing braces. So a balanced line like (Int) becomes unbalanced ((Int). However this is not a problem, Since for my coloring braces carry no semantics. I don’t care what they mean (as in, when interpreted) all I care about is what they are (as in the token type).

With that in place, the only other code needed is to skip the first n number of tokens returned from the lexer, where n is the result of calling the prepareLine function.

And that’s all, Now we have perfect line coloring everywhere 🙂


Changing default settings

The visual studio editor has a bunch of build in settings you can turn on and off per editor.

Since I’m writing a language service for Haskell, I would like to enable replacing tabs with spaces, set the amount of spaces to 4 and turn on line numbering

After a bit of searching I found a blog post from Noah Ric a developer at Microsoft about disabling zooming in a document window.

So using this as a basis I created the following:

internal class ViewCreationListener : IWpfTextViewCreationListener
    public void TextViewCreated(IWpfTextView textView)
        textView.Options.SetOptionValue(DefaultWpfViewOptions.EnableHighlightCurrentLineId, false);
        textView.Options.SetOptionValue(DefaultOptions.TabSizeOptionId, 4);
        textView.Options.SetOptionValue(DefaultOptions.ConvertTabsToSpacesOptionId, true);
        textView.Options.SetOptionValue(DefaultTextViewHostOptions.LineNumberMarginId, true);


(sorry no syntax highlighting as I can’t find a theme on here that won’t break it)

Notice the “PredefinedTextViewRoles.Zoomable” , I wanted to use PredefinedTextViewRoles.Document but when doing this the IDE would randomly throw Exceptions saying that the Editor wasn’t fully created yet. Which is odd since Visual Studio is doing all the initializations.

The set of values you can change this way are listed here:

For more things you change take a look at