Tamar Christina 04:07 on August 9, 2010

QuickInfo

Visual Studio can show information about a symbol when you hover over it; this feature is called “QuickInfo”.

This essentially means that you can hover over a symbol like “fmap” and it will tell you that fmap :: forall a b (f :: * -> *). (Functor f) => (a -> b) -> f a -> f b and that it’s defined in GHC.Base.

In GHCi this is equivalent to typing :i fmap, which produces the following output:

class Functor f where
  fmap :: (a -> b) -> f a -> f b
  …
-- Defined in GHC.Base

Whenever the user hovers over a symbol in Visual Studio, the IDE calls the method

public void AugmentQuickInfoSession(IQuickInfoSession session, IList<object> qiContent, out ITrackingSpan applicableToSpan)

I use the information passed to this method to construct two things:

The word the user is hovering on
The exact location within the source file of that word

This information is used to find the correct Name value in the Haskell renamed AST. The problem is that we can’t construct Name values ourselves, so we have to look them up. The lookup is provided by a typeclass:

class Finder a where
  findName :: MonadPlus m => a -> FastString -> Maybe SrcSpan -> m Name

The monad used determines how many results you receive. Use the Maybe monad and you’ll get just one. Use the List monad and you’ll get more than one, but only if you don’t specify a source span to look for (a wildcard match on the name alone).
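To make the effect of the monad choice concrete, here is a minimal, self-contained sketch. It does not touch the GHC API: firstOrAll is a hypothetical stand-in for findName that searches a plain list instead of the renamed AST, and all names in it are made up for illustration.

```haskell
import Control.Monad (MonadPlus, mplus, mzero)

-- | Hypothetical stand-in for findName: offer every element that satisfies
-- the predicate, combined with mplus, so that the caller's monad decides
-- how many of the matches survive.
firstOrAll :: MonadPlus m => (a -> Bool) -> [a] -> m a
firstOrAll p = foldr step mzero
  where
    step x rest
      | p x       = return x `mplus` rest
      | otherwise = rest

-- In Maybe, mplus is left-biased, so only the first match is kept.
oneMatch :: Maybe Int
oneMatch = firstOrAll even [1, 2, 3, 4]

-- In the list monad, mplus is (++), so every match is kept.
allMatches :: [Int]
allMatches = firstOrAll even [1, 2, 3, 4]
```

The same function body serves both callers; only the type annotation at the use site changes.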

However, we should never enter the PostTcType values inside the renamed AST, because they are invalid at this stage. Unfortunately SYB’s listify does not provide a way to tell it not to enter a specific type.

So we create a modified version of those SYB calls:

data Guard where
  Guard :: Typeable a => Maybe a -> Guard

type HList = [Guard]

-- | Summarise all nodes in top-down, left-to-right order
everythingBut :: (r -> r -> r) -> HList -> GenericQ r -> GenericQ r
everythingBut k q f x
  = foldl k (f x) fsp
    where fsp = case isPost x q of
                  True  -> []
                  False -> gmapQ (everythingBut k q f) x

isPost :: Typeable a => a -> HList -> Bool
isPost a = or . map check
  where check :: Guard -> Bool
        check x = case x of
                    Guard y -> isJust $ (cast a) `asTypeOf` y

-- | Get a list of all entities that meet a predicate
listifyBut :: Typeable r => (r -> Bool) -> HList -> GenericQ [r]
listifyBut p q
  = everythingBut (++) q ([] `mkQ` (\x -> if p x then [x] else []))

Now listifyBut takes an HList of types not to inspect. HList is a heterogeneous list, so it can hold values of different types. Finding the Name is now as simple as:
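For readers who want to try the idea outside GHC, the following is a small sketch that uses only Data.Data from base. The Expr and Note types are hypothetical, and the guard is expressed as a rank-2 predicate rather than the Guard/HList encoding above, so this is an illustration of the technique and not the code used in the plugin.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data, Typeable, cast, gmapQ)
import Data.Maybe (isJust)

-- Hypothetical toy AST: Note plays the role of a subtree we must not enter.
newtype Note = Note [Int] deriving Data
data Expr = Lit Int | Add Expr Expr | Ann Note Expr deriving Data

-- | Collect every value of type r that satisfies p. Each node is queried,
-- but the children of a node for which skip holds are never visited.
listifySkipping :: (Typeable r, Data a)
                => (forall d. Data d => d -> Bool) -> (r -> Bool) -> a -> [r]
listifySkipping skip p x = here ++ below
  where
    here  = case cast x of
              Just r | p r -> [r]
              _            -> []
    below | skip x    = []
          | otherwise = concat (gmapQ (listifySkipping skip p) x)

-- | True when the node is a Note.
isNote :: Data d => d -> Bool
isNote d = isJust (cast d :: Maybe Note)

sample :: Expr
sample = Ann (Note [1, 2]) (Add (Lit 3) (Lit 4))

-- Every Int in the tree, and only those outside Note subtrees.
everyInt, intsOutsideNotes :: [Int]
everyInt         = listifySkipping (const False) (const True) sample
intsOutsideNotes = listifySkipping isNote (const True) sample
```

As in everythingBut, the guarded node itself is still handed to the query; only its children are skipped.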

instance Finder (HsGroup Name) where
  findName grp a b = findName (listifyBut (isName a b) [Guard (undefined :: Maybe PostTcType)] grp) a b

Once we have the names, we can just call getInfo. Nothing else is needed because, remember, all API calls take a Context as an argument. For instance, the full type of the tooltip function is:

-- @@ Export
-- | Gather information about the identifier you requested
--   .
--   Context: The session for this call, Serves as a cache
--   .
--   String : The name of the identifier to lookup
--   .
--   SrcSpan: The location of the identifier in the sourcefile
--   .
--   Bool   : Whether to treat this call as a strict one. If it’s strict
--            Then the name AND span must match. If it’s not, Any match will do
--   .
getTooltip :: Context -> String -> SrcSpan -> Bool -> IO (Maybe String)

This produces the following result

There’s a problem, however: if you hover over a variable name that’s defined in the body of a function, it produces a runtime panic:

If you think about it, this kind of makes sense. GHCi also won’t produce anything for local variables; in fact you can’t even refer to them. But at the very least we would like to prevent this crash, and in the best case we would like *some* information on the symbol.

After poking around a bit I noticed that the identifiers that produce the errors are “Internal Name” values, and the function nameModule fails on these. The plan now is: whenever we find an Internal Name, we look into the TypecheckedModule to find the Id associated with the Name value we retrieved earlier. With SYB this is again easy. However, there’s a catch (thanks to nominolo for pointing this out): we should not enter any PostTcKind or NameSet, because these are blank after type checking.

findID :: Data a => a -> Name -> [Id]
findID a n = listifyBut ((n==) . getName) [Guard (undefined :: Maybe NameSet)
                                          ,Guard (undefined :: Maybe PostTcKind)] a

And that’s all. The end result is that this now works on local variables as well. Hovering over, for instance, the variable file generates

The important thing to note here is the Context: it contains a cache of information, so looking up any of this stuff is instantaneous. You just hover and directly get back information.

A last cool (but *I’m not so sure how useful*) feature is that if you select something and then hover over it, it will type check only that expression.

So if you have an expression “fmap foo” somewhere but don’t remember what type foo or fmap is, just select them and hover over the selection. (This is somewhat limited, though: all identifiers have to be top-level. It can’t return anything for local variables, sorry.)

And that’s it for this post. I’ll continue the work on Cabal now, or continue this track and fully finish IntelliSense.

No video for this in action, since I have a cold

Tamar Christina 17:57 on May 19, 2010

Context sensitive lexing

This is something I haven’t seen in other Haskell IDEs before but which to me would be useful:

Context sensitive lexing, as in the lexer will treat certain tokens differently based on information defined globally, e.g. LANGUAGE pragmas.

But first a quick recap of how lexing is done in Visual Haskell 2010:

The IDE asks me to color text one line at a time.
Every time I want to color a line I make a call to HsLexer.dll, which is a binding to the GHC API that calls the GHC lexer directly.
Multiline comments are handled in local state and are never passed to the lexer: since I’m lexing one line at a time, the lexer can’t find the boundaries of a comment block. Instead I keep track of the comment tokens {- and -} and identify blocks using a local algorithm that mimics the matching done by GHC.

Using that, I was always able to color GHC pragmas differently from normal comments. The reason for this is that they have special meaning, so I’m depicting them as such.
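The local comment tracking can be pictured with a small sketch. This is not the tracker used in the plugin (that one lives on the C# side and records block boundaries); scanLine and depthsAfter are hypothetical names, and the sketch deliberately ignores string literals and -- line comments. It only shows how a nesting depth carried from one line to the next is enough to know whether a line starts inside a block comment.

```haskell
-- | Scan one line, starting at nesting depth d, and return the nesting
-- depth at the end of the line. Block comments nest, as they do in GHC.
scanLine :: Int -> String -> Int
scanLine d ('{' : '-' : rest)         = scanLine (d + 1) rest
scanLine d ('-' : '}' : rest) | d > 0 = scanLine (d - 1) rest
scanLine d (_ : rest)                 = scanLine d rest
scanLine d []                         = d

-- | Nesting depth after each line of a file, carrying the state from one
-- line to the next. A line whose predecessor ends at depth > 0 starts
-- inside a comment.
depthsAfter :: [String] -> [Int]
depthsAfter = drop 1 . scanl scanLine 0
```

For example, the three lines "a {- b", "c {- d -}" and "e -} f" end at depths 1, 1 and 0 respectively, so the second and third lines both start inside a comment.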

The original code for lexing on the Haskell side was

-- @@ Export
-- | perform lexical analysis on the input string.
lexSourceString :: String -> IO (StatelessParseResult [Located Token])
lexSourceString source =
  do
    buffer <- stringToStringBuffer source
    let srcLoc = mkSrcLoc (mkFastString "internal:string") 1 1
    let dynFlag = defaultDynFlags
    let result = lexTokenStream buffer srcLoc dynFlag
    return $ convert result

Pretty straightforward. I won’t explain what everything does here; what’s important is that we need to somehow add the LANGUAGE pragma entries to the dynFlag value above.

To that end, I created a new function

-- @@ Export
-- | perform lexical analysis on the input string, taking in a list of extensions to use in a newline separated format
lexSourceStringWithExt :: String -> String -> IO (StatelessParseResult [Located Token])
lexSourceStringWithExt source exts =
  do
    buffer <- stringToStringBuffer source
    let srcLoc = mkSrcLoc (mkFastString "internal:string") 1 1
    let dynFlag = defaultDynFlags
    let flagx   = flags dynFlag
    let result = lexTokenStream buffer srcLoc (dynFlag { flags = flagx ++ configureFlags (lines exts) })
    return $ convert result

which gets the list of pragmas to enable in a newline (\n) delimited format. The reason for this is that WinDll currently does not marshal lists properly. That will be there in the final version, at which point I will rewrite these parts as well. Until then this will suffice.

The function seen above,

configureFlags :: [String] -> [DynFlag]

is used to convert the list of strings to a list of recognized DynFlag values that affect lexing.

Now on to the C# side. I already had the locations of the multiline comment sections, so all I needed to do was, on any change, filter out those sections which I already know to be pragmas (I know this because I color them differently, remember).

But since the code that tracks sections is generic I did not want to hardcode this, so instead I created the following event and abstract methods:

public delegate void UpdateDirtySections(object sender, Entry[] sections);
public event UpdateDirtySections DirtyChange;

/// <summary>
/// Raise the dirty section events by filtering the list with dirty spans to reflect
/// only those spans that are not the DEFAULT span
/// </summary>
protected abstract void notifyDirty();

/// <summary>
/// A redirect code for raising the internal event
/// </summary>
/// <param name="list"></param>
internal void raiseNotifyDirty(Entry[] list)
{
    if (DirtyChange != null)
        DirtyChange(this, list);
}

and the specific implementation of notifyDirty for the CommentTracker is

protected override void notifyDirty()
{
    Entry[] sections = (Entry[])list.Where(x => x.isClosed && !(x.tag is CommentTag)).ToArray();
    base.raiseNotifyDirty(sections);
}

Meaning we only want those entries that are not the normal CommentTag and that are closed, i.e. have both the start and end values filled in. (The comment tracking algorithm also tracks unclosed comment blocks; it needs to in order to do proper matching as comments get broken or introduced.)

The only thing left now is to subscribe to this event from the Tagger that produces syntax highlighting and react to it. My specific implementation keeps track of two things: the current collection of pragmas and the previous collection.

It then makes a call to checkNewHLE to see whether we have introduced or removed a valid syntax pragma. If this is the case, it asks for the entire file to be re-colored.

This call to checkNewHLE is important because of what happens when the user modifies an already existing pragma. For instance, when adding TypeFamilies to the pragma {-# LANGUAGE TemplateHaskell #-} we get notified for every keypress the user makes, but until the whole keyword TypeFamilies has been typed there’s no point in re-coloring the whole file.
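The check can be sketched as a comparison of the sets of recognized extensions before and after an edit. The real checkNewHLE lives on the C# side and is not shown in this post, so everything below is an assumption made for illustration: the names, the tiny knownExtensions table, and the simplification that pragma delimiters are separated from the names by whitespace.

```haskell
import Data.List (nub, sort)

-- | Hypothetical subset of extensions that change how tokens are lexed.
knownExtensions :: [String]
knownExtensions = ["TemplateHaskell", "TypeFamilies"]

-- | Names listed in one pragma such as "{-# LANGUAGE A, B #-}".
-- Anything that is not a LANGUAGE pragma yields no names.
pragmaNames :: String -> [String]
pragmaNames s = case words (map commaToSpace s) of
  ("{-#" : "LANGUAGE" : rest) -> takeWhile (/= "#-}") rest
  _                           -> []
  where
    commaToSpace c = if c == ',' then ' ' else c

-- | The recognized extensions switched on by a collection of pragmas,
-- in a canonical order so two collections can be compared directly.
activeExtensions :: [String] -> [String]
activeExtensions =
  sort . nub . filter (`elem` knownExtensions) . concatMap pragmaNames

-- | Re-color only when the set of recognized extensions actually changed.
needsRecolor :: [String] -> [String] -> Bool
needsRecolor old new = activeExtensions old /= activeExtensions new
```

With this, going from {-# LANGUAGE TemplateHaskell #-} to {-# LANGUAGE TemplateHaskell, TypeFam #-} does not trigger a re-color, because the half-typed name is not recognized, while completing it to TypeFamilies does.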

The result of this can be seen below and I find it very cool to be frank 😀

What it looks like with no pragmas

Now look at what happens when we enable TemplateHaskell and TypeFamilies

Notice how, with the extensions enabled, “family”, “[|” and “|]” now behave like different keywords. This should be useful for notifying programmers when they’re using certain features. For instance, with TypeFamilies enabled line 6 would no longer be valid because “family” is now a keyword.

Tamar Christina 20:55 on May 18, 2010

Finding the current buffer’s filename

I was recently faced with the problem that, in order to send a file off to GHC for type checking and parsing (not in that order), I need to know the full filename.

But the problem is, the only thing I have is an ITextBuffer object. Luckily, almost every object in Visual Studio 2010 has a “Properties”, well, property.

After looking around I found out that this collection contains the ITextDocument object I so desperately need, but I ran into one problem: this is a dictionary, so logically I would need the key of that object.

The irritation here was that the key for this object seems to be a type, but how would I create an ITextDocument type? Just using ITextDocument as a type isn’t correct, and because I only have the interface, I can’t call GetType() on it. Now I was stuck, having no idea how to construct the key.

Fortunately I realized that I would only need to look this up once, when my Tagger is initialized. So I decided to just do a linear lookup in the dictionary and select the first matching type.

It’s arguably not the way it should be done, but it should be fine for my purposes. The code ended up looking like

/// <summary>
/// Finds the first value with the specified type inside the property bag.
/// This is used because I don’t know how to get the Visual Studio instantiated
/// types out of the bag. So I’m doing runtime matching. It would only be done once
/// per buffer so shouldn’t be too bad.
/// </summary>
/// <typeparam name="T">Type of the result</typeparam>
/// <param name="buffer">buffer to look in</param>
/// <returns>Object of the requested type</returns>
/// <exception cref="InvalidOperationException">Gets thrown if the type is not found inside the property dictionary</exception>
public static T getPropertyFromBuffer<T>(ITextBuffer buffer)
{
    foreach (var item in buffer.Properties.PropertyList)
    {
        if (item.Value is T)
            return (T)item.Value;
    }
    throw new InvalidOperationException("The specified type could not be found inside the property bag");
}

So at runtime it uses the generic type T to do the lookup. A simple use of this would be

this.document = Utils.EditorUtils.getPropertyFromBuffer<ITextDocument>(this.buffer);

and that’s how I look up my ITextDocument object 🙂
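For Haskell readers, the same “first value of the requested type” lookup can be sketched with Data.Dynamic from base. This is only an analogy to the C# helper above: the bag here is a plain list of Dynamic values, not the Visual Studio property collection, and firstOfType and sampleBag are hypothetical names.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Maybe (listToMaybe, mapMaybe)
import Data.Typeable (Typeable)

-- | Linear scan over a heterogeneous bag, returning the first value whose
-- runtime type matches the type requested by the caller.
firstOfType :: Typeable a => [Dynamic] -> Maybe a
firstOfType = listToMaybe . mapMaybe fromDynamic

-- A made-up bag holding values of three different types.
sampleBag :: [Dynamic]
sampleBag = [toDyn (42 :: Int), toDyn "Main.hs", toDyn True]

-- The type annotation plays the role of the generic parameter T.
fileName :: Maybe String
fileName = firstOfType sampleBag
```

Unlike the C# version, a missing type is reported as Nothing rather than by throwing an exception.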

Tamar Christina 19:50 on April 29, 2010

Working around ghc’s lexer’s layout rule

While implementing coloring for Haskell files I noticed that lines with more closing than opening brackets (either ‘)’ or ‘}’) were not being colored.

After doing some digging around I found out the following:

?parseLine("{")->tag

cStatelessParseResultSOk

?parseLine("}")->tag

cStatelessParseResultSFailed

and

?parseLine("{-")->tag

cStatelessParseResultSOk

?parseLine("-}")->tag

cStatelessParseResultSFailed

So apparently they were throwing lexical errors, but why?

After contacting Mr. Simon Marlow I was told that this is the handling of GHC’s layout rule. To quote:

“You’re probably encountering the lexer’s handling of the Haskell “layout” rule. When the lexer sees a ‘}’ token, it pops the current layout stack, and if the layout stack is empty then this is a lexical error.”

This left me with 3 choices:

Use a custom lexer, much like the original Visual Haskell did.
Replace all {, }, ( and ) with a single whitespace character, so that they won’t be colored but the rest of the input will, and positions are preserved.
Left pad the input with enough opening braces for the lexer to succeed, then adjust the ranges.

Option 1 was the least maintainable, since I would have to keep updating the lexer every time the one in GHC changes. So I didn’t want to do this.

Option 2 was a possibility, one which I had tried out before, but I noticed that having the braces colored really did help.

Option 3 was then chosen by process of elimination. It turned out to not be that much work at all.

private int prepareLine(ref string str)
{
    int round= 0, brace = 0;

    for (int i = 0; i < str.Length; i++)
    {
        switch (str[i])
        {
            case '}':
                if(i==0 || !(str[i-1]=='-'))
                    brace++;
                break;
            case ')':
                round++;
                break;
            default:
                break;
        }
    }

    if(round > 0)
        str = str.PadLeft(str.Length + round, '(');

    if (brace > 0)
        str = str.PadLeft(str.Length + brace, '{');

    return round + brace;
}

is the full implementation. Now I know what you’re thinking: by doing this I’ll create more opening than closing braces, so a balanced line like (Int) becomes the unbalanced ((Int). However, this is not a problem, since for my coloring braces carry no semantics. I don’t care what they mean (as in, when interpreted); all I care about is what they are (as in, the token type).

With that in place, the only other code needed is to skip the first n tokens returned from the lexer, where n is the result of calling the prepareLine function.

And that’s all. Now we have perfect line coloring everywhere 🙂
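The padding and the token skipping can be seen together in a small Haskell port of the idea. This is a sketch for illustration, not the code used in the plugin: the tokens are stood in for by a plain list, and skipPadding is a hypothetical name for the “drop the first n tokens” step.

```haskell
-- | Pad a line with one opener per closer so the lexer's layout stack
-- never runs empty. Returns the number of characters added together with
-- the padded line. A '}' directly after '-' closes a comment and is not
-- counted, matching the C# version above.
prepareLine :: String -> (Int, String)
prepareLine str =
  (rounds + braces, replicate braces '{' ++ replicate rounds '(' ++ str)
  where
    rounds = length (filter (== ')') str)
    braces = length [ () | (prev, '}') <- zip (' ' : str) str, prev /= '-' ]

-- | Each padding character lexes as exactly one token, so dropping the
-- first n tokens removes the padding again.
skipPadding :: Int -> [tok] -> [tok]
skipPadding = drop
```

For example, prepareLine turns "foo) }" into "{(foo) }" with a count of 2, and leaves "-}" untouched with a count of 0. The column positions of the remaining tokens are shifted by the same count and have to be adjusted back.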

Tamar Christina 18:25 on April 28, 2010

Changing default settings

The Visual Studio editor has a bunch of built-in settings you can turn on and off per editor.

Since I’m writing a language service for Haskell, I would like to enable replacing tabs with spaces, set the tab size to 4 spaces and turn on line numbering.

After a bit of searching I found a blog post from Noah Ric, a developer at Microsoft, about disabling zooming in a document window.

http://blogs.msdn.com/noahric/archive/2010/03/18/disabling-mouse-wheel-zoom-through-ieditoroptions.aspx

So using this as a basis I created the following:

[Export(typeof(IWpfTextViewCreationListener))]
[ContentType("haskell")]
[TextViewRole(PredefinedTextViewRoles.Zoomable)]
internal class ViewCreationListener : IWpfTextViewCreationListener
{
    public void TextViewCreated(IWpfTextView textView)
    {
        textView.Options.SetOptionValue(DefaultWpfViewOptions.EnableHighlightCurrentLineId, false);
        textView.Options.SetOptionValue(DefaultOptions.TabSizeOptionId, 4);
        textView.Options.SetOptionValue(DefaultOptions.ConvertTabsToSpacesOptionId, true);
        textView.Options.SetOptionValue(DefaultTextViewHostOptions.LineNumberMarginId, true);
    }
}

(Sorry, no syntax highlighting, as I can’t find a theme on here that won’t break it.)

Notice the “PredefinedTextViewRoles.Zoomable”. I wanted to use PredefinedTextViewRoles.Document, but when doing this the IDE would randomly throw exceptions saying that the editor wasn’t fully created yet, which is odd since Visual Studio is doing all the initialization.

The set of values you can change this way are listed here: http://msdn.microsoft.com/en-us/library/microsoft.visualstudio.text.editor.ieditoroptions_members(v=VS.100).aspx

For more things you can change, take a look at http://msdn.microsoft.com/en-us/library/ee818135.aspx

Tamar's Blog

My adventures in extending Visual Studio and general development on the GHC Haskell Compiler

Category Archives: VSX