Intellisense Part 1–Haskell Package Interface


As most of you who have been following this blog know, IntelliSense and Cabal support are the two features I have left. I decided to focus on IntelliSense first (even though Cabal support is easier), so this is the first in a series of posts on how I’ve decided to implement IntelliSense.

[Sidenote: University has started again, so I’m afraid I’ll only have time to work on this project on the weekends, at least when I’m coherent enough to :)]

IntelliSense, for those of you who don’t know, is Microsoft’s implementation of code completion; a small overview can be found [here]. The gist of it is that when the user starts typing in a relevant place, the IDE tries to help along by showing the identifiers and/or types currently in scope. To that end, Visual Haskell will support two types of scopes:

  • Function scopes: whenever you’re inside a function, you’ll get a list of every binding (both local and global), lambda variable and module in scope. Should you type a module name and a “.”, you’ll get the other module names you can choose, or the functions you can use qualified from that module, if any.
  • Type scopes: whenever you’re working inside a type signature, the list will limit itself to types that are currently in scope (along with modules again, of course).
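
The two scopes can be sketched as a small filter over completion candidates. This is only an illustration of the idea, not Visual Haskell’s actual code: the types Scope, Kind and Candidate and the function complete are all made-up names, and prefix matching is an assumption about how the typed text is compared.

```haskell
import Data.List (isPrefixOf)

-- | Where the cursor is when completion is requested (hypothetical model).
data Scope = FunctionScope | TypeScope
  deriving (Eq, Show)

-- | What kind of thing a completion candidate is.
data Kind = Binding | LambdaVar | ModuleName | TypeName
  deriving (Eq, Show)

data Candidate = Candidate { candName :: String, candKind :: Kind }
  deriving (Eq, Show)

-- | Which kinds are offered in which scope; modules show up in both.
allowed :: Scope -> Kind -> Bool
allowed _             ModuleName = True
allowed FunctionScope k          = k `elem` [Binding, LambdaVar]
allowed TypeScope     k          = k == TypeName

-- | Names visible in the given scope that start with what was typed so far.
complete :: Scope -> String -> [Candidate] -> [String]
complete scope typed cands =
  [ candName c
  | c <- cands
  , allowed scope (candKind c)
  , typed `isPrefixOf` candName c
  ]
```

With candidates for map, Maybe, Data.Map and a lambda variable x, asking for completions in a type scope would offer only Maybe and Data.Map, while a function scope would offer everything except Maybe.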

This is how I plan to implement code completion. If anyone has any requests or suggestions, please let me know now, since I can still change it for the initial release.

In order to implement IntelliSense I need to index all the packages currently installed for GHC, and keep that index up to date as you install new Cabal packages. Visual Haskell will ship with custom versions of cabal-install and ghc-pkg (and eventually a custom haddock as well, in order to generate Visual Studio help files), so keeping the index up to date should not be a problem.

I have still not decided how to store this information, but I’m leaning towards a structure with a spatial index, more specifically a BANG file. I believe this will allow me to do the different kinds of lookups I need while using a memory-mapped file.

But the first step is to get the information on your packages from ghc-pkg and ghc. This is then stored in a .hpi (Haskell Package Interface) file, which is just a very simplified version of the .hi files GHC uses. These files contain functions with their documentation, class declarations, instances and types. The reason for these files is twofold:

  • For the class browser we want to be able to browse packages (in a simplified manner), so these files contain all we need for now, along with the location of the actual .hi file in case we need it for more complex things later.
  • From these files I will generate the large IntelliSense database. That database will not contain any information on classes etc., so we need a way to get to these quickly (especially for things like code snippets).
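
To make the contents of a .hpi file a bit more concrete, here is a rough sketch of how they could be modelled. This is not the real file format: every type and field name below is invented for illustration, and the completionRows function only shows the idea that the IntelliSense database is derived from the functions and types while class information stays behind in the .hpi file.

```haskell
-- | One documented item: a function, class or type (hypothetical shape).
data Entry = Entry
  { entryName :: String        -- identifier, e.g. "fmap"
  , entrySig  :: String        -- pretty-printed type or declaration head
  , entryDoc  :: Maybe String  -- documentation, if any
  } deriving (Eq, Show)

-- | Simplified view of one package, as a .hpi file might hold it.
data PackageInterface = PackageInterface
  { hpiPackage   :: String     -- e.g. "containers-0.3.0.0"
  , hpiHiFile    :: FilePath   -- where the real .hi file lives
  , hpiFunctions :: [Entry]
  , hpiClasses   :: [Entry]
  , hpiInstances :: [String]
  , hpiTypes     :: [Entry]
  } deriving (Eq, Show)

-- | Flatten the parts a completion database would index: functions and
--   types, but no class information, each tagged with the owning package.
completionRows :: PackageInterface -> [(String, String, String)]
completionRows p =
  [ (hpiPackage p, entryName e, entrySig e)
  | e <- hpiFunctions p ++ hpiTypes p
  ]
```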

In any case, the first step is now complete: I can successfully generate .hpi files with all the content described above. It does this for my configuration, which contains:

C:/ghc/ghc-6.12.1\lib\package.conf.d:
    Cabal-1.8.0.2
    Win32-2.2.0.1
    array-0.3.0.0
    base-3.0.3.2
    base-4.2.0.0
    bin-package-db-0.0.0.0
    bytestring-0.9.1.5
    containers-0.3.0.0
    directory-1.0.1.0
    (dph-base-0.4.0)
    (dph-par-0.4.0)
    (dph-prim-interface-0.4.0)
    (dph-prim-par-0.4.0)
    (dph-prim-seq-0.4.0)
    (dph-seq-0.4.0)
    extensible-exceptions-0.1.1.1
    ffi-1.0
    filepath-1.1.0.3
    ghc-6.12.1
    (ghc-binary-0.5.0.2)
    ghc-prim-0.2.0.0
    haskell98-1.0.1.1
    hpc-0.5.0.4
    integer-gmp-0.2.0.0
    old-locale-1.0.0.2
    old-time-1.0.0.3
    pretty-1.0.1.1
    process-1.0.1.2
    random-1.0.0.2
    rts-1.0
    syb-0.1.0.2
    template-haskell-2.4.0.0
    time-1.1.4
    utf8-string-0.3.4

C:\Users\Phyx\AppData\Roaming\ghc\i386-mingw32-6.12.1\package.conf.d:
    Cabal-1.9.2
    HTTP-4000.0.9
    Hs2lib-0.2.2
    MonadCatchIO-mtl-0.3.0.1
    QuickCheck-2.1.0.3
    ansi-terminal-0.5.3
    binary-0.5.0.2
    colorize-haskell-1.0.1
    cpphs-1.11
    deepseq-1.1.0.0
    fgl-5.4.2.2
    ghc-mtl-1.0.1.0
    ghc-paths-0.1.0.6
    ghc-syb-0.2.0.0
    haddock-2.7.2
    haskell-lexer-1.0
    haskell-src-1.0.1.3
    haskell-src-exts-1.8.2
    haskell-src-exts-1.9.0
    hint-0.3.2.3
    mtl-1.1.0.2
    network-2.2.1.7
    parallel-2.2.0.1
    parsec-2.1.0.1
    primitive-0.3
    tar-0.3.1.0
    uuagc-0.9.10
    uuagc-0.9.14
    uuagc-0.9.23
    uuagc-0.9.26
    uulib-0.9.10
    uulib-0.9.12
    vector-0.6.0.2
    zlib-0.5.2.0

This takes about 39.16 seconds and swallows about 500 MB of RAM while maxing out a core. So users will most likely not notice this first step at all. A snapshot of what the inside of a .hpi file looks like:

image


Configurable Candy


Candy is a feature where you replace a selection of text with something else (usually also text). However, this is done in the view only, not in the actual file. This is useful to replace things like “->” with an actual Unicode arrow while still allowing other text editors that can’t handle Unicode to display the file correctly.

Leksah implements this and allows you to configure it via a so-called “Candy” file, so I “borrowed” their approach and extended it to suit my needs.

The general syntax of a Visual Haskell candy file (.vshc) is:

-- "<token>" <unicode> <modifier> <enabled> <FIT|NONE>
-- the token has to be quoted
-- the supported modifiers are
-- CODE    - Apply only to regions of code
-- COMMENT - Apply only inside comments
-- STRING  - Apply only in string literals
-- ALL     - Apply to all

The modifiers are self-explanatory, but the FIT and NONE options take some explaining.

When using the FIT option, the Candy engine won’t try to keep the same width as the text it’s replacing. This means that you get a layout change: the actual file might have “alpha” but the view will show only “α”.

With some things, especially keywords, we don’t want this; this is where the NONE option comes in. When it is used, the engine always matches the width of the text it’s replacing, by making the Unicode text larger and adding horizontal whitespace. This means that “alpha” would be rendered as “  α  ”, preserving the layout.
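
The difference between the two options can be sketched in a few lines. This is a simplified model, not the engine’s code: it only pads with spaces (it does not enlarge the glyph), and both the names Width and render and the choice to centre the glyph with any odd space going to the right are assumptions made for the example.

```haskell
-- | Width policy from the candy file: Fit lets the layout shrink,
--   None keeps the width of the original token.
data Width = Fit | None
  deriving (Eq, Show)

-- | Text shown in the view for a token replaced by a single glyph.
render :: Width -> String -> Char -> String
render Fit  _     glyph = [glyph]
render None token glyph =
  replicate left ' ' ++ [glyph] ++ replicate right ' '
  where
    spare = max 0 (length token - 1)  -- columns left over after the glyph
    left  = spare `div` 2             -- centre the glyph ...
    right = spare - left              -- ... odd column goes to the right
```

For a five-character token such as alpha, the None case leaves four spare columns, so the glyph ends up with two spaces on either side, which matches the rendering described above.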

A shot of this in action:

image

For reference, the full default candy file that will be shipping with VSH2010 is:

-- Candy file
-- Format
-- "<token>" <unicode> <modifier> <enabled> <FIT|NONE>
-- the token has to be quoted
-- the supported modifiers are
-- CODE    - Apply only to regions of code
-- COMMENT - Apply only inside comments
-- STRING  - Apply only in string literals
-- ALL     - Apply to all
-- Note that the replacement block will always take up the exact same
-- space as the tokens it's replacing. e.g. "alpha" will be replaced by "  a  "

"->"         0x2192    CODE       True       NONE     --RIGHTWARDS ARROW
"<-"         0x2190    CODE       True       NONE     --LEFTWARDS ARROW
"=>"         0x21d2    CODE       True       NONE     --RIGHTWARDS DOUBLE ARROW
">="         0x2265    CODE       False      NONE     --GREATER-THAN OR EQUAL TO
"<="         0x2264    CODE       False      NONE     --LESS-THAN OR EQUAL TO
"/="         0x2260    CODE       False      NONE     --NOT EQUAL TO
"&&"         0x2227    CODE       False      NONE     --LOGICAL AND
"||"         0x2228    CODE       False      NONE     --LOGICAL OR
"++"         0x2295    CODE       False      NONE     --CIRCLED PLUS
"::"         0x2237    CODE       False      NONE     --PROPORTION
".."         0x2025    CODE       False      NONE     --TWO DOT LEADER
"^"          0x2191    COMMENT    False      NONE     --UPWARDS ARROW
"=="         0x2261    CODE       False      NONE     --IDENTICAL TO
" . "        0x2218    CODE       True       NONE     --RING OPERATOR
"\\"         0x03bb    CODE       True       NONE     --GREEK SMALL LETTER LAMBDA
"=<<"        0x291e    CODE       False      NONE     --
">>="        0x21a0    CODE       False      NONE     --
"$"          0x25ca    CODE       False      NONE     --
">>"         0x226b    CODE       False      NONE     -- MUCH GREATER-THAN
"forall"     0x2200    CODE       False      NONE     --FOR ALL
"exist"      0x2203    CODE       False      NONE     --THERE EXISTS
"not"        0x00ac    CODE       False      NONE     --NOT SIGN
"alpha"      0x03b1    ALL        True       FIT      --ALPHA
"beta"       0x03b2    ALL        True       FIT      --BETA
"gamma"      0x03b3    ALL        True       FIT      --GAMMA
"delta"      0x03b4    ALL        True       FIT      --DELTA
"epsilon"    0x03b5    ALL        True       FIT      --EPSILON
"zeta"       0x03b6    ALL        True       FIT      --ZETA
"eta"        0x03b7    ALL        True       FIT      --ETA
"theta"      0x03b8    ALL        True       FIT      --THETA

-- Because you can configure options inside the editor itself, don't comment out
-- lines since they won't be parsed, just change the enable flag
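
For anyone wanting to read such a file from their own tools, a single line can be parsed roughly as follows. This is a sketch and not the parser Visual Haskell uses: the Rule record and all function names are invented, and it assumes the quoted token follows Haskell string-literal escaping (so that "\\" means one backslash), which is what lets the standard reads function do the work.

```haskell
import Data.Char (chr)
import Text.Read (readMaybe)

data Region = Code | Comment | StringLit | All deriving (Eq, Show)
data Width  = Fit | None deriving (Eq, Show)

-- | One parsed line of a candy file (hypothetical record).
data Rule = Rule
  { ruleToken   :: String
  , ruleGlyph   :: Char
  , ruleRegion  :: Region
  , ruleEnabled :: Bool
  , ruleWidth   :: Width
  } deriving (Eq, Show)

parseRegion :: String -> Maybe Region
parseRegion s =
  lookup s [("CODE", Code), ("COMMENT", Comment), ("STRING", StringLit), ("ALL", All)]

parseWidth :: String -> Maybe Width
parseWidth s = lookup s [("FIT", Fit), ("NONE", None)]

-- | Turn a code point into a character, rejecting out-of-range values.
toGlyph :: Int -> Maybe Char
toGlyph n
  | n >= 0 && n <= 0x10FFFF = Just (chr n)
  | otherwise               = Nothing

-- | Parse one line. Comment lines, blank lines and malformed lines give
--   Nothing; anything after the fifth field (the trailing comment) is ignored.
parseRule :: String -> Maybe Rule
parseRule line =
  case reads line of
    [(token, rest)] ->
      case words rest of
        (code : region : enabled : width : _) ->
          Rule token <$> (readMaybe code >>= toGlyph)
                     <*> parseRegion region
                     <*> readMaybe enabled
                     <*> parseWidth width
        _ -> Nothing
    _ -> Nothing
```

Because comment lines do not start with a quoted token, they simply fail to parse, which fits the advice above to flip the enabled flag instead of commenting a rule out.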

QuickInfo


Visual Studio has the ability to show information about symbols when you hover over them; this feature is called “QuickInfo”.

This essentially means that you can hover over a symbol like “fmap” and it will tell you that fmap :: forall a b (f :: * -> *). (Functor f) => (a -> b) -> f a -> f b and that it’s defined in GHC.Base.

In GHCi this would be equivalent to typing :i fmap, which results in the following output:

class Functor f where
  fmap :: (a -> b) -> f a -> f b
  ...
        -- Defined in GHC.Base

Whenever the user hovers over a symbol in Visual Studio, the IDE will call the method

public void AugmentQuickInfoSession(IQuickInfoSession session, IList<object> qiContent, out ITrackingSpan applicableToSpan)

 

I use the information given to me to construct two things

  • The word the user is hovering on
  • The exact location within the source file of that word

This information is used to find the correct Name value in the Haskell renamed AST. The problem is that we can’t construct Name values ourselves, so we have to look them up. This is done with the help of a typeclass:

class Finder a where
    findName     :: MonadPlus m => a -> FastString -> Maybe SrcSpan -> m Name

The monad used determines how many results you receive. Use the Maybe monad and you’ll get just one. Use the list monad and you’ll get more than one, but only if you don’t specify a source span to look for (a wildcard match on the name alone).
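
The way the result monad picks the number of answers can be shown without any GHC types. The sketch below is a toy stand-in: (name, line) pairs replace Name and SrcSpan, and choose and findByName are names made up for the example. Only MonadPlus and msum come from the standard library.

```haskell
import Control.Monad (MonadPlus, msum)

-- | Offer every candidate to whichever MonadPlus the caller asks for.
choose :: MonadPlus m => [a] -> m a
choose = msum . map return

-- | Toy lookup over (name, line) pairs. A missing line acts as a wildcard.
findByName :: MonadPlus m
           => [(String, Int)] -> String -> Maybe Int -> m (String, Int)
findByName table name mline =
  choose [ e | e@(n, l) <- table, n == name, maybe True (== l) mline ]
```

Asking for the result at type Maybe keeps only the first match, because mplus on Maybe is left-biased, while asking for a list returns every match. Supplying a line narrows the list down again, which mirrors the behaviour described for findName.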

However we should never enter the PostTcType types inside the renamed AST. These are invalid at this stage. Unfortunately SYB’s listify does not provide a way to tell it not to enter a specific type.

So we create a modified version of those SYB calls:

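{-# LANGUAGE GADTs, RankNTypes #-}
import Data.Generics (GenericQ, gmapQ, mkQ)
import Data.Maybe    (isJust)
import Data.Typeable (Typeable, cast)

-- | A type that must not be entered, wrapped so that guards for
--   different types can share one list
data Guard where
  Guard :: Typeable a => Maybe a -> Guard

type HList = [Guard]

-- | Summarise all nodes in top-down, left-to-right order,
--   without descending into nodes whose type is guarded
everythingBut :: (r -> r -> r) -> HList -> GenericQ r -> GenericQ r
everythingBut k q f x
  = foldl k (f x) fsp
    where fsp = case isPost x q of
                  True  -> []
                  False -> gmapQ (everythingBut k q f) x

-- | True if the value has the same type as one of the guards
isPost :: Typeable a => a -> HList -> Bool
isPost a = or . map check
  where check :: Guard -> Bool
        check x = case x of
                    Guard y -> isJust $ (cast a) `asTypeOf` y

-- | Get a list of all entities that meet a predicate
listifyBut :: Typeable r => (r -> Bool) -> HList -> GenericQ [r]
listifyBut p q
  = everythingBut (++) q ([] `mkQ` (\x -> if p x then [x] else []))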

Now listifyBut takes an HList of types not to inspect. HList is a heterogeneous list, so it allows things of different types inside it. Finding the Name is now as simple as:

instance Finder (HsGroup Name) where
    findName grp a b = findName (listifyBut (isName a b) [Guard (undefined :: Maybe PostTcType)] grp) a b
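
To see the guard mechanism at work outside of GHC’s AST, here is a small self-contained variant that needs only base (Data.Data) and not the syb package. It is a sketch with the same idea as the code above, written with direct recursion; the Tree and Secret types and the isGuarded name are invented for the example.

```haskell
{-# LANGUAGE DeriveDataTypeable, GADTs #-}
import Data.Data  (Data, Typeable, cast, gmapQ)
import Data.Maybe (isJust)

-- | A type to skip, carried as a proxy value that is never evaluated.
data Guard where
  Guard :: Typeable a => Maybe a -> Guard

-- | Does the node have the same type as one of the guards?
isGuarded :: Typeable a => a -> [Guard] -> Bool
isGuarded a = any check
  where
    check :: Guard -> Bool
    check (Guard y) = isJust (cast a `asTypeOf` y)

-- | Collect every value of type r that satisfies p, top-down and
--   left-to-right, without descending into nodes of a guarded type.
listifyBut :: (Typeable r, Data a) => (r -> Bool) -> [Guard] -> a -> [r]
listifyBut p guards x = here ++ below
  where
    here  = [ r | Just r <- [cast x], p r ]
    below | isGuarded x guards = []
          | otherwise          = concat (gmapQ (listifyBut p guards) x)

-- Toy AST: Secret plays the role of a type that must not be entered.
data Secret = Secret Int
  deriving (Data, Typeable, Show)

data Tree = Leaf Int | Hidden Secret | Node Tree Tree
  deriving (Data, Typeable, Show)

example :: Tree
example = Node (Leaf 1) (Node (Hidden (Secret 2)) (Leaf 3))
```

Collecting all Int values from example with an empty guard list visits the whole tree, including the number inside Secret. Passing [Guard (Nothing :: Maybe Secret)] stops the traversal at the Secret node, so only the two Leaf values are returned.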
 

Once we have the names, we can just call getInfo. Nothing else is needed because, remember, all API calls take a Context as an argument. For instance, the full type of the tooltip function is:

-- @@ Export
-- | Gather information about the identifier you requested
--   .
--   Context: The session for this call, Serves as a cache
--   .
--   String : The name of the identifier to lookup
--   .
--   SrcSpan: The location of the identifier in the sourcefile
--   .
--   Bool   : Whether to treat this call as a strict one. If it's strict
--            Then the name AND span must match. If it's not, Any match will do
--   .
getTooltip :: Context -> String -> SrcSpan -> Bool -> IO (Maybe String)

This produces the following result

image

There’s a problem, however: if you hover over a variable name that’s defined in the body of a function, it produces a runtime panic:

image

If you think about it, this kind of makes sense: GHCi also won’t produce anything for local variables. In fact, you can’t even refer to them. But at the very least we would like to prevent this crash, and in the best case we would like *some* information on the symbol.

After poking around some, I noticed that the identifiers that produce the errors are “Internal Name” values; the function nameModule fails on these. The plan now is that whenever we find an Internal Name, we look into the TypecheckedModule to find the Id associated with the Name value we retrieved earlier. With SYB this is again easy. However, there’s a catch (thanks to nominolo for pointing this out): we should not enter any PostTcKind or NameSet, because these are blank after type checking.

findID :: Data a => a -> Name -> [Id]
findID a n = listifyBut ((n==) . getName) [Guard (undefined :: Maybe NameSet)
                                                                       ,Guard (undefined :: Maybe PostTcKind)] a

And that’s all. The end result is that this now works on local variables as well. Hovering over, for instance, the variable file generates:
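
The overall dispatch can be summarised with a toy model. Nothing below uses the GHC API: the Name type, the two lookup tables and the tooltip text are all stand-ins made up for illustration. The first table plays the role of what getInfo knows about top-level names, the second the role of the Ids found in the typechecked module.

```haskell
-- | Toy version of a name: external names know their module,
--   internal (local) names do not.
data Name
  = External String String  -- defining module and identifier
  | Internal String         -- local identifier, no module
  deriving (Eq, Show)

-- | Total replacement for a partial "which module?" lookup.
nameModuleMaybe :: Name -> Maybe String
nameModuleMaybe (External m _) = Just m
nameModuleMaybe (Internal _)   = Nothing

occName :: Name -> String
occName (External _ n) = n
occName (Internal n)   = n

-- | Build a tooltip, falling back to the local table for internal names
--   instead of crashing on the missing module.
tooltip :: [(Name, String)] -> [(Name, String)] -> Name -> Maybe String
tooltip topLevel locals name =
  case nameModuleMaybe name of
    Just m  -> fmt (" -- Defined in " ++ m) <$> lookup name topLevel
    Nothing -> fmt " -- local binding"      <$> lookup name locals
  where
    fmt suffix ty = occName name ++ " :: " ++ ty ++ suffix
```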

image

The important thing to note here is the Context: it contains a cache of information, so looking up any of this is instantaneous. You just hover and directly get back information.

A last cool function (though *I’m not so sure how useful*) is that if you select something and then hover over it, it will type check only that expression.

image

So if you have an expression “fmap foo” somewhere but don’t remember what type foo or fmap is, just select them and hover over the selection. (This is somewhat limited, though: all identifiers have to be top-level. It can’t return anything for local variables. Sorry :( )

And that’s it for this post. I’ll either continue the work on Cabal now, or continue this track and fully finish IntelliSense.

No video of this in action, since I have a cold :S

A new video


This is just an intermediate update, showing the near final UI. Cabal support is what I’m working on now and that is about 20% finished. I hope to get that done in a week or so barring any more difficulties.

If I do get Cabal support done, I can put out a beta (without IntelliSense) just so I can get some feedback.

The video link is http://screencast.com/t/ZTBhMmUxNz

Pardon the background noise, I’m currently in a lot of wind.

screenshot

Added a screenshot for good measure; click on the image for fullscreen.

Ghost typing


This is the preliminary version of the ghost typing addition to Visual Haskell. The idea is that whenever an explicit type signature is not given, the IDE will display the type inferred by GHC.

You can then click on the signature to insert it, or use the smart action associated with the name of the function.

Up next is the feature where, when you have given a signature that doesn’t type check, the IDE will remove that signature and retry; if that succeeds, the IDE will display a suggested signature.

Below is a GIF of how the first part works.

ghostyping

Oh, and collapsible regions have been finished as well :)

If the function has a type signature, it will collapse at the end of that declaration; if not, it’ll collapse at the end of the function name.

There is a restriction to this, however, since GHC allows you to declare your signatures anywhere in the file: in order for the signature to be considered part of the function by collapsible regions, it has to end on the line before the binding.

This means it can span multiple lines; the end just has to be on the line before the binding. That way it also supports Haddock-documented type signatures.
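
The rule can be written down as a small check on line numbers. This is a simplified, line-based sketch: the Span type and both function names are made up, and the real feature works with positions inside a line (the end of the function name), which the sketch reduces to the first line of the binding.

```haskell
-- | Inclusive range of line numbers covered by a declaration.
data Span = Span { startLine :: Int, endLine :: Int }
  deriving (Eq, Show)

-- | A signature belongs to a binding if it ends on the line directly
--   before the binding starts. It may itself span several lines.
belongsTo :: Span -> Span -> Bool
belongsTo sig binding = endLine sig + 1 == startLine binding

-- | Line after which the region collapses: the end of an attached
--   signature, otherwise the line holding the function name.
collapseAfter :: Maybe Span -> Span -> Int
collapseAfter (Just sig) binding
  | sig `belongsTo` binding = endLine sig
collapseAfter _ binding     = startLine binding
```

A signature on lines 3 to 5 in front of a binding starting on line 6 is attached, while the same signature placed on lines 1 to 2 is treated as unrelated and the region falls back to the function name.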

Collapsible regions.–Video Update #2


So here’s the second video update. It shows some of the progress up till now. While it might not look significantly different, there’s a big difference under the hood: a lot has been rewritten and optimized in anticipation of new features coming soon, like IntelliSense and ghost typing (coming in the next video, due in a few days). Click the link below to see the video.

Collapsible Regions

That’s it for today, and now… back to my thesis project.

Another screen capture


Just a quick snapshot of what I’ve been working on. I’m still doing work on typechecking and code discovery, which is a bit slow at the moment, and finishing this is taking longer than I expected.

errors

But then it’s off to cabal support after this.