Compiler
How many times have you seen in code reviews a piece of code that calls a method, say Dictionary<K,V>.TryGetValue, and ignores the return value? We are going on a quest to find all such invocations and produce a warning. We’re going to derive from SyntaxWalker (and not SyntaxRewriter), because we won’t be doing any rewriting, just issue detection. There are two major cases we need to consider: The method is invoked without storing its result in a local variable or using it as part of an...
Last time around, we were replacing the 42 numeric literal with 43. This time let’s pretend to do something more useful. Suppose you really don’t like developers calling the Console.Write method and insist on using Console.WriteLine instead. You might be slightly reluctant to use find-and-replace because, just like last time, you don’t want to modify Console.Write calls within comments, within string literals, or (and this is vicious) calls to the Console.Write method on something that is not the System...
To start doing something useful with Roslyn, we’re going to inspect a syntax tree, locate something interesting, and then modify it! The complex structure of a C# program’s syntax tree (the SyntaxTree class) is exposed through a fairly intuitive object model, featuring three types of entities: Nodes are the major elements of the language; for example, an IfStatementSyntax is a node representing an “if” statement and a LiteralExpressionSyntax is a node representing a literal expression. Tokens are secondary...
The Roslyn project is the Microsoft implementation of C# and VB compilers-as-a-service. Roslyn provides a transparent view into the inner workings of the compiler, including syntax tree inspection and modification. An initial CTP of Roslyn was released for download a couple of days ago; it requires Visual Studio 2010 SP1 and the VS2010 SP1 SDK. Some of the scenarios enabled by Roslyn are the following: Refactoring. Refactoring tools no longer need to parse...
We’re ready to deal with control statements and top-level program structure. Let’s tackle these one after the other. The let statement has already been handled as part of the Assignment method in the previous installment. The while statement in Jack requires evaluating an expression and deciding whether to continue or to jump to the end of the loop. At the end of the while statement block, there should be an unconditional jump to the beginning of the loop. Something along the following lines: BEGINWHILE_0...
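The loop layout described in the excerpt above (evaluate the condition, leave the loop if it is false, jump back to the start after the body) can be sketched in C# roughly as follows. This is only an illustration, not the series' code generator: the `WhileEmitter` class, the `JUMP_IF_FALSE` and `JUMP` mnemonics, and the `ENDWHILE_n` label are hypothetical names; only the `BEGINWHILE_0` label style comes from the excerpt.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects pseudo-assembly lines for while loops.
public sealed class WhileEmitter
{
    private readonly List<string> _lines = new List<string>();
    private int _nextLoopId; // gives every loop its own pair of labels

    public IReadOnlyList<string> Lines { get { return _lines; } }

    // Appends a single (made-up) instruction.
    public void Emit(string instruction)
    {
        _lines.Add(instruction);
    }

    // emitCondition and emitBody are callbacks that emit the code for the
    // loop condition and the loop body, respectively.
    public void EmitWhile(Action emitCondition, Action emitBody)
    {
        // Reserve the id before emitting the body so nested loops get fresh ids.
        int id = _nextLoopId++;

        _lines.Add("BEGINWHILE_" + id + ":");
        emitCondition();                              // leaves the condition value for the jump
        _lines.Add("JUMP_IF_FALSE ENDWHILE_" + id);   // assumed mnemonic: exit when false
        emitBody();
        _lines.Add("JUMP BEGINWHILE_" + id);          // assumed mnemonic: unconditional jump back
        _lines.Add("ENDWHILE_" + id + ":");
    }
}
```

Calling `EmitWhile` with a condition callback that emits `COND` and a body callback that emits `BODY` yields six lines: the begin label, `COND`, the conditional jump, `BODY`, the jump back, and the end label. Allocating the loop id before the body runs is what keeps the labels of nested loops distinct.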
After having discussed in some detail the lexical analysis and parsing phases, it’s time to get our hands dirty with actual code generation. Theoretically speaking, our parser emits an intermediate representation of the parsed program—the code-generator interface, shown below, can be used to construct an actual tree depicting the structure of the program. For the practical purpose of translating a Jack program to C or assembly language, there’s no need to maintain in memory a real parse tree. By...
That’s it. We’re ready for the full BNF of the Jack grammar, followed by the top-down parser of a complete Jack program. Here goes:
class ::= class cls-name { cls-var-decl* sub-decl* }
cls-var-decl ::= ( static | field ) type var-name ( , var-name )* ;
type ::= int | char | boolean | cls-name
sub-decl ...
Last time we left off on the brink of finishing the parser for Jack expressions. We need only fill in the blanks for parsing subroutine calls. There are three forms of subroutine calls allowed in Jack: class C { constructor C new() { return this; } function void f() { var C c; var D d; var int i; ...
Before we proceed to the full BNF of a Jack expression, we need to decide which operators we’re going to support. Our final implementation will have some additional operators, but for now we’ll settle for +, -, *, /, <, >, =, &, |, and !. One obvious question when dealing with arithmetic and relational operators is the question of operator precedence. What’s the value of 5+3*2? Is it 16 or 11? What’s the value of 3<2+5? Is it 5, or 1, or something else entirely depending on the integer...
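To make the 5+3*2 question concrete, here is a minimal C# sketch that evaluates the same token stream two ways: strictly left to right, and with a two-level grammar in which `*` binds tighter than `+`. It is not the parser from this series; the `PrecedenceDemo` class and its method names are hypothetical, and it handles only non-negative integers with `+` and `*`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, limited to integers, '+' and '*'.
public static class PrecedenceDemo
{
    // Splits the input into number and operator tokens.
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i];
            if (char.IsWhiteSpace(c)) { i++; continue; }
            if (char.IsDigit(c))
            {
                int start = i;
                while (i < text.Length && char.IsDigit(text[i])) i++;
                tokens.Add(text.Substring(start, i - start));
            }
            else if (c == '+' || c == '*')
            {
                tokens.Add(c.ToString());
                i++;
            }
            else
            {
                throw new FormatException("Unexpected character: " + c);
            }
        }
        return tokens;
    }

    // Strategy 1: ignore precedence and fold the operators left to right.
    public static int EvaluateLeftToRight(string text)
    {
        List<string> tokens = Tokenize(text);
        int result = int.Parse(tokens[0]);
        for (int i = 1; i + 1 < tokens.Count; i += 2)
        {
            int right = int.Parse(tokens[i + 1]);
            result = tokens[i] == "+" ? result + right : result * right;
        }
        return result;
    }

    // Strategy 2: expr ::= term ('+' term)*   term ::= number ('*' number)*
    public static int EvaluateWithPrecedence(string text)
    {
        List<string> tokens = Tokenize(text);
        int pos = 0;
        int value = ParseTerm(tokens, ref pos);
        while (pos < tokens.Count && tokens[pos] == "+")
        {
            pos++; // consume '+'
            value += ParseTerm(tokens, ref pos);
        }
        return value;
    }

    // Parses a run of numbers joined by '*', which binds tighter than '+'.
    private static int ParseTerm(List<string> tokens, ref int pos)
    {
        int value = int.Parse(tokens[pos++]);
        while (pos < tokens.Count && tokens[pos] == "*")
        {
            pos++; // consume '*'
            value *= int.Parse(tokens[pos++]);
        }
        return value;
    }
}
```

For the input `5+3*2`, `EvaluateLeftToRight` returns 16 and `EvaluateWithPrecedence` returns 11, the two candidate answers mentioned above. Encoding precedence as one grammar rule per level is only one option; a table-driven approach would serve equally well.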
In the previous installment we saw the core of a lexical analyzer, a module that turns a stream of characters into a sequence of tokens for symbols, identifiers, keywords, integer constants, and string constants. Today, we move to parsing. The parser’s job is to give semantic structure to the syntactic tokens bestowed upon it by the lexical analyzer. There are, as always, automatic tools like yacc that generate, from a BNF grammar, a program that parses tokens of a given language. However, it is often...
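As a rough reminder of what such a lexical analyzer does, the following C# sketch classifies input into the five token kinds listed above. It is an assumption-laden illustration, not the lexer from the previous installment: the `MiniLexer`, `Token`, and `TokenKind` names are hypothetical, the keyword set is a small sample rather than the full Jack list, and comments and escape sequences are not handled.

```csharp
using System;
using System.Collections.Generic;

public enum TokenKind { Symbol, Identifier, Keyword, IntConst, StringConst }

// Hypothetical token type: a kind plus the matched text.
public struct Token
{
    public TokenKind Kind;
    public string Text;

    public Token(TokenKind kind, string text)
    {
        Kind = kind;
        Text = text;
    }
}

public static class MiniLexer
{
    // Sample keyword set (an assumption, not the complete Jack keyword list).
    private static readonly HashSet<string> Keywords = new HashSet<string>
    {
        "class", "function", "var", "let", "while", "if", "else", "return"
    };

    public static List<Token> Tokenize(string source)
    {
        var tokens = new List<Token>();
        int i = 0;
        while (i < source.Length)
        {
            char c = source[i];
            if (char.IsWhiteSpace(c)) { i++; continue; }

            int start = i;
            if (char.IsDigit(c))
            {
                // Integer constant: a run of digits.
                while (i < source.Length && char.IsDigit(source[i])) i++;
                tokens.Add(new Token(TokenKind.IntConst, source.Substring(start, i - start)));
            }
            else if (char.IsLetter(c) || c == '_')
            {
                // Identifier or keyword: letters, digits, underscores.
                while (i < source.Length && (char.IsLetterOrDigit(source[i]) || source[i] == '_')) i++;
                string word = source.Substring(start, i - start);
                TokenKind kind = Keywords.Contains(word) ? TokenKind.Keyword : TokenKind.Identifier;
                tokens.Add(new Token(kind, word));
            }
            else if (c == '"')
            {
                // String constant: everything up to the closing quote.
                i++; // skip the opening quote
                while (i < source.Length && source[i] != '"') i++;
                if (i == source.Length) throw new FormatException("Unterminated string constant");
                tokens.Add(new Token(TokenKind.StringConst, source.Substring(start + 1, i - start - 1)));
                i++; // skip the closing quote
            }
            else
            {
                // Anything else is treated as a one-character symbol.
                tokens.Add(new Token(TokenKind.Symbol, c.ToString()));
                i++;
            }
        }
        return tokens;
    }
}
```

Tokenizing `let x = 42;` with this sketch produces five tokens: a keyword, an identifier, a symbol, an integer constant, and a final symbol. The parser then consumes this list rather than raw characters.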
I’m going to write a compiler for a simple language. The compiler will be written in C#, and will have multiple back ends. The first back end will compile the source code to C, and use cl.exe (the Visual C++ compiler) to produce an executable binary. But first, a minor digression. Over my blogging years, I developed a tendency to abandon blog post series just prior to their final installment. I abandoned the unit testing series, the primality testing series, and many other “series”. Therefore...