January 2011 - Posts - All Your Base Are Belong To Us

All Your Base Are Belong To Us

Mostly .NET internals and other kinds of gory details

January 2011 - Posts

CLR 4 Does Not Use LoadLibrary to Load Assemblies

You may be asking yourself: who cares? Well, first of all, it’s good to know, and I haven’t noticed a public service announcement to that effect. It is an implementation detail, however: CLR assemblies are not even guaranteed to be backed by files, let alone DLL files in a specific format that are loaded using the LoadLibrary Win32 API.

However, there are several tools and scenarios that have come to rely on the fact that the CLR loads assemblies using LoadLibrary. For example, up to CLR 4, if you wanted to know which .NET assemblies were loaded in your process, a fairly reliable heuristic would be to fire up Sysinternals Process Explorer and look at the DLLs view of a given process. This doesn’t work for CLR 4, as you can see here:

image 
Note that although the .NET Assemblies tab clearly shows an assembly called “TheClassLibrary”, it’s not present in the DLLs list.

I wouldn’t be so concerned with not seeing a DLL in the modules list if there weren’t additional utilities relying on an assembly being “loaded” in the Windows loader sense. One of them has to do with debugging symbols.

SOS and similar tools need debugging symbols for managed code to display source file and line-level information. For example, if you use the !CLRStack command to display the managed stack of a certain thread, you expect source-and-line information, but that requires PDB files with debugging information for the binaries whose methods appear on the stack.

What’s the problem? The problem is that if the .NET assemblies (DLLs) are not actually loaded into the process, the debugger does not load symbols for them. Without symbols, debugger extensions like SOS or SOSEX can’t use the debugging information, so here’s the kind of stack you would get:

0:000> !mk
     ESP      RetAddr
00:U 0043ef1c 75d473ea KERNEL32!ReadConsoleInternal+0x15
01:U 0043ef24 75d47041 KERNEL32!ReadConsoleA+0x40
02:U 0043efac 75ccf489 KERNEL32!ReadFileImplementation+0x75
03:U 0043eff4 0fa8cc7c MSVCR100D!_read_nolock+0x62c [f:\dd\vctools\crt_bld\self_x86\crt\src\read.c @ 230]
04:U 0043f088 0fa8c5c9 MSVCR100D!_read+0x219 [f:\dd\vctools\crt_bld\self_x86\crt\src\read.c @ 92]
05:U 0043f0d8 0f9e1093 MSVCR100D!_filbuf+0x113 [f:\dd\vctools\crt_bld\self_x86\crt\src\_filbuf.c @ 136]
06:U 0043f100 0f9df5ab MSVCR100D!getc+0x20b [f:\dd\vctools\crt_bld\self_x86\crt\src\fgetc.c @ 75]
07:U 0043f15c 0f9df660 MSVCR100D!_fgetchar+0x10 [f:\dd\vctools\crt_bld\self_x86\crt\src\fgetchar.c @ 37]
08:U 0043f168 0f9df67a MSVCR100D!getchar+0xa [f:\dd\vctools\crt_bld\self_x86\crt\src\fgetchar.c @ 47]
09:U 0043f170 0f8f1396 TheUnmanagedDll!fnTheUnmanagedDll+0x26 [c:\temp\theunmanageddll.cpp @ 12]
…snipped…
0b:M 0043f2a8 002900c7 TheClassLibrary.TheClass.TheMethod()(+0x1 IL)(+0x6 Native)
0c:M 0043f2b0 0029008c AssemblyLoadingChange.Program.Main(System.String[])(+0x1 IL)(+0x7 Native) [c:\temp\Program.cs @ 12]
0d:U 0043f2bc 62ea21db clr!CallDescrWorker+0x33
…snipped for brevity…

Three frames are worth noting here. Frame 09 is an unmanaged frame for which we have symbols, so we get line-level information. Frame 0c is a managed frame that lies in an .exe file, which is always loaded, so symbols are loaded for it as well. But frame 0b is a managed frame that lies in a managed assembly (DLL) that is not loaded in the Windows loader sense, so there are no symbols and no line-level information available.

I’m sure these aren’t the only two examples where you would want .NET assemblies to be loaded into the process in the Windows loader sense. What are yours?

ANTS Memory Profiler Review

Diagnosing memory leaks by taking multiple dumps, analyzing them with SOS commands like !DumpHeap and !GCRoot, and maybe exporting a heap graph to CLR Profiler with !TraverseHeap is a thankless experience, albeit one I have had to go through many, many times.

If you’re doing postmortem diagnostics or can’t possibly afford to do live work on the problematic machine, there’s no other choice but to stick to dumps. Fortunately, in some cases you can afford to reproduce a memory leak locally and in a live scenario. In these cases, a memory profiler is an invaluable tool and can lead to much faster memory leak diagnostics without having to type and process text-mode commands. [This is not to imply that SOS skills aren’t important. The most important thing, though, is to choose the right tool for the task.]

The ANTS Memory Profiler is a profiler focused on diagnosing memory leaks and understanding memory usage in .NET applications. Its current version (v6.0) is capable of working with standard .NET executables, Windows services, ASP.NET apps and even Silverlight 4 applications loaded into the browser. On supported OS versions it can attach to a running process (this requires at least Windows Vista and .NET 4).

Full disclosure: I was given a complimentary copy of the profiler in exchange for this (objective! :-)) review.

There are some tutorials by RedGate that demonstrate the profiler’s capabilities, so I’ll restrict myself to an example from SELA’s .NET Debugging course. [If you’re looking for a feature comparison chart and CNET-style reviews, this isn’t what I had in mind.]

The following is a memory leak exhibited by a GUI application—when switching between directories (on the left), memory usage seems to climb significantly and never goes down:

image

The profiler’s operation mode is roughly the following:

  1. Run the application and capture a snapshot of its memory usage before exercising the leaking scenario
  2. Exercise the leaking scenario
  3. Capture additional snapshots during and after the scenario’s execution
  4. Analyze by comparing snapshots and seeing which objects are added between snapshots and are not being freed

Basically, all you have to do is run the app and remember to click “Take Memory Snapshot” in the profiler’s UI every once in a while. [I’m told that an upcoming version of the profiler will feature automatic snapshots that you can trigger from code.]

Here’s the summary view that compares two snapshots of the application—one before navigating through some folders, and one after:

image

Most of the new memory is occupied by strings, and there’s a large number of them. Starting your analysis from strings is a risky endeavor in SOS—you can quickly be paging through thousands of objects with little hope of telling the forest from the trees. With ANTS it’s simpler—just click the class you want and choose “Class Reference Explorer” from the top of the view and then click away through the reference graph:

image

This reaffirms the suspicion that the FileInformation objects have something to do with the leak. So let’s focus on these objects and see what we get:

image

Some of these objects (a mere 29%) are referenced by a listbox—that’s most probably the listbox on the right of the UI. But all of them—100%—are referenced by EventHandlers. This should immediately light up in your brain as a potential leak source, and the profiler also has an object filter for “Kept in memory only by event handlers” that I could use.

What is this event handler? We could tell by looking at one of the instances of FileInformation, through the profiler’s “Object Retention Graph”:

image

Now that’s really neat. You get the full path from the GC root—in this case a static event—to your object of interest.

This concludes the leak analysis: the strings comprise most of the memory leak, but they are held in memory by instances of FileInformation, which in turn are not reclaimed because they are registered to a static event (and probably the code that’s supposed to unregister them isn’t called, or worse—doesn’t exist).

Writing a Compiler in C#: C Code Generation, Part 1

After having discussed in some detail the lexical analysis and parsing phases, it’s time to get our hands dirty with actual code generation. Theoretically speaking, our parser emits an intermediate representation of the parsed program—the code-generator interface, shown below, can be used to construct an actual tree depicting the structure of the program.

For the practical purpose of translating a Jack program to C or assembly language, there’s no need to maintain in memory a real parse tree. By using the symbol state and a small set of auxiliary data structures, we can implement a code generator that emits legal C code. This C code can be compiled by the C compiler to obtain an executable program. (If the idea of compiling Jack to C and then relying on the C compiler seems like cheating to you, consult the history of the C++ programming language. The very first C++ compiler, Cfront by Bjarne Stroustrup, converted C++ to C.)

Let’s take a look at the interface the code generator has to implement. (Obviously, there are some convenience-based choices here, making it easy to develop the code generator specifically for Jack.)

internal interface ICodeGenerator
{
    void SetOptions(CodeGeneratorOptions options);
    void InitSymbolTables(
        SymbolTable classSymTable,
        SymbolTable methodSymTable);
    void EmitEnvironment();
    void EmitBootstrapper();

    void StaticDeclaration(Symbol variable);
    void FieldDeclaration(Symbol variable);
    void ConstructorDeclaration(Subroutine subroutine);
    void FunctionDeclaration(Subroutine subroutine);
    void MethodDeclaration(Subroutine subroutine);
    void EndSubroutine();
    void Return();
    void BeginWhile();
    void WhileCondition();
    void EndWhile();
    void BeginIf();
    void PossibleElse();
    void EndIf();
    void Assignment(Token varName, bool withArrayIndex);
    void Add();
    void Sub();
    void Mul();
    void Div();
    void Mod();
    void And();
    void Or();
    void Less();
    void Greater();
    void Equal();
    void LessOrEqual();
    void GreaterOrEqual();
    void NotEqual();
    void IntConst(int value);
    void StrConst(string value);
    void True();
    void False();
    void Null();
    void This();
    void Negate();
    void Not();
    void VariableRead(Token varName, bool withArrayIndex);
    void Call(string className, string subroutineName);
    void DiscardReturnValueFromLastCall();
    void BeginClass(string className);
    void EndClass();
}

If you recall the Jack expression parser we developed a few installments ago, it effectively converts a Jack expression into postfix form by calling the code generator in a post-order traversal of the parse tree. For example, if x and y are terminals (e.g. local variables), then the parser processes the expression x + y by performing the following calls to the code generator:

VariableRead(x, withArrayIndex:false)
VariableRead(y, withArrayIndex:false)
Add()

Another example to drive the point home—the expression true|this==null&(x[3]-y)<13 results in the following calls, indented for easier understanding:

True()
  This()
  Null()
  Equal()
          IntConst(3)
        VariableRead(x, withArrayIndex:true)
        VariableRead(y, withArrayIndex:false)
      Sub()
      IntConst(13)
    Less()
  And()
Or()

Evaluating this expression at runtime is best modeled by a stack. Operations like IntConst, VariableRead, This push a value onto the stack; operations like Sub, Less, And pop two operands off the stack and push the result of the operation onto the stack; and so on.

When we develop the x86 assembly language code generator for Jack, we’ll use the assembly PUSH and POP instructions to manipulate the thread’s explicit stack. To emulate a similar process for C code generation, we’ll use a global array of words, called __STACK, and emit a couple of intrinsic macros, __PUSH and __POP, that manipulate the stack.

Assuming that the local variables x and y in the Jack program are represented by the local variables x and y in the resulting C function, compiling x + y to C boils down to:

__PUSH(x);
__PUSH(y);
__SCRATCH1 = __POP();
__PUSH(__POP() + __SCRATCH1);

(Note that I’m using __SCRATCH1 as a scratch register—it’s simply a global variable of type __WORD, the only type we’ll be dealing with unless we want to address type checking.)
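To make the x + y translation above concrete, here is a minimal sketch of what the runtime support might look like. The article names __STACK, __PUSH, __POP, __SCRATCH1 and __WORD but does not show their definitions, so the capacity STACK_CAPACITY, the index stack_top, the helpers stack_push and stack_pop, and the wrapper compile_add_example are all assumptions made for illustration. __WORD is defined as intptr_t here rather than int, so that a word can also hold a pointer on 64-bit targets.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: intptr_t instead of the article's int, so a word can also
   hold a pointer on 64-bit targets. */
typedef intptr_t __WORD;

#define STACK_CAPACITY 1024               /* hypothetical size */
static __WORD __STACK[STACK_CAPACITY];    /* global array of words */
static int stack_top = 0;                 /* hypothetical: next free slot */
static __WORD __SCRATCH1;                 /* scratch "register" */

/* Hypothetical helpers. Using functions means the pushed expression is
   fully evaluated before the stack index changes. */
static void stack_push(__WORD value)
{
    assert(stack_top < STACK_CAPACITY);
    __STACK[stack_top++] = value;
}

static __WORD stack_pop(void)
{
    assert(stack_top > 0);
    return __STACK[--stack_top];
}

#define __PUSH(v) stack_push((__WORD)(v))
#define __POP()   stack_pop()

/* The four statements emitted for the Jack expression x + y, wrapped in
   a function that returns the value left on top of the stack. */
static __WORD compile_add_example(__WORD x, __WORD y)
{
    __PUSH(x);
    __PUSH(y);
    __SCRATCH1 = __POP();
    __PUSH(__POP() + __SCRATCH1);
    return __POP();
}
```

Popping the right operand into __SCRATCH1 first is what keeps non-commutative operators correct: C does not specify the order in which the two operands of a binary operator are evaluated, so a statement such as __PUSH(__POP() - __POP()) could subtract in either direction.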

This is really wasteful, you say—we could compile the whole thing to __PUSH(x+y) and save a bunch of instructions! That’s true, but we’re going to leave optimization off the table for now, especially considering that we’re using the C compiler in the next phase—and it’s already pretty good at optimizing things. (Again, if the idea of not doing optimization at the intermediate compilation phase strikes you as lazy, consider what the C# compiler does—it emits IL instructions that manipulate the IL evaluation stack, and leaves it up to the JIT to decide whether some stack operations can be optimized or even replaced by register manipulation altogether.)

With that said, we’re ready for the part of the code generator that deals with expressions and the assignment statement—let. (Control constructs, subroutines, and top-level program structure will be dealt with next.)

There are a couple of translation decisions that you need to be aware of prior to reading this code:

  • __WORD is the single type we use for everything. In this version of the compiler, it’s simply int.
  • A Jack class is mapped to a C struct.
  • Jack class fields are mapped to C struct fields.
  • Jack class statics are mapped to C global variables.
  • Jack subroutines are mapped to C functions.
  • Jack subroutines that return void are mapped to C functions that return __WORD. This return value is 0 by convention, and is ignored by the caller.
  • Within a method or constructor, THIS is a local variable that holds the value of this.
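
The mapping rules in this list can be sketched in C as follows. The Jack class Point, the names Point_count, Point_getX and Point_setX, and the idea of receiving the object as a parameter and copying it into the local THIS are all hypothetical; the article does not show the naming scheme or the calling convention at this point. For brevity the bodies use plain C expressions instead of the evaluation stack, and __WORD is intptr_t rather than int so the pointer casts work on 64-bit targets.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t __WORD;   /* assumption: pointer-sized instead of int */

/* Hypothetical Jack class used for illustration:
     class Point {
         field int x, y;
         static int count;
         method int getX() { return x; }
         method void setX(int value) { let x = value; return; }
     }                                                            */

/* A Jack class maps to a struct; its fields map to struct fields. */
typedef struct Point {
    __WORD x;
    __WORD y;
} Point;

/* A Jack static maps to a global variable. The name is a guess: the
   output of the article's FormatStaticName helper is not shown. */
__WORD Point_count = 0;

/* A Jack method maps to a function. THIS is a local variable that holds
   the object; field access uses the same cast the generator emits. */
__WORD Point_getX(__WORD self)
{
    __WORD THIS = self;
    assert(THIS != 0);
    return ((Point*)THIS)->x;
}

/* A void Jack subroutine still returns __WORD, 0 by convention. */
__WORD Point_setX(__WORD self, __WORD value)
{
    __WORD THIS = self;
    assert(THIS != 0);
    ((Point*)THIS)->x = value;
    return 0;
}
```

A caller would pass the object as a word, for example Point_getX((__WORD)&p), and simply ignore the 0 returned by Point_setX.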

public override void Assignment(
    Token varName, bool withArrayIndex)
{
    if (MethodSymTable.HasSymbol(varName.Value))
    {
        Symbol symbol =
            MethodSymTable.GetSymbol(varName.Value);

        Output.WriteLine(
            "__PUSH((__WORD)&{0});", symbol.Name);
    }
    else
    {
        Symbol symbol =
            ClassSymTable.GetSymbol(varName.Value);

        if (symbol.Kind == SymbolKind.Static)
        {
            Output.WriteLine("__PUSH((__WORD)&{0});",
                FormatStaticName(symbol.Name));
        }
        else if (symbol.Kind == SymbolKind.Field)
        {
            Output.WriteLine(
                "__PUSH((__WORD)&((({0}*)THIS)->{1}));",
                _currentClassName, symbol.Name);
        }
    }

    //If it's an array, obtain the address of the right
    //element by adding the index which is on the stack.
    //We need a scratch location because issuing two __POP()
    //calls in the same statement does not guarantee
    //left-to-right evaluation.
    if (withArrayIndex)
    {
        //The array address is now on the stack, but we
        //really need the address of the first element.
        //Hence the dereference:
        Output.WriteLine("__SCRATCH2 = *(__WORD*)__POP();");
        //This is the RHS value that we ought to put in
        //the array element:
        Output.WriteLine("__SCRATCH1 = __POP();");
        //Finally, the top of the stack contains the value
        //of the array indexing expression, i.e. the
        //element index:
        Output.WriteLine(
            "*(((__WORD*)__SCRATCH2) + __POP()) = __SCRATCH1;");
    }
    else
    {
        Output.WriteLine(
            "__SCRATCH1 = __POP();"); //This is the LHS
        Output.WriteLine(
            "*((__WORD*)__SCRATCH1) = __POP();");
    }
}

public override void Add()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine("__PUSH(__SCRATCH1 + __POP());");
}

public override void Sub()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine("__PUSH(__POP() - __SCRATCH1);");
}

public override void Mul()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine("__PUSH(__POP() * __SCRATCH1);");
}

public override void Div()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine("__PUSH(__POP() / __SCRATCH1);");
}

public override void Mod()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine("__PUSH(__POP() % __SCRATCH1);");
}

public override void And()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine("__PUSH(__POP() & __SCRATCH1);");
}

public override void Or()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine("__PUSH(__POP() | __SCRATCH1);");
}

public override void Less()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine(
        "__PUSH(__POP() < __SCRATCH1 ? -1 : 0);");
}

public override void Greater()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine(
        "__PUSH(__POP() > __SCRATCH1 ? -1 : 0);");
}

public override void Equal()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine(
        "__PUSH(__POP() == __SCRATCH1 ? -1 : 0);");
}

public override void LessOrEqual()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine(
        "__PUSH(__POP() <= __SCRATCH1 ? -1 : 0);");
}

public override void GreaterOrEqual()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine(
        "__PUSH(__POP() >= __SCRATCH1 ? -1 : 0);");
}

public override void NotEqual()
{
    Output.WriteLine("__SCRATCH1 = __POP();");
    Output.WriteLine(
        "__PUSH(__POP() != __SCRATCH1 ? -1 : 0);");
}

public override void IntConst(int value)
{
    Output.WriteLine("__PUSH({0});", value);
}

public override void StrConst(string value)
{
    Output.WriteLine("__PUSH((__WORD)\"{0}\");", value);
}

public override void True()
{
    IntConst(-1);
}
public override void False()
{
    IntConst(0);
}

public override void Null()
{
    IntConst(0);
}
public override void This()
{
    Output.WriteLine("__PUSH(THIS);");
}

public override void Negate()
{
    Output.WriteLine("__PUSH(-__POP());");
}

public override void Not()
{
    Output.WriteLine("__PUSH(~__POP());");
}

public override void VariableRead(
    Token varName, bool withArrayIndex)
{
    //Put the value of the variable on the top of
    //the stack. If it's an array, the value is the
    //address of the array's first element.
    if (MethodSymTable.HasSymbol(varName.Value))
    {
        Symbol symbol =
            MethodSymTable.GetSymbol(varName.Value);
        Output.WriteLine("__PUSH({0});", symbol.Name);
    }
    else
    {
        Symbol symbol =
            ClassSymTable.GetSymbol(varName.Value);
        if (symbol.Kind == SymbolKind.Static)
        {
            Output.WriteLine("__PUSH({0});",
                FormatStaticName(symbol.Name));
        }
        else if (symbol.Kind == SymbolKind.Field)
        {
            Output.WriteLine(
                "__PUSH((({0}*)THIS)->{1});",
                _currentClassName, symbol.Name);
        }
    }
    //If it's an array, dereference it using [].
    if (withArrayIndex)
    {
        Output.WriteLine("__SCRATCH1 = __POP();");
        Output.WriteLine(
            "__PUSH( ((__WORD*)__SCRATCH1)[ __POP() ] );");
    }
}

As always, most of the code (other than the part that deals with variables and assignments) writes itself. This gives us the foundation upon which we can build control statements and the general program structure, up next.
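
The array paths in Assignment and VariableRead are easier to follow when the emitted C is written out by hand. The sketch below shows what the output might look like for the Jack statement let a[2] = 42; followed by a read of a[2]. It rests on several assumptions: the stack helpers (stack_push, stack_pop, stack_top, STACK_CAPACITY) are hypothetical, __WORD is intptr_t rather than int so that addresses fit in a word, and the push order for the assignment (index, then right-hand side, then variable address) is inferred from the comments in Assignment rather than shown by the parser.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t __WORD;   /* assumption: pointer-sized instead of int */

#define STACK_CAPACITY 1024              /* hypothetical size */
static __WORD __STACK[STACK_CAPACITY];
static int stack_top = 0;                /* hypothetical: next free slot */
static __WORD __SCRATCH1, __SCRATCH2;    /* scratch "registers" */

/* Hypothetical helpers behind the __PUSH and __POP macros. */
static void stack_push(__WORD value)
{
    assert(stack_top < STACK_CAPACITY);
    __STACK[stack_top++] = value;
}

static __WORD stack_pop(void)
{
    assert(stack_top > 0);
    return __STACK[--stack_top];
}

#define __PUSH(v) stack_push((__WORD)(v))
#define __POP()   stack_pop()

static __WORD array_example(void)
{
    __WORD storage[4] = {0, 0, 0, 0};
    /* A Jack array variable holds the address of element 0. */
    __WORD a = (__WORD)storage;

    /* let a[2] = 42; */
    __PUSH(2);                           /* index expression */
    __PUSH(42);                          /* right-hand side */
    __PUSH((__WORD)&a);                  /* address of the variable a */
    __SCRATCH2 = *(__WORD*)__POP();      /* address of the first element */
    __SCRATCH1 = __POP();                /* value to store */
    *(((__WORD*)__SCRATCH2) + __POP()) = __SCRATCH1;

    /* Reading a[2]: IntConst(2), then VariableRead(a, true). */
    __PUSH(2);
    __PUSH(a);
    __SCRATCH1 = __POP();
    __PUSH(((__WORD*)__SCRATCH1)[__POP()]);

    return __POP();                      /* value read back from a[2] */
}
```

Each statement contains at most one __POP() call whose result order matters, which is the property the scratch variables are there to guarantee.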