Pinpointing a Static GC Root with SOS

February 7, 2012

NOTE: if you’re not familiar with SOS (a WinDbg extension for managed code) and leak detection with !gcroot, start by reading an introductory post on the subject.

A typical root reference chain for a managed object that is retained by a static GC root would have a pinned object array appear as the rooted object. Here is a typical reference chain:

0:010> !gcroot 0000000002bcaf58

This object array is ubiquitous; it would seem that all static root references stem from it. Indeed (and this is a CLR implementation detail), static fields are stored in this array, and as far as the GC is concerned they are retained through it.

This also makes it difficult to determine which static field of which class is responsible for the static reference. For example, in the reference chain above, it is apparent that there is a static EventHandler-typed field (which is likely an event) that retains the FileInformation instance – but it’s very desirable to find the details of that static field.

More than six years ago Doug Stewart wrote a short blog post outlining the general process in cases like these. This process generally works, but requires some adaptation in the 64-bit era, so here goes.

First, let’s take a look at that rooted array:

0:010> !do 0000000012761018
Name: System.Object[]
MethodTable: 000007fef68858f8
EEClass: 000007fef649eb78
Size: 8192(0x2000) bytes
Array: Rank 1, Number of elements 1020, Type CLASS
Element Type: System.Object

OK, so it’s an array with 1020 elements, and one of these elements must be our event handler. Is that the case? Let’s see:

0:010> s -q 0000000012761018 L2000 00000000039b3c30
00000000`12762e10  00000000`039b3c30 00000000`0278b380

Sure enough, our event handler is one of the array elements, at the address 00000000`12762e10. Now there are two key observations:

  1. The EventHandler instance ended up in the array somehow. Maybe if we can find other references to this array address, we can find who put it there and then determine whose static field it is.
  2. There is a reference from that EventHandler instance to one of our application’s objects (eventually). Then there should be additional references to this array address, which shape the chain of references to our application’s object.
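
As an aside, the numbers above are enough to estimate which array slot holds the handler. The following is a minimal sketch, not something SOS reports directly: it assumes that the Size shown by !do includes an 8-byte object header sitting just before the object address, and that the elements occupy the tail of the object. The constant names are made up for illustration.

```python
# Values copied from the !do and s -q output above.
ARRAY_ADDR = 0x12761018      # address of the System.Object[] instance
ARRAY_SIZE = 0x2000          # "Size: 8192(0x2000) bytes"
ELEMENT_COUNT = 1020         # "Number of elements 1020"
SLOT_ADDR = 0x12762E10       # where the EventHandler reference was found
POINTER_SIZE = 8             # 64-bit process

# ASSUMPTION: 8 bytes of the reported size are an object header located
# immediately before ARRAY_ADDR (not confirmed by the output above).
OBJECT_HEADER_SIZE = 8


def slot_index():
    """Return (index, remainder) of SLOT_ADDR within the array's element area."""
    if not ARRAY_ADDR <= SLOT_ADDR < ARRAY_ADDR + ARRAY_SIZE:
        raise ValueError("slot address lies outside the array")
    # Bytes in the object that are not element storage.
    overhead = ARRAY_SIZE - ELEMENT_COUNT * POINTER_SIZE
    # Offset of element 0 relative to the object address.
    first_element_offset = overhead - OBJECT_HEADER_SIZE
    return divmod(SLOT_ADDR - ARRAY_ADDR - first_element_offset, POINTER_SIZE)


index, remainder = slot_index()
print(f"slot index {index} (remainder {remainder}) of {ELEMENT_COUNT}")
```

Under these assumptions the handler sits in slot 956, with a remainder of zero, which at least confirms that the address falls on an element boundary inside the array.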

Frankly, both of these are long shots, because it might be the case that the address is calculated dynamically, but let’s give it a spin. Doug’s original guidance at this point is to launch a memory search for any references to the array location, which would complete in a few seconds for a 32-bit address space; not so much for a 64-bit address space!

However, we are looking for references in managed code only, so no need to traverse the entire address space. It suffices to look at the address ranges of modules in the current AppDomain:

0:010> !dumpdomain
Domain 1: 0000000000c1c5f0
LowFrequencyHeap: 0000000000c1c638
HighFrequencyHeap: 0000000000c1c6c8
StubHeap: 0000000000c1c758
Stage: OPEN
SecurityDescriptor: 0000000000c1de90
Name: FileExplorer.exe
Assembly: 0000000000c3cd80 [C:\Windows\assembly\GAC_64\mscorlib\\mscorlib.dll]
ClassLoader: 0000000000c3ce40
SecurityDescriptor: 0000000000c3cc40
  Module Name
000007fef6461000 C:\Windows\assembly\GAC_64\mscorlib\\mscorlib.dll
000007ff000f2568 C:\Windows\assembly\GAC_64\mscorlib\\sortkey.nlp
000007ff000f2020 C:\Windows\assembly\GAC_64\mscorlib\\sorttbls.nlp
Assembly: 0000000000c57480 [D:\courses\NET Debugging\Exercises\4_MemoryLeak\Binaries\FileExplorer.exe]
ClassLoader: 0000000000c57540
SecurityDescriptor: 0000000000c57390
  Module Name
000007ff000433d0 D:\courses\NET Debugging\Exercises\4_MemoryLeak\Binaries\FileExplorer.exe
…many more of these guys…

Now we have a couple of module addresses and can constrain our memory search. It seems safe to start at 7ff`00000000 and go through a few hundred megabytes looking for our address. Generally speaking, the proper WinDbg command here would be:

0:010> s -q 000007ff`00000000 L?00000000`40000000 00000000`12762e10

(…we are looking for a full QWORD.) The problem is that we might miss unaligned references to that address, which may occur if it is hardcoded into some instruction (e.g. a MOV). So instead we should search for the individual bytes of the address, remembering that we are on a little-endian architecture:

0:010> s -b 000007ff`00000000 L?00000000`40000000 10 2e 76 12
000007ff`001913d3  10 2e 76 12 00 00 00 00-48 8b 00 48 89 44 24 60  ..v.....H..H.D$`
000007ff`00191440  10 2e 76 12 00 00 00 00-48 8b d0 e8 60 c1 87 f7  ..v.....H...`...
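
The byte pattern passed to s -b does not have to be reversed by hand. Here is a small sketch (the helper name is hypothetical) that derives it from the slot address and also checks the alignment of the two hits. The alignment check assumes, as the discussion above suggests, that a QWORD search only examines addresses at 8-byte steps from the start of the range.

```python
import struct


def search_pattern(address, significant_bytes=4):
    """Format the low bytes of `address`, least significant first, for `s -b`."""
    raw = struct.pack("<Q", address)  # 8 bytes in little-endian order
    return " ".join(f"{b:02x}" for b in raw[:significant_bytes])


slot = 0x12762E10  # address of the array slot found earlier
pattern = search_pattern(slot)
print(pattern)

# The two hits reported by the byte search above.
hits = (0x7FF001913D3, 0x7FF00191440)
for hit in hits:
    kind = "8-byte aligned" if hit % 8 == 0 else "unaligned"
    print(f"{hit:#x}: {kind}")
```

Only one of the two hits lands on an 8-byte boundary, so a search restricted to aligned QWORDs would plausibly have reported just that one.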

Voila! Two references to the array location, and now let’s take a look at them with the !u command to see if they are code:

0:010> !u 000007ff`001913d3
Normal JIT generated code
Begin 000007ff001912d0, size 18d
000007ff`001913d0 90              nop
000007ff`001913d1 48b8102e761200000000 mov rax,12762E10h
000007ff`0019143e 48b9102e761200000000 mov rcx,12762E10h
000007ff`00191448 488bd0          mov     rdx,rax

They are both a match inside FileInformation’s constructor, which gives us an excellent clue where to look. (The rest of the analysis is not shown here – you would look at the constructor’s source code and identify the event in question.)
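
To see why the search hits land two bytes past the start of each instruction, it helps to decode the instruction bytes from the listing. The sketch below handles only the one encoding that appears there (a REX.W prefix, an opcode in the B8 to BF range, then a 64-bit little-endian immediate); the function name is made up, and this is in no way a general disassembler.

```python
def decode_mov_imm64(code_hex):
    """Decode a 10-byte `mov r64, imm64` (REX.W + B8+r) into (register, immediate)."""
    code = bytes.fromhex(code_hex)
    if len(code) != 10 or code[0] != 0x48 or not 0xB8 <= code[1] <= 0xBF:
        raise ValueError("not a 10-byte mov r64, imm64 with a plain REX.W prefix")
    registers = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
    # The immediate follows the prefix and opcode bytes, least significant byte first.
    return registers[code[1] - 0xB8], int.from_bytes(code[2:], "little")


# Instruction start addresses and raw bytes copied from the !u output above.
listing = {
    0x7FF001913D1: "48b8102e761200000000",
    0x7FF0019143E: "48b9102e761200000000",
}
for start, code_hex in listing.items():
    register, immediate = decode_mov_imm64(code_hex)
    # The immediate begins 2 bytes into the instruction: that is the search hit.
    print(f"{start + 2:#x}: mov {register}, {immediate:#x}")
```

Both immediates decode to the array slot address, and adding 2 to each instruction's start address gives the two addresses reported by the byte search.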

This analysis process is rather tedious, but in the absence of a profiler capable of performing it for you, it’s yet another useful skill to add to your memory leak detection toolkit.

I am posting short updates and links on Twitter as well as on this blog. You can follow me: @goldshtn
This blog post was also cross-posted to CodeProject.
