Debugging Silverlight Applications: SOS.DLL

August 25, 2008


I’ve written about SOS.DLL before – it is the ultimate Swiss-army knife for debugging managed applications.  Memory leaks, unexpected crashes, high-CPU scenarios, hangs and deadlocks can all be pinpointed using SOS.

Silverlight 2.0 offers a model for managed code execution inside the browser, and that managed code will require the same debugging capabilities needed for standard managed applications.  The Silverlight CLR (coreclr.dll) is a subset of the desktop CLR, but fortunately it ships with a version of SOS that can be used to debug Silverlight applications in the browser.

This special SOS is available as part of the Silverlight Developer Runtime (Windows version), and is installed by default to C:\Program Files\Microsoft Silverlight\<Version> (currently 2.0.30523.8).

To demonstrate some of the capabilities, I’ve written a trivial Silverlight application with three buttons, one for each debugging scenario:

[Image: the demo application with its three buttons]

Let’s click the “Leak Memory” button a few times.  Now we can capture a dump of the web browser process, or attach WinDbg to it, exactly as we would with any other process.  (I’m deliberately saying “web browser process” and not “Internet Explorer”, because every single one of these demos can be reproduced with Firefox, for example.)

Once the debugger is attached, we can load the Silverlight SOS into the debugger (using the .load command with the full path to the sos.dll in the directory above) and execute the standard commands for diagnosing a memory leak:

0:014> !dumpheap -stat
total 11445 objects
Statistics:
      MT    Count    TotalSize Class Name
<snipped for clarity>
03845670      228        25644 System.Object[]
03845f9c      913        46164 System.String
03ad3ad4     5855        70260 System.RuntimeTypeHandle
03ade400       17     13632144 System.Byte[]
Total 11445 objects
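
The table is short here, but in a real dump it can run to thousands of rows. One way to rank it is to save the output to text and sort it by TotalSize offline. The following is only a minimal sketch, assuming the column layout shown above; Python is used purely as a convenient scripting language, and the function names are made up for illustration and are not part of SOS or WinDbg.

```python
import re

# Sample rows copied from the !dumpheap -stat output above.
DUMPHEAP_STAT = """\
      MT    Count    TotalSize Class Name
03845670      228        25644 System.Object[]
03845f9c      913        46164 System.String
03ad3ad4     5855        70260 System.RuntimeTypeHandle
03ade400       17     13632144 System.Byte[]
Total 11445 objects
"""

# One statistics row: method table (hex), instance count, total size, class name.
ROW = re.compile(r"^\s*([0-9a-fA-F]{8,16})\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_dumpheap_stat(text):
    """Return one dict per statistics row; header and summary lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            mt, count, total_size, name = match.groups()
            rows.append({
                "mt": mt,
                "count": int(count),
                "total_size": int(total_size),
                "name": name,
            })
    return rows


def largest_types(text, top=3):
    """Return the `top` rows with the largest TotalSize, biggest first."""
    rows = parse_dumpheap_stat(text)
    return sorted(rows, key=lambda row: row["total_size"], reverse=True)[:top]


if __name__ == "__main__":
    for row in largest_types(DUMPHEAP_STAT):
        average = row["total_size"] / row["count"]
        print(f'{row["name"]}: {row["total_size"]} bytes in {row["count"]} objects '
              f'(about {average:.0f} bytes each)')
```

Dividing TotalSize by Count is a quick way to tell a few very large objects apart from many small ones.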

OK, so just 17 byte arrays account for roughly 13 MB of the heap.  Who is holding a reference to them?

0:014> .foreach (obj {!dumpheap -type System.Byte[] -short}) {!gcroot obj}
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 4 OSTHread d0
Scan Thread 12 OSTHread d40
Scan Thread 13 OSTHread eb4
<snipped for clarity>
  04e10f34(DebuggingSilverlight.Page)->
  04e10f84(…List`1[[System.Byte[], mscorlib]])->
  04e20f10(System.Byte[][])->
  06505990(System.Byte[])
<snipped for clarity>
  04e10f34(DebuggingSilverlight.Page)->
  04e10f84(…List`1[[System.Byte[], mscorlib]])->
  04e20f10(System.Byte[][])->
  06a05a30(System.Byte[])

We can conclude that our byte arrays are being held by the main Silverlight page, through a list of byte arrays.  We can detect the name of the list field by inspecting the page object (04e10f34) and looking for the address of the list (04e10f84):

0:014> !do 04e10f34
Name: DebuggingSilverlight.Page
MethodTable: 03b137dc
EEClass: 03b11954
Size: 80(0x50) bytes
(DebuggingSilverlight, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
<snipped for clarity>
0488ad68  4000009       44 …yte[], mscorlib]]  0 instance 04e10f84 leakSource

There we have it: a field called leakSource, of type List<byte[]>, holds references to all the leaking byte arrays.
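
Scanning the field table by eye works for a small object like this one. For a type with dozens of fields, the same lookup could be scripted against saved !do output. This is a rough sketch, assuming the column order shown above (the last two columns being Value and Name); the helper name is hypothetical, and the truncated type column is kept exactly as the debugger printed it.

```python
# Sample lines copied from the !do output above.
DO_OUTPUT = """\
Name: DebuggingSilverlight.Page
MethodTable: 03b137dc
EEClass: 03b11954
Size: 80(0x50) bytes
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
0488ad68  4000009       44 …yte[], mscorlib]]  0 instance 04e10f84 leakSource
"""


def find_field_name(do_output, address):
    """Return the name of the first field whose Value column equals `address`.

    The Type column may itself contain spaces, so the row is read from the
    right: the last token is the field name, the one before it is the value.
    """
    wanted = address.lower()
    for line in do_output.splitlines():
        parts = line.split()
        # A field row has at least MT, Field, Offset, Type, VT, Attr, Value, Name.
        if len(parts) >= 8 and parts[-2].lower() == wanted:
            return parts[-1]
    return None


if __name__ == "__main__":
    print(find_field_name(DO_OUTPUT, "04e10f84"))
```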

Let’s try another scenario.  Clicking the “Crash” button causes the Silverlight application to throw an unhandled exception.  When running without a debugger attached, this produces the “Error on page” icon and in the details dialog we can see the exception information:

[Image: the browser’s error details dialog showing the exception information]

When a debugger is attached, this causes a first-chance exception in the debugger, and gives us the opportunity to inspect the exception information:

(e54.d0): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
<snipped for clarity>

We can examine the thread stack and see exactly where the exception is coming from:

0:004> !CLRStack
OS Thread Id: 0xd0 (4)
ESP       EIP    
0350f3a0 023204e7 DebuggingSilverlight.Page.ButtonCrash_Click(…)
0350f3c8 0491cced System.Windows.Controls.Primitives.ButtonBase.OnClick()
0350f3e0 0491cbf0 System.Windows.Controls.Button.OnClick()
<snipped for clarity>

All the standard SOS tricks apply – we can obtain the IL of the crashing method, view annotated disassembly, dump stack objects etc.

The final scenario (with the “Hang” button) causes the browser to stop responding.  We can break in with a debugger and inspect thread stacks to find the problematic stack:

0:004> !CLRStack
OS Thread Id: 0xd0 (4)
ESP       EIP    
0350f35c 76e59184 [HelperMethodFrame: 0350f35c] System.Threading.Thread.SleepInternal(Int32)
0350f3a8 02261fef System.Threading.Thread.Sleep(Int32)
0350f3b4 033e04cc DebuggingSilverlight.Page.ButtonHang_Click(…)
0350f3c8 0491cced System.Windows.Controls.Primitives.ButtonBase.OnClick()
0350f3e0 0491cbf0 System.Windows.Controls.Button.OnClick()
<snipped for clarity>

Someone is calling Thread.Sleep, which causes the hang.  What timeout was passed?  The usual kb shows the arguments handed down to the native sleep call:

0:004> kb
ChildEBP RetAddr  Args to Child             
<snipped for clarity>
0350f2a4 6700a753 ffffffff … kernel32!SleepEx+0x52

Well, it’s an infinite timeout (0xffffffff).  The browser is not going to become responsive anytime soon.
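
Why does 0xffffffff mean "infinite"?  Win32 defines INFINITE as 0xFFFFFFFF, and the same 32 bits read as a signed integer are -1, which is the value of System.Threading.Timeout.Infinite in .NET.  So the managed call was presumably Thread.Sleep(-1) or Thread.Sleep(Timeout.Infinite), although the dump alone does not tell us which spelling the source used.  A small sketch of the reinterpretation (the function and constant names are just for illustration):

```python
import struct

WIN32_INFINITE = 0xFFFFFFFF   # INFINITE as passed to kernel32!SleepEx
DOTNET_TIMEOUT_INFINITE = -1  # System.Threading.Timeout.Infinite


def as_signed_int32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


if __name__ == "__main__":
    # The argument seen in the kb output, read the way the managed
    # Thread.Sleep(Int32) signature would see it.
    print(as_signed_int32(WIN32_INFINITE) == DOTNET_TIMEOUT_INFINITE)
```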

What’s even more interesting is that the SOS included in the Silverlight Developer Runtime has more commands than the standard SOS bundled with the desktop CLR!  But that’s best reserved for a future post.
