April 2011 - Posts - All Your Base Are Belong To Us

All Your Base Are Belong To Us

Mostly .NET internals and other kinds of gory details

April 2011 - Posts

DBX vs. Visual Studio and WinDbg: Part 2C, Memory Access Breakpoints, Revisited

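This is where we are in the series:

  • Calling a function
  • Configuring breakpoints
  • Tracing execution
  • Execution control
  • Displaying data, including STL collections
  • Runtime application checking
  • Miscellaneous commands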


Today’s post revisits the concept of memory access breakpoints. Before I started this series, I wrote a post showing a proof of concept technique for setting a memory breakpoint on a wide range of memory using SDbgExt’s !vprotect command, PAGE_GUARD pages, and WinDbg. However, there’s more to be said about memory access breakpoints than meets the eye.

As I mentioned in the previous post, hardware breakpoints on Intel x86/x64 CPUs can be configured to fire on reads or writes to a specific memory location, whose size cannot exceed the size of the system word*. DBX, on the other hand, allows access breakpoints on memory ranges of arbitrary size. Furthermore, DBX allows you to configure the breakpoint to fire before or after the memory access has taken place, as opposed to hardware breakpoints, which always fire after the access has occurred.
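To get a feel for how tight the hardware limit is, here is a rough sketch (not taken from any debugger's source) that splits a memory range into the naturally aligned 1-, 2-, 4- or 8-byte blocks that a single x86/x64 debug register can watch. The names Watch, cover_range and fits_in_hardware are made up for illustration. The only facts assumed are that there are four debug address registers (DR0 to DR3) and that the 8-byte length is generally available only to 64-bit code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One hardware watchpoint: a naturally aligned block of 1, 2, 4 or 8 bytes.
struct Watch {
    std::uint64_t addr;
    unsigned len;
};

// Number of debug address registers (DR0-DR3) on x86/x64.
constexpr std::size_t kDebugRegisters = 4;

// Greedily split [addr, addr + size) into aligned blocks of at most max_len
// bytes. Pass 4 for 32-bit code and 8 for 64-bit code.
std::vector<Watch> cover_range(std::uint64_t addr, std::size_t size, unsigned max_len) {
    assert(max_len == 4 || max_len == 8);
    std::vector<Watch> blocks;
    while (size > 0) {
        unsigned len = max_len;
        // Shrink until the block is aligned at addr and fits in what is left.
        while (len > 1 && (addr % len != 0 || len > size)) {
            len /= 2;
        }
        blocks.push_back({addr, len});
        addr += len;
        size -= len;
    }
    return blocks;
}

// True if the whole range can be watched with the available registers.
bool fits_in_hardware(std::uint64_t addr, std::size_t size, unsigned max_len) {
    return cover_range(addr, size, max_len).size() <= kDebugRegisters;
}
```

Fed with the addresses from the DBX session in this post, a 12-byte range at 0x8068f04 splits into three 4-byte blocks and still fits, while a 1000-byte range at 0x8047846 splits into 251 blocks, far more than four registers can cover.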

Below is an example of using DBX’s memory access breakpoints:

(dbx) list
   28     getchar();
   29     ms2.next = &ms2;
   30     getchar();
   31     ms.y = 14;
   32     ms1.y = 13;
   33     getchar();
(dbx) stop access wb &ms
(3) stop access wb &ms, 12
(dbx) cont
watchpoint wb &ms (0x8068f04[12]) at line 31 in file "stl.cc"
   31     ms.y = 14;
(dbx) print ms.y
ms.y = 0

...

(dbx) list
   26     getchar();
   27     a.arr[678] = 'a';
   28     getchar();
(dbx) stop access w &a
(3) stop access wa &a, 1000
(dbx) cont
watchpoint wa &a (0x8047846[1000]) at line 27 in file "stl.cc"
   27     a.arr[678] = 'a';
(dbx) print a.arr[678]
a.arr[678] = 'a'

There are two things worth noting here: first, there is no need to specify the size of anything—DBX deduces size information from the variable itself; second, the memory breakpoint indeed works for a range that is larger than what hardware breakpoints allow.

There’s yet another feature of DBX that has to do with watching memory locations. You can use the stop change myVar command to watch for any change in the value of a particular variable. Unlike memory breakpoints, this feature relies on automatically single-stepping through your entire program and checking the variable’s value after each step, and is therefore considered expensive.

Memory Breakpoints in the Managed World
Both DBX and Visual Studio have support for debugging managed languages: in DBX it’s Java, in Visual Studio it’s any .NET language such as C#. The problem with these languages is that they support garbage collection, so objects keep moving around in memory, making it difficult to track them using memory breakpoints.

Indeed, the only memory-breakpoint-related feature that DBX supports for Java debugging is setting a breakpoint on modification of class fields:

(dbx) list
>    9       System.in.read();
    10       x = 43;
    11       e.field = x;
    12       }
(dbx) stop access wb Example.field
(3) java stop access wb Example.field
(dbx) cont
stopped in Example.main at line 11 in file "Example.java"
   11       e.field = x;
(dbx) status
(2) java stop inmethod main
*(3) java stop access wb Example.field

However, there is no support for watching a local variable, or watching modifications to the fields of a particular instance.

Visual Studio, on the other hand, has no support at all for setting memory breakpoints on managed objects. And while WinDbg can be tricked into setting such breakpoints using the ba command, they will not follow the object around when the garbage collector decides to move it.

Still, not all is lost. It is possible (at least theoretically) to set a memory breakpoint on a managed object that would follow it around in memory. I can think of at least two approaches—which aren’t easy to implement, don’t get me wrong—which can come close to accomplishing this:

  1. A WinDbg extension can receive some identifying information about the object on which a breakpoint needs to be set (e.g. the name of a local variable in a live stack frame). After each garbage collection—of which the extension can be aware by setting a breakpoint on mscorwks!*RestartEE—the extension can reset the hardware breakpoint using the identifying information above.
  2. [a more robust approach] Use the CLR profiling callback interfaces, specifically the ICorProfilerCallback::MovedReferences method, to implement object tracking and follow the object around memory, resetting the hardware breakpoint every time the object is moved.
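The bookkeeping behind the second approach is small. ICorProfilerCallback::MovedReferences reports each compaction as parallel arrays of old range starts, new range starts and range lengths, and the sketch below only shows how a tracked address could be remapped from such arrays. It is a standalone model, not a working profiler: ObjectID is a stand-in typedef so the code compiles without the CLR headers, the lengths use std::size_t rather than the API's own integer type, and relocate is a hypothetical helper. Re-arming the hardware breakpoint at the returned address is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the profiling API's ObjectID (an address-sized integer).
using ObjectID = std::uintptr_t;

// Given the parallel arrays that a MovedReferences-style callback receives,
// return where `tracked` lives after the garbage collection. If the address
// is not inside any moved range, the object stayed where it was.
ObjectID relocate(ObjectID tracked,
                  std::size_t range_count,
                  const ObjectID old_start[],
                  const ObjectID new_start[],
                  const std::size_t range_length[]) {
    assert(range_count == 0 || (old_start && new_start && range_length));
    for (std::size_t i = 0; i < range_count; ++i) {
        // Each range is [old_start[i], old_start[i] + range_length[i]).
        if (tracked >= old_start[i] && tracked - old_start[i] < range_length[i]) {
            return new_start[i] + (tracked - old_start[i]);
        }
    }
    return tracked;
}
```

A real extension would call something like this from the callback and then reset the breakpoint whenever the returned address differs from the tracked one.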

If anyone is interested in collaborating on the development of either approach, feel free to let me know through the contact form or the comments :-)


* Curiously, this limitation does not apply to Itanium processors, where a memory access breakpoint can be set on a range of up to 2^55 bytes. [cf. Intel Itanium Manual, Volume 2, Part 1, pp. 2:152-153]

DBX vs. Visual Studio and WinDbg: Part 2A, Configuring Multiple Breakpoints

This is where we are in the series:

  • Calling a function
  • Configuring breakpoints
  • Tracing execution
  • Execution control
  • Displaying data, including STL collections
  • Runtime application checking
  • Miscellaneous commands

Today’s post is about configuring multiple breakpoints at once in a convenient fashion. If you’re used to setting your breakpoints one at a time, you may find it very convenient to reason about breakpoints in a more flexible way, which is something DBX excels at.

DBX
Here are some of the breakpoint types that you can use in DBX to configure multiple breakpoints at once:

  • stop inmember execute: inserts breakpoints in all methods called “execute”; very useful for virtual methods or when you’re too lazy to type the exact class name
  • stop inclass vector: inserts breakpoints in all the methods of the class “vector”; an additional -recurse switch controls whether to consider derived classes
  • stop inobject &obj: inserts breakpoints in all the member functions of the class that is the type of obj, and stops only when obj is the this parameter (i.e. only when the methods are called on the specified instance)

Here’s a couple of examples:

(dbx) restore
stopped in main at line 11 in file "stl.cc"
   11     std::map<int,std::vector<float> > m;
(dbx) list
   11     std::map<int,std::vector<float> > m;
   12     std::vector<float> v;
   13     v.push_back(4.0f);
   14     v.push_back(5.0f);
   15     m[5] = v;
   16     global = 2;
   17     getchar();
   18   }
(dbx) stop inclass std::vector<float,std::allocator<float> >
(3) stop inclass std::vector<float,std::allocator<float> >
(dbx) cont
stopped in std::vector<float, std::allocator, <float>void>::vector at line 182 in file "stl_vector.h"
  182         : _Base(__a) { }
(dbx) cont
stopped in std::vector<float, std::allocator, <float>void>::push_back at line 558 in file "stl_vector.h"
  558       if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
(dbx) where
=>[1] std::vector<float, std::allocator, <float>void>::push_back(this = 0x8047950, __x = 4.0), line 558 in "stl_vector.h"
  [2] main(), line 13 in "stl.cc"

...

(dbx) stop infunction main
(2) stop infunction main
(dbx) run
Running: stl
(process id 3740)
stopped in main at line 13 in file "stl.cc"
   13     v.push_back(4.0f);
(dbx) stop inobject &v2
(3) stop inobject (class vector<float,std::allocator<float> > *) &v2 (0x8047950)
(dbx) cont
stopped in std::vector<float, std::allocator, <float>void>::push_back at line 558 in file "stl_vector.h"
  558       if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
(dbx) where
=>[1] std::vector<float, std::allocator, <float>void>::push_back(this = 0x8047950, __x = 6.0), line 558 in "stl_vector.h"
  [2] main(), line 15 in "stl.cc"

These options are very flexible and work incredibly well—and it seems to me that once you’re used to them, it’s rather hard to let go :-)

WinDbg
The only thing WinDbg has that can be used to match the power of the DBX commands above is the bm command, which configures a symbol breakpoint based on a pattern. The pattern syntax has basic regular expression features such as [A-z], f[0-9]+, and so on.
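To make the idea of a pattern-based symbol breakpoint concrete, here is a toy matcher that handles only the * and ? wildcards. It is a sketch for illustration (wildcard_match is a made-up name) and not WinDbg's implementation, which also supports the bracket and repetition forms mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `text` matches `pattern`, where '*' matches any run of
// characters (including none) and '?' matches exactly one character.
// Iterative matching that backtracks to the most recent '*'.
bool wildcard_match(const std::string& pattern, const std::string& text) {
    std::size_t p = 0, t = 0;
    std::size_t star = std::string::npos;  // position of the last '*' seen
    std::size_t resume = 0;                // text position to retry from
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;      // remember the '*' and first try matching nothing
            resume = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (star != std::string::npos) {
            p = star + 1;    // let the last '*' swallow one more character
            t = ++resume;
        } else {
            return false;
        }
    }
    // Only trailing '*' characters may remain in the pattern.
    while (p < pattern.size() && pattern[p] == '*') {
        ++p;
    }
    return p == pattern.size();
}
```

Every symbol for which such a match succeeds gets its own breakpoint, which is why a short pattern can produce dozens of entries in the breakpoint list.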

With this command in mind, the following are examples of how to approximate the inmember and inclass breakpoint types:

0:000> bm *!std::vector*
  1: 01084e40 @!"myapp!std::vector<float,std::allocator<float> >::reserve"
  2: 010836d0 @!"myapp!std::vector<float,std::allocator<float> >::capacity"
  3: 01083720 @!"myapp!std::vector<float,std::allocator<float> >::clear"
  4: 01082630 @!"myapp!std::vector<float,std::allocator<float> >::~vector<float,std::allocator<float> >"
  5: 01083c10 @!"myapp!std::vector<float,std::allocator<float> >::_Orphan_range"
  6: 01083990 @!"myapp!std::vector<float,std::allocator<float> >::_Destroy"
  7: 010851a0 @!"myapp!std::vector<float,std::allocator<float> >::erase"
  8: 010838b0 @!"myapp!std::vector<float,std::allocator<float> >::_Buy"
  9: 010826d0 @!"myapp!std::vector<float,std::allocator<float> >::operator="
10: 01082900 @!"myapp!std::vector<float,std::allocator<float> >::size"
11: 01088440 @!"myapp!std::vector<float,std::allocator<float> >::_Ucopy<float *>"
12: 01087550 @!"myapp!std::vector<float,std::allocator<float> >::_Assign_rv"
13: 01083b30 @!"myapp!std::vector<float,std::allocator<float> >::_Tidy"
14: 01085140 @!"myapp!std::vector<float,std::allocator<float> >::max_size"
15: 01085440 @!"myapp!std::vector<float,std::allocator<float> >::_Grow_to"
16: 01083a00 @!"myapp!std::vector<float,std::allocator<float> >::_Inside"
17: 010824a0 @!"myapp!std::vector<float,std::allocator<float> >::push_back"
18: 01086860 @!"myapp!std::vector<float,std::allocator<float> >::vector<float,std::allocator<float> >"
19: 01082430 @!"myapp!std::vector<float,std::allocator<float> >::vector<float,std::allocator<float> >"
20: 010854f0 @!"myapp!std::vector<float,std::allocator<float> >::_Xlen"
21: 01086930 @!"myapp!std::vector<float,std::allocator<float> >::_Make_iter"
22: 01083a80 @!"myapp!std::vector<float,std::allocator<float> >::_Reserve"
23: 01085020 @!"myapp!std::vector<float,std::allocator<float> >::begin"
24: 010850b0 @!"myapp!std::vector<float,std::allocator<float> >::end"
25: 01088790 @!"myapp!std::vector<float,std::allocator<float> >::_Umove<float *>"
26: 01087cf0 @!"myapp!std::vector<float,std::allocator<float> >::get_allocator"

0:000> bm /a *!*CreateProcess*
27: 61c51118 @!"MSVCR100D!_imp__CreateProcessW"
28: 61c5110c @!"MSVCR100D!_imp__CreateProcessA"
29: 751ec9c5 @!"kernel32!CreateProcessAsUserW"
30: 751d4a7a @!"kernel32!BasepCreateProcessParameters"
31: 751d37bf @!"kernel32!BasepConstructSxsCreateProcessMessage"
32: 751c0654 @!"kernel32!_imp__RtlCreateProcessParametersEx"
33: 751d4f21 @!"kernel32!BasepReleaseSxsCreateProcessUtilityStruct"
34: 75243b81 @!"kernel32!NtVdm64CreateProcessInternalW"
35: 751c8de0 @!"kernel32!NtWow64CsrBasepCreateProcess"
36: 751dad8f @!"kernel32!BasepSxsCreateProcessCsrMessage"
37: 751d5115 @!"kernel32!CsrBasepCreateProcess"
38: 751d3bf3 @!"kernel32!CreateProcessInternalW"
39: 751c1072 @!"kernel32!CreateProcessA"
40: 751c103d @!"kernel32!CreateProcessW"
41: 751da4b7 @!"kernel32!CreateProcessInternalA"
42: 770215ac @!"KERNELBASE!_imp__RtlpCreateProcessRegistryInfo"
43: 77026afc @!"KERNELBASE!NtWow64CsrBasepCreateProcess"
44: 77a7ffdc @!"ntdll!NtCreateProcessEx"
45: 77a80804 @!"ntdll!NtCreateProcess"
46: 77a980b7 @!"ntdll!RtlpCreateProcessRegistryInfo"
breakpoint 45 redefined
45: 77a80804 @!"ntdll!ZwCreateProcess"
breakpoint 44 redefined
44: 77a7ffdc @!"ntdll!ZwCreateProcessEx"
47: 77b01d35 @!"ntdll!RtlCreateProcessReflection"
48: 77b0e7ab @!"ntdll!RtlCreateProcessParameters"
49: 77aabd9b @!"ntdll!RtlCreateProcessParametersEx"

However, the inobject breakpoint type is harder to approximate, because you need a conditional breakpoint that would verify the value of the this parameter:

0:000> dv /i /V
prv param  0024fd50 @ebp+0x08 argc = 1
prv param  0024fd54 @ebp+0x0c argv = 0x00554b18
...
prv local  0024fcfc @ebp-0x4c v = class std::vector<float,std::allocator<float> >

0:000> bm *!std::vector<float* ".if (@ecx == 0024fcfc) {} .else {gc}"
  2: 01154e40 @!"myapp!std::vector<float,std::allocator<float> >::reserve"
  3: 011536d0 @!"myapp!std::vector<float,std::allocator<float> >::capacity"
  4: 01153720 @!"myapp!std::vector<float,std::allocator<float> >::clear"
...

0:000> g
eax=0024fc00 ebx=7efde000 ecx=0024fcfc edx=0024fcfc esi=0024fbec edi=0024fd3c
eip=011524a0 esp=0024fbe4 ebp=0024fd48 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
myapp!std::vector<float,std::allocator<float> >::push_back:
011524a0 55              push    ebp
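The condition in the bm command above is the whole trick: with the 32-bit Visual C++ thiscall convention the this pointer arrives in ecx, so comparing @ecx against the address of v narrows the hits down to one instance. The same filter can be modelled in plain C++, which may help when reasoning about what the debugger command does. This is only an analogy, and Probe, watched and hits are hypothetical names.

```cpp
#include <cassert>

// A class whose method reports only calls made on one chosen instance,
// mirroring ".if (@ecx == <address of v>) {} .else {gc}".
struct Probe {
    static const Probe* watched;  // plays the role of the address in the condition
    static int hits;              // how many times the "breakpoint" would stop

    void touch() const {
        if (this == watched) {
            ++hits;  // the debugger would break here
        }
        // Otherwise: the equivalent of gc, carry on silently.
    }
};

const Probe* Probe::watched = nullptr;
int Probe::hits = 0;
```

Calling touch() on several objects increments hits only for the watched one, just as the conditional breakpoint stops only when push_back runs on v.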

One disadvantage of the WinDbg way is that bm creates a separate breakpoint for every matching symbol, so you later have to remove them by number (one by one, or as a range of IDs with bc), while in DBX each command you issued is translated to a single “unified” breakpoint that you can remove.

Visual Studio
Visual Studio does not have a built-in way to configure a breakpoint on a particular object or all the methods of a particular class. However, this is one example of a situation where Visual Studio macros may come in handy.

Here is one solution, from stackoverflow’s Richard Szalay:

Sub AddBreakpointInAllClassMembers()
  Dim debugger As EnvDTE.Debugger = DTE.Debugger
  ' Locate the class that contains the caret in the active document.
  Dim sel As EnvDTE.TextSelection = DTE.ActiveDocument.Selection
  Dim editPoint As EnvDTE.EditPoint = sel.ActivePoint.CreateEditPoint()
  Dim classElem As EnvDTE.CodeElement = _
      editPoint.CodeElement(vsCMElement.vsCMElementClass)

  If Not classElem Is Nothing Then
    ' Add a function breakpoint for every method of that class.
    For Each member As EnvDTE.CodeElement In classElem.Children
      If member.Kind = vsCMElement.vsCMElementFunction Then
        debugger.Breakpoints.Add(member.FullName)
      End If
    Next
  End If
End Sub

It has the obvious limitations of not working on arbitrary symbols and of requiring source code to actually set the breakpoints. If support for arbitrary symbols is desired, I suppose it’s possible to combine the .x Immediate Window command with a macro to set breakpoints automatically, similarly to WinDbg’s bm. The .x command performs a symbol lookup like the x command in WinDbg, for example:

.x myapp!*push_back
0x012724f0 myapp!std::vector<float,std::allocator<float> >::push_back

Clearly, it would not be difficult to add to the above macro support for breaking only when a specific object’s methods are called (e.g. by specifying a condition on this). For the specific situation when you want a single-method breakpoint on a specific object, the documentation specifies there is an easier workaround: if you want a breakpoint on the method MyClass::foo only when called on the instance x, then use the Immediate Window to find &x and then create a function breakpoint on

((MyClass*)0x1234abcd)->foo

Unfortunately, from my testing (with Visual Studio 2010 Ultimate), this simply doesn’t work. If any of you have ever seen this working, I’d love to hear about it :-)

DBX vs. Visual Studio and WinDbg: Part 1, Calling Functions

I’ve recently had an enlightening experience teaching a C++ Debugging course to a group of developers who are transitioning from a Solaris environment to Windows and Visual Studio. This hasn’t been an easy transition for them, and the course wasn’t easy to teach, specifically because one of the most discussed topics was feature parity between DBX (the debugger of choice for C++ applications on Solaris) and Visual Studio. Fortunately, the course focuses on WinDbg, which has alternatives to several debugging features that are inaccessible from Visual Studio, and this enabled me to address, at least partially, the pain points and missing features the developers ran into after leaving DBX.

Following the course, I decided to write a series of posts outlining the unique features of DBX and how they can be emulated using Visual Studio and WinDbg. The purpose is not to convert us all to loyal DBX users, but rather to see how some features we may never have considered are taken for granted on other platforms.

Some ideas I have for this blog series:

  • Calling a function [this post]
  • Configuring breakpoints
  • Tracing execution
  • Execution control
  • Displaying data, including STL collections
  • Runtime application checking
  • Miscellaneous commands

In this post we’ll discuss a fairly useful feature—calling a function in the middle of the debugging session. The function may belong to the current execution path, or any other library that is currently loaded into the process.

DBX
DBX makes calling a function from the middle of the debugging session rather easy with the “call” command, and supports virtual methods, static functions, and arbitrary parameters. You can even set a breakpoint in a function you call that way, and stop to examine the program’s state. For example:

(dbx) list
   11     std::map<int,std::vector<float> > m;
   12     std::vector<float> v;
   13     v.push_back(4.0f);
   14     v.push_back(5.0f);
   15     m[5] = v;
   16     global = 2;
   17     getchar();
   18   }
...
(dbx) next
stopped in main at line 16 in file "stl.cc"
   16     global = 2;
(dbx) stop inmember size
(3) stop inmember size
(dbx) print m[5].size()
stopped in std::vector<float, std::allocator, <float>void>::size at line 375 in file "stl_vector.h"
  375   size() const { return size_type(end() - begin()); }
dbx: Stopped within call to 'size'. Starting new command interpreter
(dbx) where
=>[1] std::vector<float, std::allocator, <float>void>::size(this = 0x8067a54), line 375 in "stl_vector.h"
  ---------- called from debugger ----------
  [2] main(), line 16 in "stl.cc"
(dbx) pop -c
dbx: Call to 'size' aborted. Going back to previous command interpreter

Visual Studio
The Visual Studio equivalent of this feature is, of course, the Immediate Window. Unfortunately, the limits of what does and what doesn’t work in the Immediate Window are rather vague, and experimentation is in order. First of all, you can call virtual methods, static methods, or whatever else you want on local variables, memory addresses, etc., and pass parameters to them. There are some limitations, e.g. on overloaded operators, but generally you can live with that. Here are some things you can expect to work:

v.capacity() - v.size()
2

m
[1]((5, [2](4.0000000,5.0000000)))
    [comp]: less
    [0]: (5, [2](4.0000000,5.0000000))

((std::vector<float,std::allocator<float> >*)(0x0046fb28))->reserve(50)
<void>

To call a function from another library, the strange {,,} context specification syntax needs to be used:

{,,msvcr100d}printf("Hello World!\n")
13

Neither I nor other bloggers I found on the web have been able to decipher the context specification syntax fully, other than the fact that the last part is used to specify the name of a DLL. The official documentation on this seems simply wrong.

Mixing and matching the context specification evaluator and local variable names does not work; neither does mixing a standard function call (like v.size() above) in an expression that uses a context specification.

Also contrary to the documentation, breakpoints that you set (or even DebugBreak calls) in the invoked function are not fired, so you cannot effectively inspect the executed code. An alternative would be moving the instruction pointer to the function you want to execute using the Set Next Statement command, and then retracting the instruction pointer—this is very fragile, but may work under certain circumstances.

All in all, the Immediate Window support for function invocation is rather half-baked.

WinDbg
On the surface, WinDbg has no support at all for calling arbitrary functions from the middle of the debugging session. Fortunately, there is a debugging extension called SDbgExt which has been very useful to me in the past. This extension makes available commands for loading DLLs dynamically and calling methods on them from the debugger session, namely the !remotecall command.

Calling a global function is rather easy with SDbgExt, but calling a member function on a local variable—something even the Immediate Window excels at—is not trivial, because you need to pass the this parameter to the method manually. However, breakpoints in called functions work seamlessly, because what SDbgExt does is create a thread that calls the function and returns a result. For example:

0:001> bm "myapp!std::vector<float,std::allocator<float> >::size"
  1: 01292900 @!"myapp!std::vector<float,std::allocator<float> >::size"

0:001> !remotecall
Usage: !remotecall <address> <call-conv> [arguments]
Calling conventions are specified as integral values: stdcall(0), cdecl(1), fastcall(2)

0:000> dv /t /v
0030f7d4 int argc = 1
0030f7d8 wchar_t ** argv = 0x001e84c0
...
0030f780 class std::vector<float,std::allocator<float> > v ...

0:000> bl
1 e 01292900 [c:\program files (x86)\microsoft visual studio 10.0\vc\include\vector @ 878]    0001 (0001)  0:**** myapp!std::vector<float,std::allocator<float> >::size

0:000> !remotecall 01292900 0 0030f780
myapp!std::vector<float,std::allocator<float> >::size() will be run when execution is resumed

0:000> g
...
myapp!std::vector<float,std::allocator<float> >::size() [conv=0 argc=4 argv=00030488]
Breakpoint 1 hit
...
myapp!std::vector<float,std::allocator<float> >::size:
01292900 55              push    ebp

0:002> k
ChildEBP RetAddr 
00b6f974 00f01052 myapp!std::vector<float,std::allocator<float> >::size [c:\program files (x86)\microsoft visual studio 10.0\vc\include\vector @ 878]
WARNING: Stack unwind information not available. Following frames may be wrong.
00b6f9f0 0003046a sdbgext+0x1052
00b6f9f4 00030000 0x3046a
00b6f9f8 00000000 0x30000
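The need to pass this by hand follows from how member functions are called: on common implementations a non-static member function is, at the machine level, an ordinary function with a hidden pointer argument. The sketch below makes that argument explicit with a small trampoline, which is roughly the shape of call that has to be assembled manually for !remotecall. Buffer, call_size, RawCall and invoke_raw are illustrative assumptions and are not part of SDbgExt.

```cpp
#include <cassert>
#include <cstddef>

struct Buffer {
    std::size_t count = 0;
    std::size_t size() const { return count; }
};

// A free function with an explicit "this" argument, the way a debugger
// sees the call: an address to jump to plus raw pointer-sized arguments.
std::size_t call_size(const void* self) {
    assert(self != nullptr);
    return static_cast<const Buffer*>(self)->size();
}

// What a remote-call helper has to work with: a function address and one
// untyped argument.
using RawCall = std::size_t (*)(const void*);

std::size_t invoke_raw(RawCall fn, const void* arg) {
    assert(fn != nullptr);
    return fn(arg);
}
```

Here invoke_raw(&call_size, &b) plays the part of the !remotecall line above, with the function address first and the object's address as the argument that becomes this.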

Although it isn’t perfect, and definitely not as convenient as DBX’s call command, !remotecall is a useful and predictable alternative.

Microsoft MVP, Third Time’s a Charm

I just received my renomination letter for the Microsoft MVP award for Visual C#, 2011. This is the third year in a row, and what a blast these years have been!

I firmly believe that 2011 could be an amazing year for C#, what with the introduction of async methods in C# 5, which is a clear demonstration of how frameworks and fluent interfaces pale in comparison to the convenience and power of language keywords.

This time I am humbly receiving the MVP award as the CTO of Sela Group, where I am constantly amazed by the number and talent of technical folks who are taking us to new challenges in Israel and abroad. So in conclusion, I’d like to thank some good friends and colleagues:

  • David Bassa and Ishai Ram, my managers and friends at Sela, who work with me at easy times and hard, and let me pursue whatever direction I consider important at the time;
  • Guy Burstein, who leads the local Microsoft community by example and continues to be, after a few years with DPE, a role model of a technical evangelist;
  • My colleagues, Sela’s top technical experts—too numerous to list here, who never fail to surprise with yet another area of expertise and yet another great answer.

Finally, to my readers—thanks, and stay tuned for 2011, 2012, and many more years to come :-)