April 2010 - Posts
Earlier this week I had to propagate a System.Transactions transaction across .NET AppDomains. If you use a TransactionScope and rely on the ambient transaction propagation, you’ll find that the transaction does not cross the AppDomain boundary. This actually makes sense from an isolation perspective—transactions represent an isolation boundary and have their own failure semantics, while AppDomains represent another isolation boundary and have other failure semantics.
There are several ways to propagate a transaction across AppDomains, but bear in mind that regardless of the method you choose, the transaction will be promoted to a distributed transaction when it flows across AppDomains. This means that if you’re relying on a lightweight resource manager that can’t handle distributed transactions, you’ll have to seek other workarounds.
The easiest way to pass a transaction across the AppDomain boundary is to pass the Transaction instance to the method that resides in another AppDomain. In the other AppDomain, you can create a new TransactionScope and instruct it to use that transaction.
Here’s an example of a method that uses this technique inside another AppDomain:
using System;
using System.Transactions;

class RunsInAnotherAppDomain : MarshalByRefObject
{
    public void Method1(Transaction txToUse)
    {
        // Make the passed-in transaction the ambient transaction inside this scope.
        using (TransactionScope scope = new TransactionScope(txToUse))
        {
            TransactionInformation txInfo = Transaction.Current.TransactionInformation;
            Console.WriteLine("\tSecondary AppDomain:");
            Console.WriteLine("\t\tLocal identifier: " + txInfo.LocalIdentifier);
            Console.WriteLine("\t\tDistributed identifier: " + txInfo.DistributedIdentifier);
            // Abort the transaction from inside the secondary AppDomain.
            Transaction.Current.Rollback();
        }
    }
}
Another way is to retrieve a transaction propagation token, which is a byte[], and pass it to the method that resides in another AppDomain. In the first AppDomain you’ll have to use the TransactionInterop.GetTransmitterPropagationToken method and in the second AppDomain you’ll have to use the TransactionInterop.GetTransactionFromTransmitterPropagationToken method.
Here’s an example of a method that uses a transaction propagation token inside another AppDomain:
class RunsInAnotherAppDomain : MarshalByRefObject
{
    public void Method2(byte[] txToken)
    {
        // Rebuild a Transaction from the propagation token created in the first AppDomain.
        Transaction txToUse =
            TransactionInterop.GetTransactionFromTransmitterPropagationToken(txToken);
        using (TransactionScope scope = new TransactionScope(txToUse))
        {
            TransactionInformation txInfo = Transaction.Current.TransactionInformation;
            Console.WriteLine("\tSecondary AppDomain:");
            Console.WriteLine("\t\tLocal identifier: " + txInfo.LocalIdentifier);
            Console.WriteLine("\t\tDistributed identifier: " + txInfo.DistributedIdentifier);
            // Abort the transaction from inside the secondary AppDomain.
            Transaction.Current.Rollback();
        }
    }
}
Finally, you can rely on WCF to pass the transaction for you—if your cross-AppDomain communication does not use Remoting (MarshalByRefObject) but uses WCF, you can configure WCF transaction flow and use that to promote and propagate your transaction.
You can download the complete example (a Visual Studio 2010 solution) here. This is the sample output with two transactions that are propagated to the secondary AppDomain and aborted inside it. (As you can see, the distributed transaction identifier appears after the transaction is passed to the secondary AppDomain.)
Passing Transaction object:
    Main AppDomain, before cross-domain call:
        Local identifier: f31f3e8b-c78d-43a9-8642-28c987bfeb3a:1
        Distributed identifier: 00000000-0000-0000-0000-000000000000
    Secondary AppDomain:
        Local identifier: f0ac61d7-430d-481c-a559-544f9d0de6d7:1
        Distributed identifier: 02700344-be0e-404b-bab3-cfb5926301e7
    Main AppDomain, after cross-domain call:
        Local identifier: f31f3e8b-c78d-43a9-8642-28c987bfeb3a:1
        Distributed identifier: 02700344-be0e-404b-bab3-cfb5926301e7
The transaction has aborted.

Passing transaction propagation token:
    Main AppDomain, before cross-domain call:
        Local identifier: f31f3e8b-c78d-43a9-8642-28c987bfeb3a:2
        Distributed identifier: 00000000-0000-0000-0000-000000000000
    Secondary AppDomain:
        Local identifier: f0ac61d7-430d-481c-a559-544f9d0de6d7:2
        Distributed identifier: 5a182389-9d9b-4b91-a1f4-12af1a454586
    Main AppDomain, after cross-domain call:
        Local identifier: f31f3e8b-c78d-43a9-8642-28c987bfeb3a:2
        Distributed identifier: 5a182389-9d9b-4b91-a1f4-12af1a454586
The transaction has aborted.
A few days ago I demonstrated a debugging scenario where one thread attempts to call a virtual function on an object whose destructor is running on another thread. The base destructor restores the virtual function table to that of the base class, which causes the other thread to call a pure virtual function.
The code that I used to reproduce this scenario is really simple, but it’s not (at least for me :-)) immediately evident from reading it that this bug is lurking in the shadows. Here’s the code, anyway:
#include <Windows.h>
#include <iostream>

class thread {
private:
    HANDLE h_;
    void* context_;

    static DWORD WINAPI thread_start(LPVOID p) {
        thread* pThis = reinterpret_cast<thread*>(p);
        pThis->work(); // virtual call; races with the destructor on the main thread
        return 0;
    }

public:
    void start(void* context = nullptr) {
        context_ = context;
        h_ = CreateThread(NULL, 0, thread_start, this, 0, NULL);
    }

    virtual ~thread() {
        // By the time this body runs, the vptr already points at thread's vtable.
        WaitForSingleObject(h_, INFINITE);
        CloseHandle(h_);
    }

protected:
    virtual void work() = 0;
    void* context() { return context_; }
};

class my_thread : public thread {
protected:
    virtual void work() {
        std::cout << "Inside the thread" << std::endl;
    }
};

int main(int argc, char* argv[]) {
    {
        my_thread t;
        t.start();
    } // t is destroyed here, possibly before the new thread has called work()
    return 0;
}
As you can see, the destructor of the my_thread object declared in the main thread might run (and restore the virtual function table to that of the base class, thread) before the static thread_start method has a chance to call the work virtual method. This causes the pure virtual function call.
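One possible way to avoid the race is sketched below. This is not the code from the demo: it uses the portable std::thread instead of the Win32 API, and the names worker and run_worker_once are made up for illustration. The idea is to stop dispatching through the vptr from the new thread at all, by storing the work as a callable member and joining in the destructor body, which runs before any member is destroyed.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <thread>
#include <utility>

// Hypothetical replacement for the thread/my_thread pair: the work is a
// callable member rather than a virtual method, so the new thread never
// goes through a vptr that a destructor may already have rewritten.
class worker {
public:
    explicit worker(std::function<void()> work) : work_(std::move(work)) {}

    worker(const worker&) = delete;
    worker& operator=(const worker&) = delete;

    void start() {
        assert(!t_.joinable()); // this sketch supports a single start() call
        t_ = std::thread([this] { work_(); });
    }

    // The destructor body runs before the members are destroyed, so work_
    // is still alive for as long as the thread may be using it.
    ~worker() {
        if (t_.joinable()) t_.join();
    }

private:
    std::function<void()> work_;
    std::thread t_;
};

// Runs one worker to completion and returns what it produced.
std::string run_worker_once() {
    std::string result;
    {
        worker w([&result] { result = "Inside the thread"; });
        w.start();
    } // ~worker joins here, so reading result afterwards is safe
    return result;
}
```

If you want to keep the inheritance-based design, an alternative is to have the most derived class wait for the thread in its own destructor, before the base destructor gets a chance to run.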
A quick post to (maybe) save some of you some troubleshooting time one day.
Executive summary: If your Visual Studio unit tests work when a debugger is attached and fail when there’s no debugger attached, make sure that the test is running the latest version of your code, and if it’s not, try turning off code coverage in your test run configuration.
During the last two weeks I wasted at least an accumulated 5 hours trying to understand why my Visual Studio unit tests fail even after I fixed the bug. In particular, I couldn’t wrap my head around why they fail when I run them without a debugger attached, and succeed spectacularly when running under a debugger.
What I eventually found is that my TestResults\…\Out directory, which contains the assemblies used by the test, has a stale version of my code in it—i.e., doesn’t contain the fixes I just made to address the failing unit tests. I figured this out by actually opening the assembly with Reflector and seeing, much to my horror, that my code changes haven’t made it into the binary.
Following Guy’s advice I ran mstest from the command line and it generated quite a few warnings about items that could not be deployed because they conflict with items that have been instrumented for code coverage. I turned off code coverage, and the problem went away.
How badly, on a scale of 1 to 10, do you like this error message?
Not the friendliest one to see, I bet, when you’re trying to make your way through a day without calling any pure virtual functions… Well, a pure virtual function call is not a rare thing to see in the C++ world, and there are numerous reasons for something like this to happen, not the least of which being calling a pure virtual method from the base class’ constructor. In this post we’re going to see something slightly less traditional. Let’s try to verify what’s going on here.
First, I captured a dump of the application at the moment this dialog was shown. On Vista and above it’s really easy using Task Manager; on earlier OS versions you can use cdb, ntsd, WinDbg, ADPlus—whatever suits you best.
Next, I opened the dump in my trusty old friend WinDbg. Just a couple of threads here:
0:000> ~*
. 0 Id: 1b40.c98 Suspend: 1 Teb: 7efdd000 Unfrozen
Start: DestructorsAndVirtualMethods!ILT+420(_mainCRTStartup) (013b11a9)
Priority: 0 Priority class: 32 Affinity: 3
1 Id: 1b40.140c Suspend: 1 Teb: 7efda000 Unfrozen
Start: DestructorsAndVirtualMethods!ILT+265(?thread_start@thread@@CGKPAX@Z) (013b110e)
Priority: 0 Priority class: 32 Affinity: 3
How about these two threads, then? Well, the main thread most certainly didn’t show me that nasty message box:
0:000> k
ChildEBP RetAddr
0020f9c8 757b0816 ntdll!NtWaitForSingleObject+0x15
0020fa34 76031184 KERNELBASE!WaitForSingleObjectEx+0x98
0020fa4c 76031138 kernel32!WaitForSingleObjectExImplementation+0x75
0020fa60 013b18ad kernel32!WaitForSingleObject+0x12
0020fb48 013b184b DestructorsAndVirtualMethods!thread::~thread+0x3d
0020fc28 013b15c5 DestructorsAndVirtualMethods!my_thread::~my_thread+0x2b
0020fd20 013b2c0f DestructorsAndVirtualMethods!main+0x65
0020fd70 013b2a3f DestructorsAndVirtualMethods!__tmainCRTStartup+0x1bf
0020fd78 76033677 DestructorsAndVirtualMethods!mainCRTStartup+0xf
0020fd84 77a19d72 kernel32!BaseThreadInitThunk+0xe
0020fdc4 77a19d45 ntdll!__RtlUserThreadStart+0x70
0020fddc 00000000 ntdll!_RtlUserThreadStart+0x1b
(It’s in the middle of my_thread’s destructor, which called thread’s destructor, which is waiting for something.) But the second thread most certainly did:
0:001> k
ChildEBP RetAddr
009d548c 76442674 user32!NtUserWaitMessage+0x15
009d54c8 7644288a user32!DialogBox2+0x222
009d54f4 7647f8d0 user32!InternalDialogBox+0xe5
009d55a8 7647fbac user32!SoftModalMessageBox+0x757
009d5700 7647fcaf user32!MessageBoxWorker+0x269
009d576c 7647fea5 user32!MessageBoxTimeoutW+0x52
009d578c 7647fee7 user32!MessageBoxExW+0x1b
009d57a8 5272a58c user32!MessageBoxW+0x18
009d5800 52724a2c MSVCR100D!__crtMessageBoxW+0x20c
009d7a6c 5272bcf0 MSVCR100D!__crtMessageWindowW+0x3bc
009dfb0c 52724662 MSVCR100D!_VCrtDbgReportW+0x8e0
009dfb2c 5272461b MSVCR100D!_CrtDbgReportWV+0x22
009dfb54 526589ee MSVCR100D!_CrtDbgReportW+0x2b
009dfd90 527290c5 MSVCR100D!_NMSG_WRITE+0x5e
009dfda0 013b1703 MSVCR100D!_purecall+0x25
009dfe80 76033677 DestructorsAndVirtualMethods!thread::thread_start+0x33
009dfe8c 77a19d72 kernel32!BaseThreadInitThunk+0xe
009dfecc 77a19d45 ntdll!__RtlUserThreadStart+0x70
009dfee4 00000000 ntdll!_RtlUserThreadStart+0x1b
Hmm. That’s too much of a coincidence—the main thread is inside the thread class destructor, and the secondary thread is inside the thread_start method… That method performs the pure virtual function call. Could the two threads be working on the same object?
Here’s the offending disassembly of the secondary thread:
013b16fb 8b4df8 mov ecx,dword ptr [ebp-8]
013b16fe 8b4204 mov eax,dword ptr [edx+4]
013b1701 ffd0 call eax
013b1703 3bf4 cmp esi,esp
So we should focus on EBP-8, and in there we find:
0:001> kn
# ChildEBP RetAddr
00 009d548c 76442674 user32!NtUserWaitMessage+0x15
01 009d54c8 7644288a user32!DialogBox2+0x222
02 009d54f4 7647f8d0 user32!InternalDialogBox+0xe5
03 009d55a8 7647fbac user32!SoftModalMessageBox+0x757
04 009d5700 7647fcaf user32!MessageBoxWorker+0x269
05 009d576c 7647fea5 user32!MessageBoxTimeoutW+0x52
06 009d578c 7647fee7 user32!MessageBoxExW+0x1b
07 009d57a8 5272a58c user32!MessageBoxW+0x18
08 009d5800 52724a2c MSVCR100D!__crtMessageBoxW+0x20c
09 009d7a6c 5272bcf0 MSVCR100D!__crtMessageWindowW+0x3bc
0a 009dfb0c 52724662 MSVCR100D!_VCrtDbgReportW+0x8e0
0b 009dfb2c 5272461b MSVCR100D!_CrtDbgReportWV+0x22
0c 009dfb54 526589ee MSVCR100D!_CrtDbgReportW+0x2b
0d 009dfd90 527290c5 MSVCR100D!_NMSG_WRITE+0x5e
0e 009dfda0 013b1703 MSVCR100D!_purecall+0x25
0f 009dfe80 76033677 DestructorsAndVirtualMethods!thread::thread_start+0x33
10 009dfe8c 77a19d72 kernel32!BaseThreadInitThunk+0xe
11 009dfecc 77a19d45 ntdll!__RtlUserThreadStart+0x70
12 009dfee4 00000000 ntdll!_RtlUserThreadStart+0x1b
0:001> .frame /r 0f
0f 009dfe80 76033677 DestructorsAndVirtualMethods!thread::thread_start+0x33
eax=00000000 ebx=0020fd04 ecx=00000000 edx=00000000 esi=009dfda8 edi=009dfe80
eip=013b1703 esp=009dfda8 ebp=009dfe80 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
DestructorsAndVirtualMethods!thread::thread_start+0x33:
013b1703 3bf4 cmp esi,esp
0:001> dd 009dfe80-0x8 L1
009dfe78 0020fd04
This guy is a thread object (actually, a my_thread object), so let’s take a look at it:
0:001> dt DestructorsAndVirtualMethods!thread 0020fd04
+0x000 __VFN_table : 0x013b785c
+0x004 h_ : 0x0000002c
+0x008 context_ : (null)
And what’s in that table over at offset 4?
0:001> dd 0x013b785c+0x4 L1
013b7860 013b11d1
0:001> ln 013b11d1
(013b11d1) DestructorsAndVirtualMethods!ILT+460(__purecall) | (013b11d6) DestructorsAndVirtualMethods!ILT+465(__RTC_CheckEsp)
Exact matches:
Yikes. That’s __purecall, our good friend. So how come we have a screwed up vtable in a perfectly legitimate object method? Let’s take a look at the main thread:
0:000> u DestructorsAndVirtualMethods!thread::~thread L20
DestructorsAndVirtualMethods!thread::~thread:
013b1870 55 push ebp
013b1871 8bec mov ebp,esp
013b1873 81eccc000000 sub esp,0CCh
013b1879 53 push ebx
013b187a 56 push esi
013b187b 57 push edi
013b187c 51 push ecx
013b187d 8dbd34ffffff lea edi,[ebp-0CCh]
013b1883 b933000000 mov ecx,33h
013b1888 b8cccccccc mov eax,0CCCCCCCCh
013b188d f3ab rep stos dword ptr es:[edi]
013b188f 59 pop ecx
013b1890 894df8 mov dword ptr [ebp-8],ecx
013b1893 8b45f8 mov eax,dword ptr [ebp-8]
013b1896 c7005c783b01 mov dword ptr [eax],offset DestructorsAndVirtualMethods!thread::`vftable' (013b785c)
013b189c 8bf4 mov esi,esp
013b189e 6aff push 0FFFFFFFFh
013b18a0 8b45f8 mov eax,dword ptr [ebp-8]
013b18a3 8b4804 mov ecx,dword ptr [eax+4]
013b18a6 51 push ecx
013b18a7 ff1550b23b01 call dword ptr [DestructorsAndVirtualMethods!_imp__WaitForSingleObject (013bb250)]
013b18ad 3bf4 cmp esi,esp
Do you see that huge prologue? One of the things it does is restore the vptr to the base class’ vtable! So while the destructor is running, the object has the base vptr which contains the pure virtual method. Voila.
If I lost you somewhere in the debugging spew above, here’s a quick recap:
- Thread 0 enters the object’s destructor. The derived destructor runs (in this case, it’s empty) and then the base destructor runs.
- The base destructor restores the vptr to point to the base class’ vtable, and enters a wait.
- Thread 1 attempts to call a virtual method that is pure virtual in the base class and overridden in the derived class. The vtable belongs to the base class, so there’s a __purecall entry in it that shows us the beautiful message box we started with.
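The vptr switch in the recap can also be observed without any threads. The sketch below is a single-threaded illustration with made-up names (base, derived, events, observe_dispatch), and it uses an ordinary virtual function rather than a pure one so that the behavior is well defined: the same call resolves to the derived override while the object is alive, and to the base version once the base destructor is running.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical types for illustration; they are not part of the demo program.
std::vector<std::string>& events() {
    static std::vector<std::string> e;
    return e;
}

struct base {
    virtual ~base() {
        // The derived part is already destroyed, so this resolves to base::name().
        events().push_back(name());
    }
    virtual std::string name() const { return "base"; }
    void report() { events().push_back(name()); }
};

struct derived : base {
    std::string name() const override { return "derived"; }
};

// Records which override answers before and during destruction.
std::vector<std::string> observe_dispatch() {
    events().clear();
    {
        derived d;
        d.report(); // object fully alive: the derived override runs
    } // ~derived runs, then ~base: the base version runs
    return events();
}

// Usage: the same object answers differently depending on when it is asked.
void check_dispatch() {
    std::vector<std::string> seen = observe_dispatch();
    assert(seen.size() == 2);
    assert(seen[0] == "derived");
    assert(seen[1] == "base");
}
```

In the crashing program the base version of work is pure, so the equivalent call from the second thread lands in __purecall instead.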
A completely different approach is what you’d do when you have the debugger and the source code handy. In that case, simply set a data breakpoint in Visual Studio (or WinDbg) on the address of the object’s vptr:
And then let the program run until it stops in the debugger—inspecting the disassembly immediately shows the same result we’ve seen in WinDbg:
Which gives away the culprit, again—the base class destructor restoring the base class vtable prior to entering the destructor code.
A few days ago I wrote about my sessions at the MCT Virtual Summit 2010. The session recordings have been made publicly available using LiveMeeting, so even if you didn’t attend the conference you can tune in at:
- 50150 and 50166: C# 3.0, Programming the .NET Framework 3.5 and a glimpse towards Parallel Programming in Visual Studio 2010
- 50153: .NET Performance
When clicking the links, enter your name and leave the recording key field blank, for example:
Next, fill in your email address and company name, and you’ll be taken to a screen similar to the following, where you can choose between downloading the session and streaming it live in Media Player or in the LiveMeeting web client:
And here’s a screenshot from one of the sessions. Yes, there’s a live webcam of yours truly throughout the session :-)

Our book, Introducing Windows 7 for Developers, has quite a few pages on how to interact with the Windows 7 Taskbar and Libraries using the Windows Shell interfaces.
Unfortunately, the section that introduces the Windows Shell and lists the primary shell COM interfaces has ended up on page 89, in the Libraries chapter (Chapter 4), even though it’s cross-referenced elsewhere as if it appears in Chapter 1. (I noticed the omission thanks to this review.)
If you were looking for that section to brush up your Windows Shell programming skills, it’s right there on page 89, titled “Working with the Shell Namespace”, and starting with the following paragraph:
Before we dive into the Shell Libraries programming model, we need to understand how the Windows Shell works. The Windows Shell is the main area of user interaction in Windows. The Shell includes user-facing Windows feature such as the Taskbar, the Start Menu, Windows Explorer windows, search results, and even less obvious windows such as the Control Panel. The shell is hosted in a process called explorer.exe, and most users recognize it as Windows Explorer.
One of the most cited usability problems is when the actual implementation response time is very different from the perceived and expected response time. In most cases, the implementation response time is slower than expected.
Yesterday I had an interesting observation to the contrary. As a developer, I understand what the “actual” implementation response time should be, because I understand the runtime complexity of an operation and can bring into consideration other factors such as network latency and server load. On the other hand, the “average user’s” perceived response time might be significantly lower or higher than the implementation response time.
In this case, I would feel uncomfortable with a certain application even if it works faster than it should, because my understanding of the implementation response time is different from the actual implementation.
One such example is the introduction of local caching to web GMail on mobile devices a couple of days ago. Composing a message with slow network connectivity or archiving a message with slow network connectivity is something I perceive to have a slow implementation, because I imagine that this operation requires a network round trip. However, thanks to local caching, the operation completes instantaneously. My immediate mental response was, “I probably did something wrong and that’s why it completed so quickly”. On the other hand, an “average webmail user” might have a different reaction, along the lines of, “at last!—GMail responds immediately to my commands, as it should have from day 1”.
I assume most of my readers are software developers. Did you encounter this kind of disparity using your own applications or someone else’s applications?
More Windows 7 videos that I recorded have been posted on Channel 9. This installment deals with Windows 7 shell libraries and Federated Search, a subject that I haven’t covered in depth on my blog. The videos feature an introduction to shell libraries and demonstrate how to integrate your application with Windows 7 libraries from managed and native code, how to register for library change notifications (if you care about changes to library contents), and how to use Federated Search to integrate your search provider into Windows Explorer.
[I don’t remember if I told you this in my last post, but all videos have an accompanying code download which is the precise version of the code that I used when recording these videos.]
Introduction to Windows 7 Libraries
Integrating with Shell Libraries, Part 1
Integrating with Shell Libraries, Part 2
Federated Search
Stay tuned for more! :-)
A couple of years ago I gave a cursory visit to the subject of diagnosing a Monitor-based deadlock in a managed application, and then a few months ago I demonstrated in greater detail how to locate the synchronization object your thread is waiting for using SOS.
[Side note: It’s always easy to see the list of currently owned Monitors (sync blocks) using the SOS !SyncBlk command, which also tells you which thread owns each synchronization object. The hard part is to find the synchronization object for which your thread is waiting if you don’t have any prior knowledge of this particular lock your application is using.]
It’s always good to have another way of doing the same thing, in case something doesn’t work out in the debugger. This post shows a third approach to detecting the synchronization object for which your thread is waiting.
After attaching to the process, use !Threads, !CLRStack, and !SyncBlk as usual to see the general picture. In this example, here are the threads (edited for brevity):
0:004> !Threads
ThreadCount: 3
...
ID OSID APT Exception
0 1 1320 MTA
2 2 fec MTA (Finalizer)
3 3 18d8 MTA (Threadpool Worker)
OK, so we have a few threads here: one is the main thread, another is the finalizer thread, and the third is a thread pool thread. Here’s the call stack for the application’s threads (edited for brevity, e.g. I removed the unmanaged threads):
0:004> ~* e !CLRStack
OS Thread Id: 0x1320 (0)
Child-SP RetAddr Call Site
000000000013ee10 000007fef7d8d502 LocksAndLocks.Program.Main()
OS Thread Id: 0xfec (2)
Failed to start stack walk: 80004005
OS Thread Id: 0x18d8 (3)
Child-SP RetAddr Call Site
000000001aecf510 000007fef6c72bbb …c__DisplayClass1.<Main>b__0(…)
000000001aecf560 000007fef6ce7411 System.Threading.ExecutionContext.Run(…)
000000001aecf5b0 000007fef6ce724f …PerformWaitCallbackInternal(…)
000000001aecf600 000007fef7d8d502 …PerformWaitCallback(…)
“Failed to start stack walk”—in other words, the finalizer thread is not currently executing any managed code. Not interesting.
And here are the sync blocks (edited for brevity):
0:004> !syncblk
SyncBlock Owning Thread SyncBlock Owner
0000000000c38408 3 0000000002683a98 System.String
What is this string anyway?
0:003> !do 0000000002683a98
Name: System.String
MethodTable: 000007fef6daec90
EEClass: 000007fef69bb038
Size: 36(0x24) bytes
(C:\Windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: Hello
…
This is a 64-bit process, so from looking at the managed call stacks it wouldn’t immediately become obvious that there’s a synchronization issue here. Fortunately, the unmanaged call stack comes to the rescue. Here’s thread 3:
0:003> kc
Call Site
ntdll!NtDelayExecution
KERNELBASE!SleepEx
mscorwks!EESleepEx
mscorwks!Thread::UserSleep
mscorwks!ThreadNative::Sleep
0x0
mscorlib_ni
mscorlib_ni
mscorlib_ni
mscorwks!CallDescrWorker
mscorwks!CallDescrWorkerWithHandler
mscorwks!DispatchCallDebuggerWrapper
mscorwks!DispatchCallNoEH
mscorwks!QueueUserWorkItemManagedCallback
mscorwks!Thread::DoADCallBack
mscorwks!SVR::gc_heap::make_heap_segment
mscorwks!AssemblySecurityDescriptor::GetZone
mscorwks!ThreadNative::KickOffThread
mscorwks!ManagedPerAppDomainTPCount::DispatchWorkItem
mscorwks!ThreadpoolMgr::WorkerThreadStart
Not particularly interesting, other than that it’s currently in a sleep call. It’s fairly easy to figure out the timeout by inspecting the managed IL:
0:003> !Name2EE *!LocksAndLocks.Program+<>c__DisplayClass1.<Main>b__0
…
Module: 000007ff000533d0 (LocksAndLocks.exe)
Token: 0x0000000006000004
MethodDesc: 000007ff00053a58
Name: LocksAndLocks.Program+<>c__DisplayClass1.<Main>b__0(System.Object)
JITTED Code Address: 000007ff001a02d0
0:003> !DumpIL 000007ff00053a58
ilAddr = 00000000011c2058
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldfld <>c__DisplayClass1::s
IL_0007: dup
IL_0008: stloc.0
IL_0009: call System.Threading.Monitor::Enter
IL_000e: nop
.try
{
IL_000f: nop
IL_0010: ldc.i4.m1
IL_0011: call System.Threading.Thread::Sleep
IL_0016: nop
IL_0017: nop
IL_0018: leave.s IL_0022
} // end .try
.finally
{
IL_001a: ldloc.0
IL_001b: call System.Threading.Monitor::Exit
IL_0020: nop
IL_0021: endfinally
} // end .finally
IL_0022: nop
IL_0023: ret
Note the ldc.i4.m1 instruction at IL_0010: the parameter passed to Thread.Sleep is -1, which is the infinite timeout. So this thread isn’t going anywhere for the time being, and from the !SyncBlk output we remember that it owns sync block 0000000000c38408, whose owner is a string at 0000000002683a98.
What about thread 0?
0:000> kc
Call Site
ntdll!NtWaitForMultipleObjects
KERNELBASE!WaitForMultipleObjectsEx
KERNEL32!WaitForMultipleObjectsExImplementation
mscorwks!WaitForMultipleObjectsEx_SO_TOLERANT
mscorwks!Thread::DoAppropriateAptStateWait
mscorwks!Thread::DoAppropriateWaitWorker
mscorwks!Thread::DoAppropriateWait
mscorwks!CLREvent::WaitEx
mscorwks!AwareLock::EnterEpilog
mscorwks!AwareLock::Enter
mscorwks!JIT_MonEnterWorker_Portable
0x0
mscorwks!CallDescrWorker
mscorwks!CallDescrWorkerWithHandler
mscorwks!MethodDesc::CallDescr
mscorwks!ClassLoader::RunMain
mscorwks!Assembly::ExecuteMainMethod
mscorwks!SystemDomain::ExecuteMainMethod
mscorwks!ExecuteEXE
mscorwks!CorExeMain
This is more interesting. This thread is currently waiting for a lock. This is where we deviate from previous examples. The kb command would give the culprit away because the address of the sync block is passed to the AwareLock::EnterEpilog method as a parameter. Instead, let’s examine the code that calls JIT_MonEnterWorker_Portable and try to figure it out from there:
0:000> !name2ee *!*Program.Main
…
Module: 000007ff000533d0 (LocksAndLocks.exe)
Token: 0x0000000006000001
MethodDesc: 000007ff00053990
Name: LocksAndLocks.Program.Main(System.String[])
JITTED Code Address: 000007ff001a0120
0:000> !u 000007ff001a0120
Normal JIT generated code
LocksAndLocks.Program.Main(System.String[])
Begin 000007ff001a0120, size 12f
…
…mov rax,qword ptr [rbp+8]
…mov rax,qword ptr [rax+8]
…mov qword ptr [rbp+30h],rax
…mov rax,qword ptr [rbp+30h]
…mov qword ptr [rbp+10h],rax
…mov rcx,qword ptr [rbp+30h]
…call mscorwks!JIT_MonEnter (000007fe`f7d8bc60)
…nop
…nop
OK, so the sync object’s address is passed in the RCX register. However, before it goes into the RCX register, which is probably cleared out by now, it’s fetched from the stack at RBP+30h. This is something we have a shot at recovering:
0:000> kn
# … Call Site
…
07 … mscorwks!CLREvent::WaitEx+0xbe
08 … mscorwks!AwareLock::EnterEpilog+0xc9
09 … mscorwks!AwareLock::Enter+0x72
0a … mscorwks!JIT_MonEnterWorker_Portable+0xf5
0b … 0x7ff`001a01fa
0c … mscorwks!CallDescrWorker+0x82
0d … mscorwks!CallDescrWorkerWithHandler+0xd3
0e … mscorwks!MethodDesc::CallDescr+0x24f
0f … mscorwks!ClassLoader::RunMain+0x22b
…
0:000> .frame /r 0b
0b 00000000`0013ee10 000007fe`f7d8d502 0x7ff`001a01fa
rax=0000000000c0e100 rbx=000007ff0005c050 rcx=0000000000c0e100
rdx=0000000000000001 rsi=000000000013f020 rdi=000000000013ee98
rip=000007ff001a01fa rsp=000000000013ee10 rbp=000000000013ee30
r8=0000000000c38420 r9=0000000000000000 r10=0000000000000000
r11=0000000000000202 r12=0000000000000001 r13=0000000000000000
r14=000000000000001d r15=0000000000000001
iopl=0 nv up ei pl zr na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000244
000007ff`001a01fa 90 nop
0:000> dq 000000000013ee30+0x30 L1
00000000`0013ee60 00000000`02683a98
Voila—this is the sync object we saw earlier in the !SyncBlk output (if you don’t believe me, scroll up and see for yourself). Now we know that thread 0 is waiting for this synchronization object that’s held by thread 3, and thread 3 is in an infinite sleep.
This concludes yet another way to determine the synchronization object for which your thread is waiting.
On Friday I presented two sessions at the MCT Virtual Summit 2010, a Microsoft event for Microsoft Certified Trainers and Educators. Both sessions were train-the-trainer presentations for the courses Sela has in the Microsoft Courseware Library, but I decided to spice things up a little bit by adding a couple of demos to each of them.
The first session was about the Sela courses 50150 and 50166—C# 3.0 and Programming the .NET Framework 3.5—as well as an introduction to Parallel Programming with Visual Studio 2010 and .NET 4.0. I showed a couple of demos from my Microsoft DevAcademy 4 session.
The second session was about the Sela course 50153—.NET Performance—as well as a couple of demos from this course dealing with local GC roots, subtle finalization problems, and references between generations—topics that this course exhaustively covers.
If you’re a summit attendee and were unable to join the session, you should be able to view the session recordings at the conference website (using LiveMeeting).
Thanks for attending, and if you have any questions about these courses or any other Sela courses (on the CWL or not, for that matter), feel free to ping me through the contact form or use the Sela website.
[Update: My session recordings are publicly available using LiveMeeting—first session, second session.]
I just received a letter notifying me that I was awarded the Microsoft MVP award for Visual C# for 2010. I’m very honored to receive the award (for the second time). Here’s to hoping that 2010 will be a good year for C# and anything else technological that I’ve been talking about on this blog, in online communities, at UG meetings, and at various conferences!
I would never have made it without the help and support of several people I would like to call out here, as well as many friends and colleagues not listed here:
- David Bassa, Erez Fliess, and Ishai Ram, my managers and friends at Sela whose endless patience, support, and willingness to cooperate with my craziness have been vital to my overall happiness and my technical achievements;
- Alon Fliess, who continues to be my role model as a technical leader, CTO, well-rounded guru, and good friend;
- Guy Burstein, a good friend from Microsoft who kept the Israeli developer community alive and kicking during the past few (difficult) years, and who is constantly coming up with great new ways to involve more and more developers in online and offline activities.
And last but not least, thank you, dear readers, for following me during the last year and for your emails and comments.