All Your Base Are Belong To Us

Mostly .NET internals and other kinds of gory details

May 2010 - Posts

Transactional Workflows: Suspend-Enqueue-Unload-Resume Done Correctly on Second-Phase Commit

More than two years ago I visited the subject of transactionally delivering a message to a workflow, making sure that the transaction does not commit until the message has been delivered and the workflow has been persisted under the same transaction. This subject has also been covered fairly well in this MSDN Forums thread.

As a quick reminder, if you want to transactionally deliver a message to a workflow, you need to follow these steps:

  • Suspend the workflow instance
  • Enqueue the message to the workflow’s queue
  • Unload the workflow, thus persisting it within the context of the ambient transaction
  • Resume the workflow

This last step must not execute under the same transaction, at least when using WF 3.5 and the default SQL persistence service, or else you would encounter an SQL timeout exception in the stored procedure that attempts to update the workflow state when resuming it. This code will not work:

WorkflowInstance instance = …;
using (TransactionScope tx = new TransactionScope())
{
    instance.Suspend("Suspending to enqueue work");
    instance.EnqueueItem("MyQueue", "Hello World", null, null);
    instance.Unload();

    // Fails: Resume runs under the same ambient transaction.
    instance.Resume();

    tx.Complete();
}

If you run this sample with a default SQL persistence service, you’ll receive an exception:

Unhandled Exception: System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
   at System.Data.SqlClient.SqlConnection.OnError
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning
   at System.Data.SqlClient.TdsParser.Run
   at System.Data.SqlClient.SqlDataReader.ConsumeMetaData
   at System.Data.SqlClient.SqlDataReader.get_MetaData
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader 
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds
   at System.Data.SqlClient.SqlCommand.RunExecuteReader 
   at System.Data.SqlClient.SqlCommand.RunExecuteReader
   at System.Data.SqlClient.SqlCommand.ExecuteReader 
   at System.Data.SqlClient.SqlCommand.ExecuteDbDataReader 
   at System.Workflow.Runtime.Hosting.PersistenceDBAccessor.RetrieveStateFromDB
   at System.Workflow.Runtime.Hosting.PersistenceDBAccessor.RetrieveInstanceState
   at System.Workflow.Runtime.Hosting.SqlWorkflowPersistenceService.LoadWorkflowInstanceState
   at System.Workflow.Runtime.WorkflowRuntime.InitializeExecutor 
   at System.Workflow.Runtime.WorkflowRuntime.Load 
   at System.Workflow.Runtime.WorkflowInstance.Resume

To address this, you might be tempted to write something along the following lines:

using (TransactionScope tx = new TransactionScope())
{
    instance.Suspend("Suspending to enqueue work");
    instance.EnqueueItem("MyQueue", "Hello World", null, null);
    instance.Unload();

    // Suppressing the ambient transaction does not help here.
    using (new TransactionScope(TransactionScopeOption.Suppress))
    {
        instance.Resume();
    }

    tx.Complete();
}

However, this merely makes the ambient transaction wait until the Resume operation returns, while the Resume operation in turn requires the locks acquired by the Suspend operation. In other words, you just created a deadlock, and you’ll get exactly the same exception.

The only appropriate way to resume the workflow is after the transaction completes. To be precise, if the transaction aborts, there’s no reason for you to resume the workflow because the Suspend operation would be rolled back as well (you can easily test this). If the transaction commits, however, you want to resume the workflow so that it can process the work item. Here’s how:

using (TransactionScope tx = new TransactionScope())
{
    instance.Suspend("Suspending to enqueue work");
    instance.EnqueueItem("MyQueue", "Hello World", null, null);
    instance.Unload();

    // Resume only after the transaction has actually committed.
    Transaction.Current.TransactionCompleted += (o, e) =>
    {
        if (e.Transaction.TransactionInformation.Status == TransactionStatus.Committed)
        {
            instance.Resume();
        }
    };

    tx.Complete();
}

This works, and achieves the desired effect—if the transaction commits, the workflow is resumed. There is, unfortunately, a window of opportunity for failure between the time that the transaction commits and the time the Resume operation executes. If during the Resume operation there is a system error, the transaction would be deemed complete but the workflow would remain in a suspended state. Fortunately, this situation can be addressed by resuming workflows automatically (e.g. using filtering by suspension reason) when the system restarts.
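Here is a minimal sketch of what such a restart-time sweep might look like. The class and method names (SuspendedWorkflowRecovery, ShouldResume, ResumeAll) are made up for illustration, and the sketch assumes that the description reported by the persistence service matches the reason string that was originally passed to Suspend, which you should verify in your own environment before relying on it.

```csharp
using System;
using System.Workflow.Runtime;
using System.Workflow.Runtime.Hosting;

// Hypothetical helper, not part of WF: resumes workflows that were left
// suspended by the enqueue sequence when the process died before Resume ran.
static class SuspendedWorkflowRecovery
{
    // Assumption: the same reason string that was passed to Suspend.
    public const string EnqueueReason = "Suspending to enqueue work";

    // True if a persisted instance looks like one we suspended ourselves.
    public static bool ShouldResume(WorkflowStatus status, string suspendReason)
    {
        return status == WorkflowStatus.Suspended
            && string.Equals(suspendReason, EnqueueReason, StringComparison.Ordinal);
    }

    // Call once at startup, after the runtime and persistence service are running.
    // Returns the number of workflows that were resumed.
    public static int ResumeAll(WorkflowRuntime runtime,
                                SqlWorkflowPersistenceService persistence)
    {
        int resumed = 0;
        foreach (SqlPersistenceWorkflowInstanceDescription description
                 in persistence.GetAllWorkflows())
        {
            if (ShouldResume(description.Status,
                             description.SuspendOrTerminateDescription))
            {
                runtime.GetWorkflow(description.WorkflowInstanceId).Resume();
                resumed++;
            }
        }
        return resumed;
    }
}
```

Filtering by reason keeps the sweep from resuming workflows that were suspended deliberately for some other purpose.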

Transaction Flows Across Reentrant AppDomain Calls

A few weeks ago I blogged about flowing a transaction across AppDomain boundaries. Apparently, there’s a little gotcha that you need to be aware of: Even if you don’t flow the transaction to the other AppDomain, and the thread in the other AppDomain executes a method that calls back into the original AppDomain, the transaction will flow across the reentrant call.

In other words, if you have two objects, A and B, where A lives in one AppDomain and B lives in another AppDomain, then the following can happen: A calls a method of B within a transaction, but without flowing the transaction. The method of B executes without an ambient transaction. During the execution of that method, B calls a method of A without flowing the transaction. Nonetheless, the reentrant method of A executes within the same ambient transaction (which is not promoted to a distributed transaction).

This is actually quite reasonable, when you think about the way .NET Remoting works. The outgoing proxy remembers the call context that was used, and the incoming stub on the reentrant call restores the ambient transaction from that call context. Still, it can be quite misleading to see a completely different method of your class execute under the same ambient transaction as another method that never called it directly.
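If you want to observe the scenario yourself, the following is a rough sketch for the .NET Framework (it needs a reference to System.Transactions). The member names Run, CallBack, Reenter and the Ambient helper are made up for this illustration. The program only reports, at each hop, whether an ambient transaction is visible; I’m deliberately not listing an expected output, so run it and compare with the description above.

```csharp
using System;
using System.Transactions;

// Hypothetical helper: reports whether an ambient transaction is visible.
static class Ambient
{
    public static void Report(string where)
    {
        Console.WriteLine("{0} in AppDomain '{1}': ambient transaction present = {2}",
            where, AppDomain.CurrentDomain.FriendlyName, Transaction.Current != null);
    }
}

// A lives in the default AppDomain.
class A : MarshalByRefObject
{
    public void Run(B b)
    {
        using (TransactionScope tx = new TransactionScope())
        {
            Ambient.Report("A.Run");
            b.CallBack(this);   // the transaction is not flowed explicitly
            tx.Complete();
        }
    }

    // Invoked by B from the other AppDomain (the reentrant call).
    public void Reenter()
    {
        Ambient.Report("A.Reenter");
    }
}

// B lives in the second AppDomain.
class B : MarshalByRefObject
{
    public void CallBack(A a)
    {
        Ambient.Report("B.CallBack");
        a.Reenter();            // calls back into the original AppDomain
    }
}

class Program
{
    static void Main()
    {
        AppDomain other = AppDomain.CreateDomain("Other");
        B b = (B)other.CreateInstanceAndUnwrap(
            typeof(B).Assembly.FullName, typeof(B).FullName);

        new A().Run(b);

        AppDomain.Unload(other);
    }
}
```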

Using NetModules for Reducing Compilation Costs

A relatively obscure feature of .NET deployment is the concept of a netmodule. An assembly can contain multiple netmodules, which can be compiled separately and relinked into the assembly without recompiling other netmodules that constitute the assembly. Effectively, netmodules form a linkage boundary for managed assemblies.

The Web is full of old samples from the pre-Whidbey era that show how to use netmodules, and I got an email with a question on the subject a couple of days ago, so here’s a trivial example that shows how you can use netmodules to reduce compilation costs. First of all, we have a few source files, let’s call them A, B, and C:

//A.cs
internal class A { public static void a() { } }

//B.cs
internal class B { public static void b() { } }

//C.cs
class C {
    static void Main() {
        A.a(); B.b();
    }
}

What we’d like to do is compile A.cs and B.cs into separate netmodules, and link them together with C.cs into a single executable program. Should A.cs ever change, we would like the ability to relink the resulting executable without having to recompile B.cs—and that’s one thing netmodules can do for you.

Unfortunately, there’s no easy way to use netmodules from Visual Studio, so I wrote you a makefile:

A.netmodule: A.cs
    csc /t:module /out:A.netmodule A.cs

B.netmodule: B.cs
    csc /t:module /out:B.netmodule B.cs

all: C.exe

C.exe: A.netmodule B.netmodule C.cs
    csc /t:exe /out:C.exe /addmodule:A.netmodule /addmodule:B.netmodule C.cs

clean:
    del A.netmodule
    del B.netmodule
    del C.exe

This makefile can incrementally generate A.netmodule and B.netmodule from A.cs and B.cs respectively, and link them with C.cs into C.exe when necessary. If only A.cs changes, there’s no need to recompile B.cs; if only C.cs changes, there’s no need to recompile A.cs or B.cs.

You can try this out yourself by creating the three files (A.cs, B.cs, C.cs) anywhere on your file system and then creating a makefile with the above contents. Next, open a Visual Studio command prompt, navigate to that directory, and run nmake all.
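To check which modules actually ended up in an assembly, you can enumerate them with reflection. The sketch below uses a made-up helper class, ModuleLister. Compiled on its own it reports only the single module of its own assembly; to see the netmodules from the example, you would call ModuleNames from within C.exe (for instance from C.Main, passing typeof(C).Assembly).

```csharp
using System;
using System.Reflection;

// Hypothetical helper: lists the file names of all modules in an assembly.
static class ModuleLister
{
    public static string[] ModuleNames(Assembly assembly)
    {
        // GetModules returns every module that is part of the assembly,
        // including the one that holds the manifest.
        Module[] modules = assembly.GetModules();
        string[] names = new string[modules.Length];
        for (int i = 0; i < modules.Length; i++)
        {
            names[i] = modules[i].Name;
        }
        return names;
    }

    static void Main()
    {
        foreach (string name in ModuleNames(typeof(ModuleLister).Assembly))
        {
            Console.WriteLine(name);
        }
    }
}
```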

Assembly Private Bin Path Pitfall

When you configure a .NET AppDomain, one of the things you can set is the private probing path (relative to that AppDomain’s base directory) that will be used by the CLR loader when trying to load an assembly in that AppDomain. With that in mind, the following code ought to work, provided that the SecondAssembly assembly is indeed located in the PrivatePath subdirectory:

 using System;
 using System.IO;
 using System.Reflection;

 class InOtherAppDomain : MarshalByRefObject
 {
   public void SomeMethod()
   {
     AppDomain thisAD = AppDomain.CurrentDomain;
 
     Console.WriteLine(thisAD.BaseDirectory);
     Console.WriteLine(thisAD.RelativeSearchPath);
 
     string privateBinDirectory =
       thisAD.BaseDirectory + @"\" + thisAD.RelativeSearchPath;
     Console.WriteLine(privateBinDirectory);
 
     Console.WriteLine(File.Exists(
       Path.Combine(privateBinDirectory, "SecondAssembly.dll" )));
 
     try 
     {
       Assembly.Load("SecondAssembly" );
       Console.WriteLine("Assembly loaded successfully" );
     }
     catch (Exception ex)
     {
       Console.WriteLine("Error loading assembly: " + ex.Message);
     }
   }
 }
 
 class Program 
 {
   static void Main(string[] args)
   {
     AppDomainSetup setup = new AppDomainSetup();
     setup.ApplicationBase = AppDomain.CurrentDomain.BaseDirectory;
     setup.PrivateBinPath = @"\PrivatePath";
 
     AppDomain secondAD = AppDomain.CreateDomain(
       "SecondAD", AppDomain.CurrentDomain.Evidence, setup);
 
     InOtherAppDomain obj = (InOtherAppDomain)
       secondAD.CreateInstanceAndUnwrap(
         typeof(InOtherAppDomain).Assembly.FullName,
         typeof(InOtherAppDomain).FullName);
 
     obj.SomeMethod();
   }
 }
 

Well, this program’s output is in fact the following:

D:\Development\Scratch\AssemblyPrivateBinPath\AssemblyPrivateBinPath\bin\Debug\
\PrivatePath
D:\Development\Scratch\AssemblyPrivateBinPath\AssemblyPrivateBinPath\bin\Debug\\\PrivatePath
True
Error loading assembly: Could not load file or assembly 'SecondAssembly' or one of its dependencies. The system cannot find the file specified.

Hmm. The secondary AppDomain’s base directory looks all right, as does the private probing path. The combined directory looks weird (the three slashes) but File.Exists says the assembly is there, so what’s with the exception?

If you examine the FusionLog property of the FileNotFoundException, you’ll find the following text (edited for brevity):

=== Pre-bind state information ===
LOG: User = Sasha-PC\Sasha
LOG: DisplayName = SecondAssembly
(Partial)
. . .
LOG: Appbase = file:///D:/Dev/AssemblyPrivateBinPath/bin/Debug/
LOG: Initial PrivatePath = \PrivatePath
Calling assembly : AssemblyPrivateBinPath, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null.
===
LOG: This bind starts in default load context.
LOG: No application configuration file found.
LOG: Using host configuration file:
LOG: Using machine configuration file from C:\Windows\Microsoft.NET\Framework\v4.0.30319\config\machine.config.
LOG: Policy not being applied to reference at this time (private, custom, partial, or location-based assembly bind).
LOG: Attempting download of new URL file:///D:/Dev/AssemblyPrivateBinPath/bin/Debug/SecondAssembly.DLL.
LOG: Attempting download of new URL file:///D:/Dev/AssemblyPrivateBinPath/bin/Debug/SecondAssembly/SecondAssembly.DLL.
LOG: Attempting download of new URL file:///D:/Dev/AssemblyPrivateBinPath/bin/Debug/SecondAssembly.EXE.
LOG: Attempting download of new URL file:///D:/Dev/AssemblyPrivateBinPath/bin/Debug/SecondAssembly/SecondAssembly.EXE.
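For reference, a log like the one above can be captured straight from the exception. This is a minimal sketch for the .NET Framework; FusionLogDemo and TryLoad are made-up names, and the FusionLog property may well be empty or null if the runtime has no binding information to report.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper: tries to load an assembly by name and returns the
// loader's bind log if the assembly cannot be found.
static class FusionLogDemo
{
    // Returns null when the assembly loads successfully.
    public static string TryLoad(string assemblyName)
    {
        try
        {
            Assembly.Load(assemblyName);
            return null;
        }
        catch (FileNotFoundException ex)
        {
            // FusionLog describes where the loader looked for the assembly.
            return ex.FusionLog ?? "(no bind log available)";
        }
    }

    static void Main()
    {
        string log = TryLoad("SecondAssembly");
        Console.WriteLine(log ?? "Assembly loaded successfully");
    }
}
```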

Hmm. It’s not even looking in the PrivatePath subdirectory—it’s doing the default bind and nothing else. Well, there might be some additional information and the only way to get it is to enable the Fusion log.

There are several ways to enable the Fusion log; the easiest one is probably the Assembly Binding Log Viewer (run fuslogvw.exe from an elevated Visual Studio command prompt). After enabling it and running the application again, I got the following bind information in the log output for SecondAssembly (edited for brevity):

*** Assembly Binder Log Entry  (5/6/2010 @ 3:32:20 PM) ***
The operation failed.
Bind result: hr = 0x80070002. The system cannot find the file specified.
. . . (same as earlier)
. . .
WRN: Not probing location file:///D:/PrivatePath/SecondAssembly.DLL, because the location falls outside of the appbase.
WRN: Not probing location file:///D:/PrivatePath/SecondAssembly/SecondAssembly.DLL, because the location falls outside of the appbase.
WRN: Not probing location file:///D:/PrivatePath/SecondAssembly.EXE, because the location falls outside of the appbase.
WRN: Not probing location file:///D:/PrivatePath/SecondAssembly/SecondAssembly.EXE, because the location falls outside of the appbase.
LOG: Attempting download of new URL file:///D:/Dev/AssemblyPrivateBinPath/bin/Debug/SecondAssembly.DLL.
LOG: Attempting download of new URL file:///D:/Dev/AssemblyPrivateBinPath/bin/Debug/SecondAssembly/SecondAssembly.DLL.
LOG: Attempting download of new URL file:///D:/Dev/AssemblyPrivateBinPath/bin/Debug/SecondAssembly.EXE.
LOG: Attempting download of new URL file:///D:/Dev/AssemblyPrivateBinPath/bin/Debug/SecondAssembly/SecondAssembly.EXE.
LOG: All probing URLs attempted and failed.

Oh boy! It’s looking at D:\PrivatePath, which is outside the AppDomain’s base directory, and so it can’t be probed… But why would it go there if the directory is relative to the AppDomain’s base anyway??

Well, actually, I cheated here a little bit. When initializing privateBinDirectory in the above program, I didn’t use Path.Combine, but used direct string concatenation. And the reason is that if I used Path.Combine, the resulting combined directory would be . . . (dramatic suspense)

. . . (still dramatic suspense)

. . . (there could be more dramatic suspense here, but modern displays are really big and it’s really easy to scroll nowadays. Oh well, let’s give it away . . .)

\PrivatePath

Yes, that’s right, \PrivatePath. Hence the exception: if there’s a leading backslash at the beginning of the private probing path, the combined path is treated as an absolute path on the current volume, which is highly likely to fall outside the AppDomain’s base directory and cause this exception.
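The Path.Combine behavior is easy to reproduce in isolation. In the sketch below, the base directory is invented and ProbingPathDemo and Resolve are made-up names; the outputs noted in the comments apply to Windows, where a leading backslash makes a path rooted. It demonstrates Path.Combine only and makes no claim about how the loader combines paths internally.

```csharp
using System;
using System.IO;

// Hypothetical helper that combines an AppDomain base directory with a
// private bin path, the way the post describes doing with Path.Combine.
static class ProbingPathDemo
{
    public static string Resolve(string appBase, string privateBinPath)
    {
        // Path.Combine returns the second argument unchanged when it is rooted.
        return Path.Combine(appBase, privateBinPath);
    }

    static void Main()
    {
        string appBase = @"D:\App\bin\Debug\";   // made-up base directory

        // Leading backslash: rooted on Windows, so the base is discarded.
        Console.WriteLine(Resolve(appBase, @"\PrivatePath"));  // on Windows: \PrivatePath

        // No leading backslash: stays relative to the base directory.
        Console.WriteLine(Resolve(appBase, "PrivatePath"));    // on Windows: D:\App\bin\Debug\PrivatePath
    }
}
```

In the program above, the corresponding fix is to set setup.PrivateBinPath to "PrivatePath", without the leading backslash, so that the probing path is once again relative to the AppDomain’s base directory.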