Workflow Services Limitations: Part 1 - External Transactions
In this open-ended series of posts, I would like to outline some limitations of the current Workflow Services implementation (.NET 3.5). The information presented here is the result of several months' work designing, prototyping and implementing an application framework on top of Workflow Services.
Please note: Some of these issues have been unofficially confirmed by Microsoft, while others are still pending, so you should treat all of this with a grain of salt. None of this is a substitute for the existing Microsoft documentation.
If you're looking for a general overview of what Workflow Services are about, the following resources might be useful:
Workflows and External Transactions
In our architecture, there is a distributed transaction that starts at a node outside the workflow. This node then talks to the workflow, which has a ReceiveActivity to handle the request.
We need to flow the transaction from the remote node to the workflow instance. WCF gives us the necessary infrastructure for flowing the transaction, but the ReceiveActivity does not. The code of the activity and the parameter bindings are executed outside the transaction, without any transaction context, regardless of the transaction that flows with the WCF service.
This is consistent with the MSDN documentation, which clearly states that it’s impossible for a workflow to participate in a transaction that was started outside the workflow. For example, the default implementation of the WorkflowCommitWorkBatchService explicitly removes any ambient transaction that does not originate from within the workflow.
We have tried several ways to work around this:
- Provide an IDispatchMessageInspector extension which checks incoming messages for the TransactionMessageProperty and sets the ambient transaction if a transaction is present. This doesn’t work because the ambient transaction created in this step doesn’t flow into the workflow. (It’s either explicitly removed or simply not flowed by the scheduler service.)
- Provide a custom workflow scheduler service which schedules a thread for activity execution but first sets the ambient transaction if necessary. This doesn’t work because the workflow runtime created by the WorkflowServiceHost has an explicit validation behavior in place (in the WorkflowRuntimeBehavior) which makes sure that the scheduling service is an internal class called SynchronizationContextWorkflowSchedulerService.
This has been an interesting exercise in extending WCF and workflow, but both attempts failed. The workflow runtime itself uses a TransactionScope to suppress any external transactions.
What's Possible?
The only supported way of flowing a transaction into the workflow instance is during the unload (persistence point) of the instance. Here is the sequence of steps necessary to perform it:
- Within a transaction, suspend the workflow instance.
- Within a transaction, enqueue a message on the workflow instance.
- Within a transaction, call Unload on the workflow instance. The workflow runtime will use the ambient transaction to persist and unload the workflow.
- Commit the transaction.
- Resume the workflow instance. (This happens outside the transaction.)
This guarantees that the persistence and unloading of the workflow instance occur within the external transaction. This is the perfect solution for durable messaging, because it can be used to transactionally enqueue an external message onto the workflow and persist the workflow, so that the message becomes durable.
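The sequence above can be sketched roughly as follows. This is a minimal illustration, not a tested implementation: the class and method names (DurableEnqueue, EnqueueWithinTransaction) are made up for this example, the queue name and message are supplied by the caller, and the sketch assumes that a persistence service is registered with the runtime and that this TransactionScope is the root of the transaction, so that the commit happens when the scope is disposed.

```csharp
using System;
using System.Transactions;
using System.Workflow.Runtime;

public static class DurableEnqueue
{
    // Hypothetical helper: the runtime, instance id, queue name and message
    // all come from the caller.
    public static void EnqueueWithinTransaction(
        WorkflowRuntime runtime, Guid instanceId, IComparable queueName, object message)
    {
        WorkflowInstance instance = runtime.GetWorkflow(instanceId);

        // Assumption: no transaction is already ambient here, so this scope is the root.
        using (TransactionScope scope = new TransactionScope())
        {
            // Step 1: suspend the instance so it does not process the message yet.
            instance.Suspend("Suspended for transactional enqueue");

            // Step 2: enqueue the message (no pending-work callback in this sketch).
            instance.EnqueueItem(queueName, message, null, null);

            // Step 3: persist and unload; requires a registered persistence service.
            instance.Unload();

            // Step 4: vote to commit; the commit itself happens when the scope is disposed.
            scope.Complete();
        }

        // Step 5: resume outside the transaction, after the commit.
        runtime.GetWorkflow(instanceId).Resume();
    }
}
```

If the transaction is flowed in from a caller rather than rooted here, the commit does not happen when this scope is disposed, so the Resume call would have to be deferred until the outer transaction has actually committed.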
Workaround and What's Impossible
However, this does not provide the mechanism necessary for performing the work inside the workflow within the external transaction. From the moment the workflow has been persisted, the transaction is irrelevant. The next time the workflow is loaded and continues execution, it no longer participates in the external transaction.
In view of this limitation, the only applicable workaround is to have the workflow instance initiate the transaction (reversing the transaction root and planting it inside the workflow instance).
For example, instead of having the external node open the transaction and flow it to the workflow instance, the external node can signal the workflow, outside a transaction, that work is pending. The workflow will then initiate a transaction and contact the external node within that transaction to pull whatever work is pending and execute the necessary operations. The cost of this solution is an extra roundtrip to signal that work is pending.
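One way to picture the reversed transaction root is a custom activity that is placed inside a TransactionScopeActivity, so that the workflow owns the transaction. The sketch below rests on several assumptions: IPendingWorkSource and PullPendingWorkActivity are hypothetical names invented for this example, an implementation of the interface is assumed to have been added to the workflow runtime as a service, and the runtime is assumed to make the scope's transaction ambient while the activity executes.

```csharp
using System;
using System.Transactions;
using System.Workflow.ComponentModel;

// Hypothetical contract for the external node. A real implementation might be
// a WCF client proxy configured to flow the ambient transaction.
public interface IPendingWorkSource
{
    object PullPendingWork();
}

// Hypothetical activity, intended to be dropped inside a TransactionScopeActivity.
public class PullPendingWorkActivity : Activity
{
    protected override ActivityExecutionStatus Execute(ActivityExecutionContext executionContext)
    {
        // Assumption: inside a TransactionScopeActivity the workflow's own
        // transaction is ambient at this point.
        if (Transaction.Current == null)
        {
            throw new InvalidOperationException(
                "PullPendingWorkActivity expects to run inside a TransactionScopeActivity.");
        }

        // Assumption: the host registered an IPendingWorkSource with the runtime.
        IPendingWorkSource source = executionContext.GetService<IPendingWorkSource>();
        if (source == null)
        {
            throw new InvalidOperationException("No IPendingWorkSource service is registered.");
        }

        // Pull the pending work from the external node within the workflow's transaction.
        object work = source.PullPendingWork();

        // Processing of 'work' would go here, still inside the same transaction.

        return ActivityExecutionStatus.Closed;
    }
}
```

The external node's only non-transactional call is the initial signal; everything that must be atomic happens in the pull, which the workflow initiates.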
This workaround still doesn't address the scenario where two distinct workflow instances need to communicate within a transaction. When the first workflow calls the second workflow, the transaction will be removed; if the second workflow instead initiates work within a transaction and calls the first workflow, the transaction will be removed as well.