Waltzing Through the Parallel Extensions June CTP: Tasks
In the previous post in this series, we looked at the new synchronization primitives offered by the PFX June CTP. In this post, we will look at the task-related features and at the new task scheduler.
Task Continuation
Another interesting feature in the CTP is the task continuation paradigm, allowing us to specify what should happen when a task completes. This is accomplished through the use of the ContinueWith method on the Task and Future classes. Among other things, this mechanism can be used for chaining multiple asynchronous operations in an ordered pipeline of execution. Since most of the PFX is focused on out-of-order execution of independent work items, this is a welcome addition that streamlines pipeline processing of dependent tasks that should still utilize multiple processors. The difference between the two paradigms is best illustrated by the following diagram:
For example, the following code schedules a pipeline of three dependent work items. Each work item depends on the execution result of the previous work item, producing a single final value:
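// Each stage receives the previous future and reads its result: 5 -> 4 -> 3.
Future<int> f = Future.Create(() => 5)
    .ContinueWith(a => a.Value - 1)
    .ContinueWith(b => b.Value - 1);
// Reading Value blocks until the last stage completes; prints 3.
Console.WriteLine(f.Value);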
Task.WaitAny
In the case of multiple tasks (or futures) executing concurrently, it's possible that we're interested in the execution result of only one of them. The classic example is having several algorithms that could compute a result, but not knowing in advance which algorithm will be the fastest to compute it. The following example launches multiple calculations at once, and waits for the first one to complete. When it completes, the rest of the calculations can be canceled.
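// Three competing calculations, started concurrently.
Future<int>[] calculations =
    new Future<int>[] {
        Future.Create(() => 5),
        Future.Create(() => 6),
        Future.Create(() => 7)
    };
// WaitAny blocks until one of them completes and returns its index in the array.
int calcIndex = Task.WaitAny(calculations);
// The remaining calculations are no longer needed, so request cancellation.
Array.ForEach(calculations, c => c.Cancel());
Console.WriteLine(calculations[calcIndex].Value);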
The New Scheduler
This CTP features a revamped scheduler that is used by the Task Parallel Library (TPL) and Parallel LINQ to schedule work items for execution. This scheduler is by and large undocumented, and consists of dozens of internal classes that strive to perform cooperative scheduling in user mode, without resorting to the operating system or to the .NET thread pool. This has its advantages (it could potentially be very fast) but also its disadvantages. For example, blocking tasks can exhaust the scheduler's threads, rendering additional tasks unschedulable.
This is a classic concurrency scenario familiar to anyone who has ever tried to implement a thread pool: if work items that are already executing depend on work items that are still waiting for execution, an unresolvable deadlock might occur. For example, consider a scenario with 4 thread pool threads executing 4 distinct work items. After performing their work, these work items block waiting for a fifth work item to complete. For that fifth work item to complete, it must first be scheduled for execution, and it can't be scheduled as long as all 4 threads are occupied by the blocked work items. The .NET thread pool can alleviate such scenarios by dynamically expanding the pool of worker threads; expect this to be addressed in future releases of the PFX as well.
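The four-threads-and-a-fifth-item scenario can be reproduced with a short sketch. It does not use the PFX scheduler, whose internals are undocumented. Instead it assumes a deliberately naive fixed-size pool built on BlockingCollection and ManualResetEventSlim from later .NET releases. The names FixedPool, StarvationDemo and FifthItemRuns are hypothetical, and a timeout stands in for the deadlock so that the program can terminate.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical fixed-size pool: a naive stand-in, NOT the PFX scheduler.
// It never adds threads, so blocked work items can starve queued ones.
sealed class FixedPool : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();

    public FixedPool(int workers)
    {
        for (int i = 0; i < workers; i++)
        {
            var thread = new Thread(() =>
            {
                // Each worker runs one item at a time, in FIFO order.
                foreach (Action work in _queue.GetConsumingEnumerable())
                    work();
            });
            thread.IsBackground = true; // do not keep the process alive
            thread.Start();
        }
    }

    public void Enqueue(Action work) { _queue.Add(work); }

    // Lets the workers drain the queue and exit.
    public void Dispose() { _queue.CompleteAdding(); }
}

static class StarvationDemo
{
    // Queues `blockers` items that wait for one final item, then the final item.
    // Returns true if the final item ran before the timeout elapsed.
    public static bool FifthItemRuns(int workers, int blockers, int timeoutMs)
    {
        // The event is intentionally not disposed: workers may still touch it
        // after this method returns.
        var fifthDone = new ManualResetEventSlim(false);
        using (var pool = new FixedPool(workers))
        {
            for (int i = 0; i < blockers; i++)
                pool.Enqueue(() => fifthDone.Wait()); // blocks its worker thread
            pool.Enqueue(() => fifthDone.Set());      // the item everyone waits for

            bool ran = fifthDone.Wait(timeoutMs);
            fifthDone.Set(); // release the blocked workers so they can exit
            return ran;
        }
    }

    static void Main()
    {
        // 4 workers, 4 blockers: every thread is blocked, the fifth item never starts.
        Console.WriteLine(FifthItemRuns(4, 4, 500));
        // One spare worker is enough for the fifth item to run.
        Console.WriteLine(FifthItemRuns(5, 4, 500));
    }
}
```

With 4 workers the call is expected to time out, because every thread is parked inside a blocked work item. Adding a single extra worker lets the fifth item run, which is essentially what a pool that grows dynamically does on our behalf.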
The underlying scheduler deserves a more detailed discussion in a future post, when more information about it becomes available.