Demystify Async and Await (Part 1 of 2)
After seeing so much confusion spread across the industry,
I came to the conclusion that I should write this post and demystify
the behavior of async and await.
Too many people hold wrong conceptions and
theories about how async and await work.
This leads to great confusion and bad practices.
In order to explain async and await I must go back to
the basics and demystify Task.
Part 1 of this 2-part series explains what Task really is and
the confusion almost everybody in the industry has about its real nature.
Part 2 of this series will demystify async and await.
I strongly recommend reading part 1 before part 2.
So what is Task?
Task is nothing more than a data structure.
Really, I’m not joking.
As simple as it seems, this is where its strength comes from.
Many developers think of Task as a Thread, but this is
a misconception which leads to many mistakes.
Task is nothing more than a data structure which represents the context of an execution.
It has the following structure:
– Status (the Task’s stage, for example WaitingToRun, Running, RanToCompletion, Canceled or Faulted);
in general it represents the different stages of the execution.
– Pre-execution data:
* AsyncState: an arbitrary object which can be stored at construction time.
* Delegate: a method which may be invoked by a task scheduler.
The decision of when and how to invoke the delegate
is usually taken by the Task Scheduler (which is external to the Task)
and never by the Task itself.
As a matter of fact, in some cases the delegate won’t hold a value at all.
* CreationOptions: flags which may be read by the Task Scheduler.
– Post-execution data:
* Result (in the case of Task<T>): holds the execution result in case of success.
* Exception: Faulted state information (in case the Task represents the context of a failed execution).
Actually, a Task can represent a failed execution without ever running or having a delegate.
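The fields listed above can be read straight off a Task instance. The following is a minimal sketch, not code from this series: the class name TaskAnatomy and the helper CreateUnstarted are made up for illustration, and the Task is run with RunSynchronously only to keep the example deterministic.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskAnatomy
{
    // Hypothetical helper: builds a Task<int> without starting it.
    // At this point the Task only stores data (delegate, state, options).
    public static Task<int> CreateUnstarted(string state) =>
        new Task<int>(s => ((string)s).Length, state, TaskCreationOptions.LongRunning);

    public static void Main()
    {
        Task<int> task = CreateUnstarted("hello");

        // Pre-execution data is readable before anything has run.
        Console.WriteLine(task.Status);          // Created
        Console.WriteLine(task.AsyncState);      // hello
        Console.WriteLine(task.CreationOptions); // LongRunning

        // Running the delegate is handed to a scheduler; the Task itself
        // never decides when or where it runs.
        task.RunSynchronously(TaskScheduler.Default);

        // Post-execution data.
        Console.WriteLine(task.Status);            // RanToCompletion
        Console.WriteLine(task.Result);            // 5
        Console.WriteLine(task.Exception == null); // True
    }
}
```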
Why is it so confusing? Why does Task seem like a Thread?
The code snippets ahead use the following method to
write the Thread information to the console.
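A helper of this kind might look like the sketch below. The names ThreadInfo, Describe and Write are assumptions chosen for illustration, as is the choice of which thread properties to print.

```csharp
using System;
using System.Threading;

public static class ThreadInfo
{
    // Formats a label together with details of the calling thread.
    public static string Describe(string label)
    {
        Thread t = Thread.CurrentThread;
        return $"{label}: thread {t.ManagedThreadId}, " +
               $"pooled = {t.IsThreadPoolThread}, background = {t.IsBackground}";
    }

    // Writes the description of the calling thread to the console.
    public static void Write(string label) => Console.WriteLine(Describe(label));

    public static void Main() => Write("Main");
}
```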
It is so confusing because at first glance, Task functionality does seem equivalent to Thread (or Thread Pool).
Considering the following code snippet, this is the most logical conclusion.
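A minimal sketch of code that invites this conclusion, assuming it is started from a thread where TaskScheduler.Current is the default scheduler (the class and method names are made up for illustration). Neither call mentions a scheduler, and both delegates normally report a thread-pool thread:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LooksLikeAThread
{
    // Returns true when the calling thread belongs to the thread pool.
    private static bool OnPool() => Thread.CurrentThread.IsThreadPoolThread;

    // Starts two Tasks without naming a scheduler and reports
    // where each delegate was executed.
    public static bool[] Run()
    {
        Task<bool> viaFactory = Task.Factory.StartNew(() => OnPool());
        Task<bool> viaRun = Task.Run(() => OnPool());
        Task.WaitAll(viaFactory, viaRun);
        return new[] { viaFactory.Result, viaRun.Result };
    }

    public static void Main()
    {
        Console.WriteLine($"Main on pool:     {OnPool()}");
        bool[] onPool = Run();
        Console.WriteLine($"StartNew on pool: {onPool[0]}");
        Console.WriteLine($"Task.Run on pool: {onPool[1]}");
    }
}
```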
Considering the code above, you’re likely to come to the conclusion that Task equals Thread.
So why do I argue that Task is not equivalent to Thread?
Like a good magician, Task is hiding something, which causes the illusion of Thread duality.
It so happens that the Task overloads shown in the above snippet are actually hiding the Task Scheduler.
When you don’t pass a Task Scheduler to a Task, you’re actually using TaskScheduler.Current
(not TaskScheduler.Default, which leads to a different confusion that I may write about in a future post).
The following snippet shows the missing piece:
Lines 2, 5 and 11 suggest that the execution of the Task may not be its own responsibility at all.
The scheduler is what actually runs the Task.
The Task’s responsibility is to reflect the execution context.
It may represent execution aspects like:
– Has it run?
– Did it complete, and does it have a value?
– Did it fail, and does it have an exception?
– Was it canceled?
Line 2 may run the Task’s delegate on the thread pool, while
line 5 may run it on a new thread (not pooled).
The Task Scheduler can read the Task’s data and decide to open a new thread rather than
use the thread pool (for a task that is marked with TaskCreationOptions.LongRunning).
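To make the scheduler’s role concrete, here is a sketch that spells the scheduler out instead of leaving it implicit. The names ExplicitScheduler and StartOn are hypothetical and the layout does not match the line numbers cited in the text. On current runtimes the default scheduler typically gives a LongRunning Task a dedicated thread, but that is the scheduler’s decision and an implementation detail, so the sketch prints what it observes rather than assuming it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ExplicitScheduler
{
    // Starts work on the given scheduler and reports whether the
    // delegate ended up on a thread-pool thread.
    public static Task<bool> StartOn(TaskScheduler scheduler, TaskCreationOptions options) =>
        Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            scheduler);

    public static void Main()
    {
        Task<bool> regular = StartOn(TaskScheduler.Default, TaskCreationOptions.None);
        Task<bool> longRunning = StartOn(TaskScheduler.Default, TaskCreationOptions.LongRunning);

        // The Task only carries the options; the scheduler reads them
        // and picks the thread.
        Console.WriteLine($"None:        pooled = {regular.Result}");
        Console.WriteLine($"LongRunning: pooled = {longRunning.Result}");
    }
}
```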
Furthermore, Task is a data structure and it may not be running at all.
Take a look at the following code snippets.
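A sketch of Tasks of this kind, using Task.FromResult and TaskCompletionSource<T>. The class and method names are made up for illustration. No delegate is ever supplied and nothing is scheduled, yet each Task reports a state:

```csharp
using System;
using System.Threading.Tasks;

public static class NoDelegateTasks
{
    // A Task that is born completed with a value.
    public static Task<int> Completed() => Task.FromResult(42);

    // A Task that represents a failure although nothing ever ran.
    public static Task<int> Faulted()
    {
        var tcs = new TaskCompletionSource<int>();
        tcs.SetException(new InvalidOperationException("nothing ever ran"));
        return tcs.Task;
    }

    // A Task that represents a cancellation.
    public static Task<int> Canceled()
    {
        var tcs = new TaskCompletionSource<int>();
        tcs.SetCanceled();
        return tcs.Task;
    }

    // A Task whose outcome has not been decided yet.
    public static Task<int> Pending() => new TaskCompletionSource<int>().Task;

    public static void Main()
    {
        Console.WriteLine(Completed().Status); // RanToCompletion
        Console.WriteLine(Faulted().Status);   // Faulted
        Console.WriteLine(Canceled().Status);  // Canceled
        Console.WriteLine(Pending().Status);   // WaitingForActivation
    }
}
```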
None of the Tasks above has a valid delegate. They only indicate
the execution state, even though no execution happened at all.
What’s so special about Task being a data structure?
Being a data structure enables the use of Task in a broader context, far beyond simple threading.
For example, take a common cache scenario where you have to make a real asynchronous call
the first time and return cached data on the following calls (until the data becomes stale).
Task as a data structure helps you abstract this functionality from the client.
The following snippet demonstrates the concept:
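A minimal sketch of such a cache, assuming a 30-second staleness limit and a simulated remote call. Apart from GetData, every name here (CachedSource, FetchRemote, MaxAge, RemoteCalls) is hypothetical, and the class is deliberately not thread-safe to keep it short.

```csharp
using System;
using System.Threading.Tasks;

public class CachedSource
{
    // Hypothetical staleness limit.
    private static readonly TimeSpan MaxAge = TimeSpan.FromSeconds(30);

    private string _cached;
    private DateTime _fetchedAt;

    // Counts how often the (simulated) remote call was made.
    public int RemoteCalls { get; private set; }

    // Stand-in for a real asynchronous call.
    private Task<string> FetchRemote()
    {
        RemoteCalls++;
        return Task.Delay(100).ContinueWith(_ => "payload");
    }

    public Task<string> GetData()
    {
        // Fresh cache: hand back an already completed Task.
        // No delegate is scheduled and no thread is involved.
        if (_cached != null && DateTime.UtcNow - _fetchedAt < MaxAge)
            return Task.FromResult(_cached);

        // Cache miss: do the asynchronous call and store its result.
        return FetchRemote().ContinueWith(t =>
        {
            _cached = t.Result;
            _fetchedAt = DateTime.UtcNow;
            return _cached;
        });
    }

    public static void Main()
    {
        var source = new CachedSource();
        Console.WriteLine(source.GetData().Result); // payload

        Task<string> second = source.GetData();
        Console.WriteLine(second.IsCompleted);      // True
        Console.WriteLine(source.RemoteCalls);      // 1
    }
}
```

The caller sees a Task<string> in both cases and cannot tell, from the signature, whether a remote call took place.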
On the second call, GetData will execute synchronously, yet it still returns a Task.
The client doesn’t have to be aware of the caching implemented within the method.
Parent / child semantics is another feature where Task as a data structure shines.
Traditionally, if you start a new thread from a running thread, it doesn’t have any affinity to the original
thread it was started from. The new thread is a new execution root which isn’t aware of its origin.
Task, on the other hand, does enable parent / child semantics. Because Task is a data structure, it can
maintain the affinity to its child Tasks.
As a matter of fact, Task doesn’t limit you to direct descendants.
You can have as many levels of descendants as you like: parent / child / grandchild / etc.
The following code demonstrates the idea:
The Tasks at lines 3 and 5 attach to their parents (lines 10 and 8).
This means that the ContinueWith at line 14 will trigger only after 1500 milliseconds (which is the longest
duration among the Task’s descendants).
Another benefit of the data structure shown in the above code snippet is the ability to schedule
conditional execution. As you remember, Task isn’t responsible for its own execution. That is the responsibility of
the TaskScheduler. On line 15, we schedule a conditional continuation.
The TaskScheduler will check the state of the Task when it completes and schedule the continuation
only if the original Task is in the Faulted state.
Be aware that replacing Task.Factory.StartNew with Task.Run in the above snippet won’t
be a good idea, because Task.Run denies the attaching of child Tasks (more about the
difference between Task.Run and Task.Factory.StartNew in a future post).
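The parent / child behavior and the conditional continuation described above can be sketched as follows. This is an illustration under assumptions, not the listing the line numbers refer to: the names ParentChild and StartFamily and the 500 ms delay are made up, while the 1500 ms delay mirrors the figure in the text.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class ParentChild
{
    // Starts a parent with an attached child, which in turn has an
    // attached grandchild. Returns the parent.
    public static Task StartFamily()
    {
        return Task.Factory.StartNew(() =>
        {
            // Child: attached to the parent.
            Task.Factory.StartNew(() =>
            {
                // Grandchild: attached to the child, longest duration.
                Task.Factory.StartNew(
                    () => Thread.Sleep(1500),
                    TaskCreationOptions.AttachedToParent);

                Thread.Sleep(500);
            }, TaskCreationOptions.AttachedToParent);
        });
    }

    public static void Main()
    {
        var watch = Stopwatch.StartNew();
        Task parent = StartFamily();

        // Triggers only once the parent and all attached descendants are done.
        Task done = parent.ContinueWith(t =>
            Console.WriteLine($"{t.Status} after {watch.ElapsedMilliseconds} ms"));

        // Conditional continuation: only scheduled if the parent faults.
        Task onFault = parent.ContinueWith(
            t => Console.WriteLine("parent faulted"),
            TaskContinuationOptions.OnlyOnFaulted);

        done.Wait();
        Task.WaitAny(onFault);             // returns once onFault reaches a final state
        Console.WriteLine(onFault.Status); // Canceled, because the parent did not fault
    }
}
```

While the descendants are still running, the parent’s Status is WaitingForChildrenToComplete, which is exactly the kind of bookkeeping a plain thread cannot offer.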
Task is the foundation of modern async execution on .NET.
This post demystifies Task, while the next post will go one step further
and demystify async / await.