Fairness is Highly Overrated
Fairness with respect to synchronization mechanisms is a highly overrated property. When I talk about concurrency, parallelism, Windows synchronization and similar subjects, I’m often asked whether the specific algorithm, mechanism or feature is fair in some respect.
First, let’s define fairness. I’ll use a simplistic yet rigorous definition of a fair lock. (Other synchronization mechanisms may have fairness defined in a similar fashion.) To begin with, a lock is exactly what you think it is – a mechanism for enforcing mutual exclusion. Given N threads that attempt to acquire the lock concurrently, only one thread may enter the lock at a given time. When that thread releases the lock, one other thread may enter it; and so on. For the sake of the definition, we can assume that the work protected by the lock is bounded; i.e., no thread acquires the lock with the intention of retaining it indefinitely.
A lock is fair if, given N threads that attempt to acquire the lock, there is an upper bound on the amount of time any one of the N threads will wait before entering the lock.
An alternative definition considers the number of times a thread is skipped before it is allowed to enter the lock, and discards the need to put a bound on the duration of work protected by the lock:
A lock is fair if, given N threads that attempt to acquire the lock, there is an upper bound on the number of times any one of the N threads will be skipped before entering the lock.
This sounds like a fairly simple property, and one of the simplest ways to implement this kind of fairness is to use a queue (FIFO) to maintain the information about waiting threads. If waiting threads are enqueued into a FIFO, and the work protected by the lock is bounded, then the amount of time any individual thread will wait is also bounded, and the bound is linear in the number of threads that arrived before it.
However, FIFO fairness is not always easy to attain. To begin with, some synchronization mechanisms don’t use queues of waiting threads because they are too expensive. For example, a spinlock, which is a purely user-mode synchronization mechanism requiring no system calls, does not track waiting threads at all, and hence is not fair. Extensions of spinlocks, such as queued spinlocks (MCS spinlocks), may provide fairness, though the bound is not necessarily linear. Peterson’s algorithm for mutual exclusion provides a quadratic upper bound on fairness, implying that a thread may be skipped O(N^2) times when N threads are competing for the lock.
Even when FIFOs are used, a thread may be forced to leave the queue and be re-inserted at its back later on (on Windows, this can happen when an APC is executed on that thread). If this can happen an unbounded number of times, fairness cannot be guaranteed at all – and indeed, no synchronization mechanism on Windows is free from the risk of being interrupted by an APC. If the thread enters an alertable wait state, it can be interrupted by a user-mode APC; but even if it doesn’t cooperate, kernel-mode APCs can still interrupt the thread and effect a reordering of the waiting queues.
The most amazing thing of all about fairness is that it is not always a desired property even if it can be guaranteed with relative ease. Lock convoys are an excellent demonstration of how fairness can be devastating for a group of threads competing for the same synchronization mechanism. In a lock convoy (which I discussed in detail earlier), a group of threads acquires and releases a lock many times in a single quantum; however, once one of the threads is preempted while holding the lock, havoc ensues: from then on, each thread acquires the lock only once per quantum and blocks as soon as it tries to acquire the lock again. This is a direct consequence of fairness – a fair lock must hand ownership to the next waiter in line rather than let the currently running thread reacquire it – and demonstrates that fairness is not always desirable.
When designing a synchronization mechanism or working with existing ones, bear in mind that the overall progress of a group of threads (measured as throughput, CPU utilization or another metric) can often be shown to be higher when unfair synchronization mechanisms are used. A single thread can “take one for the team” to maintain consistent global progress.