Practical Concurrency Patterns: Cyclic Lock-Free Buffer

August 2, 2008


Last time we minimized contention by using lock-free operations instead of acquiring a lock on a work item queue (neither a standard lock nor a reader-writer lock offers nearly-linear scaling). On the other hand, a few weeks ago we also saw a technique for eliminating contention altogether by keeping state in thread-local storage.

However, there’s another trick in the bag for using shared state across threads without any locks. It applies to the scenario where multiple consumers “subscribe” to a stream of work items, such that every work item must be processed by every consumer. This is classic publish/subscribe, confined to a single process and subject to very high performance demands.

This pattern uses a cyclic buffer with a single shared counter indicating the most recently published item. Producers increment the counter and then store the new item into the slot they have just claimed; consumers each keep a local counter specifying the next item they have yet to read.

using System.Threading;

class CyclicQueue<T>
{
    int _currentIndex = -1;   // index of the most recently claimed slot
    int _size;
    volatile T[] _items;

    public CyclicQueue(int size)
    {
        _items = new T[_size = size];
    }

    public void Enqueue(T item)
    {
        // Atomically claim the next slot, then store the item into it.
        int prevIndex = Interlocked.Increment(ref _currentIndex);
        _items[prevIndex % _size] = item;
    }

    public bool Dequeue(ref int threadPrivateIndex, out T value)
    {
        // The consumer has caught up with the producer: nothing new to read.
        if (threadPrivateIndex == _currentIndex + 1)
        {
            value = default(T);
            return false;
        }
        value = _items[threadPrivateIndex++ % _size];
        return true;
    }
}
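To make the intended usage concrete, here is a minimal sketch of one producer and two consumers sharing a queue (the buffer size, item count, and console output are illustrative choices, not part of the original code). Each consumer starts its private index at zero and observes every item the producer publishes:

// Minimal usage sketch (inside some test method); requires using System and System.Threading.
var queue = new CyclicQueue<int>(1024);

ThreadStart consumerLoop = delegate
{
    int myIndex = 0;                   // thread-private read position
    for (int received = 0; received < 100; )
    {
        int value;
        if (queue.Dequeue(ref myIndex, out value))
        {
            Console.WriteLine("Thread {0} got {1}",
                Thread.CurrentThread.ManagedThreadId, value);
            ++received;
        }
        // otherwise busy-wait: nothing new has been published yet
    }
};

Thread[] consumers = { new Thread(consumerLoop), new Thread(consumerLoop) };
foreach (Thread t in consumers) t.Start();

for (int i = 0; i < 100; ++i)
    queue.Enqueue(i);                  // every consumer sees every item

foreach (Thread t in consumers) t.Join();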

As long as consumers don’t lag so far behind that the buffer wraps around and data is lost, this pattern provides lock-free operations. The only synchronization required is around the single counter indicating the current index into the buffer, and this can get a bit tricky in the face of an overflow. To simplify, we use counter % BUFSIZE whenever accessing the data instead of taking care of wrapping the counter itself when incrementing it.
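One way for a consumer to detect that it has lagged far enough behind for its data to be overwritten is to compare its private index against the shared counter. The HasOverrun method below is a hypothetical addition to the class above, not part of the original implementation, and it ignores wrap-around of the int counter itself:

    // Hypothetical helper, not in the original class: returns true when the producer
    // has advanced so far past this consumer that the consumer's next slot has
    // already been overwritten and data was lost.
    public bool HasOverrun(int threadPrivateIndex)
    {
        return _currentIndex - threadPrivateIndex >= _size;
    }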


Note that consumers in this case are required to busy-wait around the Dequeue operation as long as there is no new data in the queue. If this busy-wait is known to be expensive, we can either resort to kernel synchronization or use a spinlock and fall back to kernel synchronization. Using spinlocks to minimize contention and context switches will be the subject of our next installment.
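As a rough sketch of that idea (the iteration counts below are arbitrary, and Thread.Sleep merely stands in for real kernel synchronization such as an event), a consumer can retry Dequeue a bounded number of times with short spins and then yield its time slice:

static T DequeueWithBackoff<T>(CyclicQueue<T> queue, ref int threadPrivateIndex)
{
    while (true)
    {
        for (int attempt = 0; attempt < 100; ++attempt)
        {
            T value;
            if (queue.Dequeue(ref threadPrivateIndex, out value))
                return value;
            Thread.SpinWait(20);      // brief busy-wait without yielding the processor
        }
        Thread.Sleep(1);              // back off and let other threads run
    }
}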

2 comments

  1. Lionel, August 5, 2008 at 1:41 PM

    Doesn’t that code have a race condition in it? If you run

    int prevIndex = Interlocked.Increment(ref currentIndex);

    (which should probably be newIndex rather than prevIndex) and then somebody calls Dequeue before the line

    _items[prevIndex % _size] = item;

    gets run, you will get some random stuff returned by Dequeue since the index has been updated before the item has been stored.

    One more question which is a bit more obscure is whether you need the volatile keyword against the array. I guess I am a bit unclear about exactly what the volatile keyword does when used with an array, but since the _items variable could be declared as readonly it would be legitimate for the compiler to inline access to the _items variable. If the volatile keyword marks all the items in the array as volatile then it does make sense. As far as I can work out from the documentation, since _items is basically a pointer you are marking the pointer as volatile, and since it never changes this seems a little pointless. Which brings me to another question, which is how do you mark all the elements of the array as being volatile, or is this just not necessary?

    Lionel

  2. qw, October 17, 2008 at 3:15 PM

    That code above really looks like gibberish. It is certainly incorrect and begging for race conditions.
