Practical Concurrency Patterns: Thread-Local Cache

July 11, 2008

In the previous post we looked at a fairly simple implementation of a read-write cache, where elements are created at the moment they are first needed.  We also noticed that in the specific case where the data is accessed very frequently and calculating it doesn’t cost much in CPU time or memory, this approach carries a large overhead, because every access to the cache must be synchronized.
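
For reference, a simplified sketch of such a cache (the actual implementation in the previous post may differ in its details; this version assumes a Dictionary guarded by a ReaderWriterLockSlim, and requires System.Collections.Generic and System.Threading):

public class ReadWriteCache<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _cache =
        new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public TValue GetOrCreate(TKey key, Func<TKey, TValue> factory)
    {
        //Fast path: take only a read lock for lookups
        _lock.EnterReadLock();
        try
        {
            TValue value;
            if (_cache.TryGetValue(key, out value))
                return value;
        }
        finally
        {
            _lock.ExitReadLock();
        }

        //Slow path: take the write lock and re-check before creating
        _lock.EnterWriteLock();
        try
        {
            TValue value;
            if (!_cache.TryGetValue(key, out value))
                _cache[key] = value = factory(key);
            return value;
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }
}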

We can introduce a simple optimization by noticing that locks are unnecessary if each worker thread gets its own copy of the cache.  This proves very efficient if the cache size isn’t very large, and the cache hit ratio is very high (e.g. after having created 100 elements, there are 10,000 accesses to these elements).

There are multiple ways of accomplishing this with the Parallel Extensions; one of them is an overload of Parallel.For that gives us the ability to specify a thread-local state:

Parallel.For(1, 11,
    () => new ReadWriteCache<int, int>(),   //Thread-local
    (i, loc) =>
    {
        for (int k = 0; k < 100; ++k)
        {
            Thread.Sleep(k % i);
            Console.WriteLine(
                loc.ThreadLocalState.GetOrCreate(k % i, x =>
                {
                    Console.WriteLine("CALCULATING {0}", x);
                    return x * x;
                }));
        }
    });

This is a fairly limiting approach, however, because we have to use Parallel.For and it doesn’t feel natural to launch our worker threads in this way.

Giving each thread its own copy of the cache in .NET is as easy as using the [ThreadStatic] attribute.  We can devise a static class that will contain the thread-local data for us, and then use it instead of the shared ReadWriteCache instance:

public static class ThreadContext
{
    public static TLSItems Current
    {
        get
        {
            //No lock needed here -- the field is per-thread, so no other
            //thread can race with this lazy initialization
            if (_current == null)
                _current = new TLSItems();
            return _current;
        }
    }

    //Note: a field initializer here would run only once, on the thread that
    //executes the static constructor, hence the lazy initialization above
    [ThreadStatic] private static TLSItems _current;
}

public class TLSItems
{
    public ReadWriteCache<int, int> Squares =
        new ReadWriteCache<int, int>();
}

We can add more items to the TLSItems class as necessary.  The worker threads can then use it without any explicit initialization:

Parallel.For(1, 11, (i) =>
{
    for (int k = 0; k < 100; ++k)
    {
        Thread.Sleep(k % i);
        Console.WriteLine(
            ThreadContext.Current.Squares.GetOrCreate(
                k % i, x =>
                {
                    Console.WriteLine("CALCULATING {0}", x);
                    return x * x;
                }));
    }
});
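
The ThreadContext class isn’t tied to Parallel.For, of course.  A minimal sketch (not from the original code, assuming the types defined above) that launches plain worker threads looks like this:

//Hypothetical example: ten explicitly created worker threads, each of
//which transparently gets its own cache through ThreadContext.Current
for (int t = 1; t <= 10; ++t)
{
    int i = t;   //Capture a private copy of the loop variable for the closure
    new Thread(() =>
    {
        for (int k = 0; k < 100; ++k)
        {
            Console.WriteLine(
                ThreadContext.Current.Squares.GetOrCreate(k % i, x => x * x));
        }
    }).Start();
}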

Each thread gets its own cache and performs all of its work against that cache.  This is an excellent solution when we have a limited number of worker threads under our control, but what if we’re using a thread pool?

This technique is dangerous and inapplicable when working with thread-pool threads.  We have no control over the number of these threads (which means we might end up with 500 copies of the cache polluting our process’s memory), and it’s poor etiquette to change the TLS of a thread-pool thread and then return it to the pool, because of the possible side effects (there might be other pieces of code in the process using these threads for their own purposes!).
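
To make the issue concrete, here is a hypothetical illustration (not from the original post): the [ThreadStatic] data outlives the work item and stays attached to whichever pool thread happened to run it, and any later work item scheduled on that same thread will see it.

ThreadPool.QueueUserWorkItem(_ =>
{
    //This populates the [ThreadStatic] cache of whichever pool thread
    //happens to run the work item...
    Console.WriteLine(
        ThreadContext.Current.Squares.GetOrCreate(7, x => x * x));
    //...and nothing cleans it up when the work item completes and the
    //thread goes back to the pool
});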

Therefore, the thread-local cache optimization doesn’t apply to pool threads.  In the next installment we will look at a different optimization technique that works for pool threads as well as for regular worker threads we create ourselves.
