Enumerator Structs: Optimizing for the Common Case

July 5, 2010

6 comments

I read a couple of posts by Simon Cooper where he explains in great details why mutable structs are bad, and how the BCL’s enumerator structs (e.g. LinkedList<T>.Enumerator) are prime candidates for this anti-pattern. He dissects a really nasty bug where declaring a value type field as readonly effects unexpected copy semantics if the value type is modified. The following code enters an infinite loop (which prints 0 indefinitely):

class EnumeratorWrapper
{
    private readonly LinkedList<int>.Enumerator _enumerator;
    private readonly LinkedList<int> _list;

    public EnumeratorWrapper()
    {
        _list = new LinkedList<int>(Enumerable.Range(0,5));
        _enumerator = _list.GetEnumerator();

        PrintAll();
    }

    private void PrintAll()
    {
        while (_enumerator.MoveNext())
        {
            Console.WriteLine(_enumerator.Current);
        }
    }
}

The reason is that if you declare the field readonly, the MoveNext call will be called on a copy of the field’s value, to prevent it from being modified. Admittedly, this seems like enough of a reason to ban mutable value types altogether, and the enumerator structs in particular.

So why bother with a value type? As the matter of fact, why bother exposing the type that implements the IEnumerator<T> interface, and not declare it a private class and return only the interface? In other words, what’s the difference between the following two approaches?

class MyCollection : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator()
    {
        return new MyPrivateEnumerator();
    }

    private class MyPrivateEnumerator : IEnumerator<int>
    {
        //Implementation omitted
    }
}
class MyOtherCollection : IEnumerable<int>
{
    public MyEnumerator GetEnumerator()
    {
        return new MyEnumerator();
    }

    IEnumerator<int> IEnumerable<int>.GetEnumerator()
    {
        return GetEnumerator();
    }

    public struct MyEnumerator : IEnumerator<int>
    {
    }
}

Well, as I (and many others) have written elsewhere, calling a method through an interface reference is more expensive than calling it directly. The primary motivation behind publicly exposing the enumerator type is that clients of the collection class can use it directly, without going through an interface reference.

Fortunately, the foreach statement uses duck typing to perform the enumeration on the collection—and the foreach statement is really the common case of using any enumerator at all. This means that in the following fragment, the public MyOtherCollection.GetEnumerator method is called, and the generated code uses the enumerator struct directly, and not through an interface reference:

MyOtherCollection collection = new MyOtherCollection();
foreach (int i in collection)
{
    Console.WriteLine(i);
}

Roughly equivalent to:

var enumerator = collection.GetEnumerator();
while (enumerator.MoveNext())
{
    int i = enumerator.Current;
   
Console.WriteLine(i);
}

That’s why there are no method calls through an interface here—the “var” is MyOtherCollection.MyEnumerator and not IEnumerator<int>! Moreover—and this is the critical point—if the enumerator implementation were to go through the same hoops of implementing the interface explicitly and at the same time providing public methods that do the same thing, then the MoveNext and get_Current method calls would also be direct calls (not through an interface reference), and therefore could be inlined. Obviously, the only chance of ever achieving performance with a foreach block over a collection that is comparable to that of a built-in array is if the collection accessors are inlined. And it’s very simple, indeed:

public struct MyEnumerator : IEnumerator<int>
{
    public bool MoveNext()
    {
        //Actual implementation here
    }

    bool IEnumerator.MoveNext()
    {
        return MoveNext();
    }

    void IEnumerator.Reset()
    {
        throw new NotImplementedException();
    }

    public int Current
    {
        get { /* Actual implementation here */ }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }
}

Now, if the caller is working with a variable of type MyEnumerator, they would access the critical methods directly, without an interface call, allowing for inlining to take place (not to mention that the integer will not be boxed when returning from the get_Current call). Still, if someone insists on using an interface to perform the enumeration, we implement our part of the contract using explicit interface implementation.

Finally, why bother with the value types? The probable reason is to avoid object allocations. Allocating a value type and returning it is significantly cheaper than allocating a heap object and reclaiming it later. Although this is not the critical performance improvement here (the explicit implementation trick is the crucial part), it’s still worthwhile.

Is it worthwhile, though, to introduce a mutable value type, which is a big no-no by all accounts, for this performance gain? I think it is. If the enumerator type were commonly used on its own, I wouldn’t agree—but this particular enumerator type is used, 99.99% of the time, in a foreach statement, where the developer is not exposed at all to its existence.

I think that implementing a mutable value type that’s invisible in the vast majority of development scenarios and that provides a non-negligible performance gain is an acceptable compromise.

Add comment
facebook linkedin twitter email

Leave a Reply

Your email address will not be published.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

*

6 comments

  1. Stephen ClearyJuly 6, 2010 ב 4:54 PM

    Excellent analysis! This is a very informative blog post.

    I do have one comment, though: as LINQ becomes more an more commonplace, the “99.99%” value decreases drastically. I tend to believe that this value would be accurate pre-3.5, but by now has been reduced considerably.

    Reply
  2. JuozasJuly 8, 2010 ב 12:29 PM

    I don’t see evidence that performance gains by using struct Enumerators are non-negligible. I don’t doubt that they are measurable, I just doubt they are significant enough. Thorough analysis otherwise.

    Reply
  3. Simon CooperJuly 16, 2010 ב 2:26 PM

    Thanks for the explanation Sasha. I do disagree with you that it is worth introducing a mutable value type to get this performance gain, simply as it is _so_ dangerous and hard to figure out what’s going on when you do run into problems. However, along with the Reset explicit implementations and the private SortedList< ,>.Enumerator struct, they can’t really be fixed now.

    I would say that the documentation should be altered to make it very clear that the enumerator structs are _not_ designed to be used outside foreach loops, that they only exist for performance optimization by compilers, and that they should only be used as a boxed IEnumerator by users.

    Reply
  4. David NelsonJuly 16, 2010 ב 7:30 PM

    I agree with Juozas, I haven’t seen any evidence that this particular “optimization” results in any significant performance gains; and they would have to be really high to outweigh the devastating effects if the enumerators are used in a way other than intended (which they definitely are, since I have previously seen this reported as a bug on Connect).

    Reply
  5. David PiepgrassApril 12, 2011 ב 7:07 AM

    As I recall (I can’t check because I haven’t bought a new copy of Reflector… y’know) the List.Enumerator structure is 16 bytes or greater, so there wasn’t much performance to gain by making it a struct in the first place. Now consider their strange decision to make Reset() an explicit interface implementation, which tends to cause people to write code like “((IEnumerator)e).Reset();” which doesn’t work.

    Still, I agree with your analysis. At the time List was designed, 99% of the time we used foreach, and making .NET competitive with C++ (unlike Java) is important enough to take some risks. Now that we use LINQ, the struct offers no performance gain, but it doesn’t harm performance either. The design of IEnumerator does that (requiring two interface calls per iteration? tsk, tsk, tsk.)

    Reply
  6. Internalized_PotatoJune 8, 2012 ב 12:37 AM

    To those saying the performance benefits aren’t worth it – most of the time you are likely correct. However, in realtime applications you want to try and achieve GC silence – i.e. try to avoid any allocations so that your application behaviour is predictable.

    You might say don’t use a garbage collected language for this, but its pro’s and con’s, and if the slight inconvenience of using mutable structs for this feature is one of your biggest con’s then the pro’s win out. If that makes any sense at all.

    Reply