The IEnumerable<T> interface represents a collection of objects of type T, and is used heavily thanks to the C# foreach construct. Better yet, in the LINQ world, this is the interface that is “extended” via extension methods by the System.Linq.Enumerable class. This makes IEnumerable<T> both easy to use and powerful. But is it the best interface for getting data out of a possible collection?
Recap: what is IEnumerable<T>?
IEnumerable<T> has only one method, GetEnumerator(), which returns an enumerator (what else?), sometimes called an iterator.
- public interface IEnumerable<out T> : IEnumerable {
- new IEnumerator<T> GetEnumerator();
- }
This enumerator implements the IEnumerator<T> interface. This one inherits from the old .NET 1.x IEnumerator interface, like so:
- public interface IEnumerator {
- bool MoveNext();
- void Reset();
- object Current { get; }
- }
-
- public interface IEnumerator<out T> : IDisposable, IEnumerator {
- new T Current { get; }
- }
Disregard the “new” and “out”, as they are not important for this discussion.
So what does foreach do? It calls MoveNext() in a while loop as long as it returns true, and uses the Current property as the value for the loop body. This is a pull model: each call to MoveNext() “pulls” the next data item, which then becomes available via the Current property.
(The Reset() method is not typically used, so we’ll skip that one.)
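To make the pull loop concrete, here is a rough sketch of what the compiler does with a foreach over an IEnumerable<int>. The class and method names (ForeachExpansion, SumWithForeach, SumByHand) are made up for illustration, and the real expansion has a few more details than shown here.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // The familiar form: let foreach drive the enumerator.
    public static int SumWithForeach(IEnumerable<int> source)
    {
        int sum = 0;
        foreach (int item in source)
            sum += item;
        return sum;
    }

    // Roughly the same loop written by hand (a simplified expansion).
    public static int SumByHand(IEnumerable<int> source)
    {
        int sum = 0;
        // IEnumerator<T> is IDisposable, so the enumerator is disposed when the loop ends.
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext())      // pull the next item; false means no more data
            {
                int item = e.Current; // the item that MoveNext() just made available
                sum += item;
            }
        }
        return sum;
    }
}
```

Both methods should produce the same result for any sequence; the second simply shows the MoveNext()/Current calls that foreach hides.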
So what’s wrong with this model? It turns out this is a great model, as long as we don’t have to wait significantly for the next item to arrive. If it may take a while, or even if there never is a next item, we’re in trouble. MoveNext() will block. For many scenarios this is unacceptable, or at least undesirable.
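The blocking is easy to see with a deliberately slow source. In this sketch, SlowSource, SlowNumbers and TimeFirstMoveNext are hypothetical names, and Thread.Sleep merely stands in for whatever makes the real data slow (a network call, a device, a user).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public static class SlowSource
{
    // Hypothetical source: each item takes delayMs milliseconds to "arrive".
    public static IEnumerable<int> SlowNumbers(int count, int delayMs)
    {
        for (int i = 0; i < count; i++)
        {
            Thread.Sleep(delayMs); // stands in for a slow read
            yield return i;
        }
    }

    // Measures how long the first MoveNext() call keeps the caller waiting.
    public static TimeSpan TimeFirstMoveNext(int delayMs)
    {
        using (IEnumerator<int> e = SlowNumbers(1, delayMs).GetEnumerator())
        {
            Stopwatch watch = Stopwatch.StartNew();
            e.MoveNext(); // the calling thread is stuck here until the item is ready
            watch.Stop();
            return watch.Elapsed;
        }
    }
}
```

Calling SlowSource.TimeFirstMoveNext(200) should report an elapsed time of roughly 200 ms or more, during which the calling thread can do nothing else.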
Let’s turn things around
If this is a pull model, why don’t we try a push model? A client says to a provider of data: call me when you have something; otherwise, don’t bother me. If there is never a new item, nothing is wasted and no blocking is necessary. Effectively, we’re working asynchronously, or at least we have the opportunity to do so.
Let’s try to reverse both interfaces, from a pull to a push model. First is IEnumerator<T>: We need to “reverse” MoveNext() and Current.
MoveNext() returns bool, so its reverse would be a method that accepts a bool.
Current is a read only property, so its “reverse” would be a write only property.
Let’s build our new interface (let’s call it IPusher<T> for now):
- public interface IPusher<T> {
- void MoveNext(bool ok);
- T Current { set; }
- }
MoveNext here seems redundant – it has no added value. It’s used to indicate completion in IEnumerator<T>, but it really isn’t appropriate here. We need another method to signal that there is no more data:
- public interface IPusher<T> {
- void OnCompleted();
- T Current { set; }
- }
Write only properties are usually frowned upon, so let’s turn this into a method called OnNext:
- public interface IPusher<T> {
- void OnCompleted();
- void OnNext(T value);
- }
How would we be notified of an error condition? With IEnumerator<T>, MoveNext() can throw an exception. In the push model, we need a method to indicate that:
- public interface IPusher<T> {
- void OnCompleted();
- void OnNext(T value);
- void OnError(Exception error);
- }
Great! We have a client interface that can be notified when data is available, when it’s all done, and when an error occurs. This interface, in fact, exists in .NET 4, where it is called IObserver<T>:
- public interface IObserver<in T> {
- void OnCompleted();
- void OnNext(T value);
- void OnError(Exception error);
- }
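As a small illustration of the client side, here is one possible IObserver<T> implementation together with a helper that plays the provider. RecordingObserver, PushDemo and PushAll are hypothetical names invented for this sketch; they are not part of .NET.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical observer that records every notification it receives.
public sealed class RecordingObserver<T> : IObserver<T>
{
    public List<string> Log { get; } = new List<string>();

    public void OnNext(T value) => Log.Add("next: " + value);
    public void OnError(Exception error) => Log.Add("error: " + error.Message);
    public void OnCompleted() => Log.Add("completed");
}

public static class PushDemo
{
    // Plays the role of the data provider: pushes each item to the observer,
    // then signals either completion or failure.
    public static void PushAll<T>(IEnumerable<T> items, IObserver<T> observer)
    {
        try
        {
            foreach (T item in items)
                observer.OnNext(item);
        }
        catch (Exception ex)
        {
            // Simplification: this also catches exceptions thrown by OnNext itself.
            observer.OnError(ex);
            return;
        }
        observer.OnCompleted();
    }
}
```

Pushing the array { 1, 2 } into a RecordingObserver<int> should leave three entries in its Log: two "next" entries followed by "completed". Note that the observer never asks for anything; it only reacts.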
What now?
We have an observer; now we need something to observe; something that provides data. This is exactly what IEnumerable<T> signifies in the pull model. Now we need to build its twin in the observer world – an observable.
For all of you design pattern buffs: this is the famous “observer” pattern, classically defined by the “Gang of Four” (GoF) in their famous 1994 book. In that pattern, there is a Subject that is the source of the data, on which a pair of methods, Attach and Detach, are defined, allowing an observer to subscribe and unsubscribe. We need something similar in our definition of the IObservable<T> interface:
- public interface IObservable<T> {
- void Attach(IObserver<T> observer);
- void Detach(IObserver<T> observer);
- }
This is certainly possible, but maybe we can remove the Detach method, by returning something from Attach that can be used to unsubscribe:
- public interface ICookie {
- void Detach();
- }
-
- public interface IObservable<T> {
- ICookie Attach(IObserver<T> observer);
- }
All we need is a way to detach, so perhaps we can leverage something that already exists in .NET rather than invent yet another interface. Hint: IDisposable.
Here’s the final version, renaming Attach to Subscribe, and using IDisposable.Dispose to unsubscribe:
- public interface IObservable<out T> {
- IDisposable Subscribe(IObserver<T> observer);
- }
This is exactly how the real IObservable<T> is defined in .NET 4.
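To see Subscribe and Dispose working together, here is a minimal sketch of an observable. SimpleSubject, Publish, Complete, Subscription and CountingObserver are hypothetical names chosen for this example; the sketch is not thread-safe and is not how Rx implements its own types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal observable; single-threaded use only.
public sealed class SimpleSubject<T> : IObservable<T>
{
    private readonly List<IObserver<T>> observers = new List<IObserver<T>>();

    public IDisposable Subscribe(IObserver<T> observer)
    {
        observers.Add(observer);
        // The returned object is the "cookie": disposing it detaches the observer.
        return new Subscription(() => observers.Remove(observer));
    }

    public void Publish(T value)
    {
        // Iterate over a copy so observers may unsubscribe while being notified.
        foreach (IObserver<T> o in observers.ToArray())
            o.OnNext(value);
    }

    public void Complete()
    {
        foreach (IObserver<T> o in observers.ToArray())
            o.OnCompleted();
        observers.Clear();
    }

    private sealed class Subscription : IDisposable
    {
        private Action unsubscribe;

        public Subscription(Action unsubscribe) { this.unsubscribe = unsubscribe; }

        public void Dispose()
        {
            // Run the unsubscribe action at most once.
            unsubscribe?.Invoke();
            unsubscribe = null;
        }
    }
}

// Hypothetical observer that only counts what it is told.
public sealed class CountingObserver<T> : IObserver<T>
{
    public int Count { get; private set; }
    public bool Completed { get; private set; }

    public void OnNext(T value) { Count++; }
    public void OnError(Exception error) { }
    public void OnCompleted() { Completed = true; }
}
```

A typical use would be to call Subscribe with a CountingObserver<int>, publish a few values, and then dispose the returned object; values published after Dispose should no longer reach the observer.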
Again, what now?
Now we have a generic mechanism to get data out of some provider via push. We subscribe to the provider with an implementation of IObserver<T> and will get data when it’s available.
.NET 4 has no implementations of either IObservable<T> or IObserver<T>, but the relatively recently released Reactive Extensions (Rx) library (well, actually, a first service pack has already been released) provides various implementations of both interfaces. Coupled with LINQ support, this is one powerful library.
Hopefully I will discuss Rx in future posts. For now, you can go ahead and download it, look at samples, docs, etc.