Static vs. Instance (vs. Extension)

March 9, 2012

Sometimes I teach a basic .NET & C# course. Among many other things, I discuss arrays. I mention that all .NET arrays derive from System.Array, and so get some functionality for free, such as sorting. Here’s a simple array:

    int[] a = new int[10];

Now, the inexperienced student may type “a.”, opening the IntelliSense list box, and look for a method named Sort; after all, the instructor (me) said arrays support such an operation. The confused student can’t find any such method. The problem, of course, is that sorting is implemented as a static method, and IntelliSense is strict: it does not offer static methods as an alternative when invoked on an instance. The correct call would be:

    Array.Sort(a);

This is somewhat bewildering. Why is it not an instance method, if the method accepts the actual array? This confusion is compounded by the fact that types such as ArrayList and List<> offer an instance Sort method and not a static one. Clearly, design decisions are at play.

System.Array is an extreme example, having many static methods that accept the actual array as the first argument: Sort, Resize, Copy, BinarySearch, Clear, Find, to name a few.
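To make the contrast concrete, here is a small sketch that uses a few of these static methods next to the instance Sort on List<int>. The class name SortDemo and the sample values are only illustrative.

```csharp
using System;
using System.Collections.Generic;

static class SortDemo
{
    // Applies several static System.Array helpers, then returns the array.
    public static int[] Run()
    {
        int[] a = { 5, 3, 9, 1 };

        Array.Sort(a);                         // static: a is now { 1, 3, 5, 9 }
        int index = Array.BinarySearch(a, 5);  // needs a sorted array; index is 2
        Array.Resize(ref a, 6);                // allocates a new array, hence the ref
        Array.Clear(a, 0, 2);                  // zeroes the first two elements

        var list = new List<int> { 5, 3, 9, 1 };
        list.Sort();                           // instance: the list sorts itself

        Console.WriteLine("{0} {1}", index, list[0]);
        return a;                              // { 0, 0, 5, 9, 0, 0 }
    }
}
```

Note that Resize has to take the array by ref because it replaces the array rather than changing it, which is one argument for keeping that particular method static.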

I feel that methods like Sort and Clear should be instance methods, as they clearly make modifications to state. Methods such as Resize and Copy can be made static, as they operate on a “higher level” of sorts. Still, it’s all debatable.

Let’s look at other examples.

DateTime

DateTime is another interesting case. It’s a value type and it’s immutable, meaning its internal state cannot change after creation. This is often a desirable trait (I won’t delve into immutability in this post), especially for value types. The most famous immutables in .NET are String and all delegates.

However, it’s easy to forget the immutability of DateTime. I remember a while back I wanted to do something for a range of dates like so (assuming start is less than end when entering the method):

    static void DoWork(DateTime start, DateTime end) {
        while(start < end) {
            // do actual work…
            start.AddDays(1);   // returns a new DateTime; start itself is unchanged
        }
    }

This of course didn’t work as expected. It’s an infinite loop, but it’s easy to miss. AddDays (and other such methods) are instance methods but return a new object (DateTime wants to be immutable). This is the correct code:

    static void DoWork(DateTime start, DateTime end) {
        while(start < end) {
            // do actual work…
            start = start.AddDays(1);
        }
    }

Maybe AddDays should have been defined as a static method. That would make it more intuitive as to the result. A static method looks more “distant”, as if looking from high above on the actual object:

    start = DateTime.AddDays(start, 1);   // hypothetical; DateTime has no such static method

Wouldn’t that be better?

The basic problem is consistency. Some types do behave that way. Consider BigInteger. It’s a value type and immutable as well. Its arithmetic operations are exposed as static methods (alongside the usual operators), indicating more clearly that a new result is returned. Here’s an example:

    BigInteger n = BigInteger.Pow(2, 128);
    BigInteger m = BigInteger.Divide(n, 6);
    BigInteger q = BigInteger.Add(m, n);

The Complex value type behaves much the same way.
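A short sketch of the same style with System.Numerics.Complex follows. The class name ComplexDemo and the sample values are made up for illustration.

```csharp
using System;
using System.Numerics;

static class ComplexDemo
{
    // Returns the product of two sample values; neither operand is modified.
    public static Complex Run()
    {
        var a = new Complex(1, 2);    // 1 + 2i
        var b = new Complex(3, -1);   // 3 - i

        Complex sum = Complex.Add(a, b);            // 4 + i
        Complex product = Complex.Multiply(a, b);   // (1 + 2i)(3 - i) = 5 + 5i

        Console.WriteLine(sum);
        return product;
    }
}
```

As with BigInteger, the static form makes it hard to assume that a or b was changed by the call.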

What about non-immutables?

A few months ago I implemented a mathematical library that included types such as Vector and Matrix, representing those mathematical entities with arbitrary size. Although I like immutability, I decided against it. The problem is that if a simple calculation is attempted, such as multiplying every matrix element by some constant, creating an entire new matrix for this just to maintain immutability may be too expensive, as the matrix may be arbitrarily large. Ideally, I would like the client to decide whether to modify the matrix or create a new one.

The problem I ran into was this: how can I support both self-changing operators and non-changing operators (meaning they return new objects) in a consistent way? Here’s a basic Matrix class that I would like to have:

    class Matrix {
        double[,] _values;

        public int Rows { get; private set; }
        public int Columns { get; private set; }

        public double this[int row, int col] {
            get { return _values[row, col]; }
            set { _values[row, col] = value; }
        }

        public Matrix(int rows, int cols) {
            Rows = rows; Columns = cols;
            _values = new double[rows, cols];
        }
    }

Now suppose I want to allow adding one matrix to another:

    public static Matrix operator +(Matrix a, Matrix b) {
        // check for compatible matrices omitted
        var result = new Matrix(a.Rows, a.Columns);
        // do actual calculation
        return result;
    }

This clearly returns a new matrix, as is expected from operators. Good thing they are static: this makes it understandable. Sure, I could have added matrix b to a and returned a itself, but that would mean modifying a, which the client would not expect.

How can a client change the actual matrix a? A client could write:

    a = a + b;    // or a += b;

This looks easy enough, but a new matrix is created and a now refers to it. The previous matrix is no longer referenced by a, making it eligible for garbage collection if nothing else refers to it. Clearly, a mutable approach is called for.

The solution I chose was to create two Add methods, one static and one instance. This conveys (at least for me) the correct intent:

    public static Matrix Add(Matrix a, Matrix b) {
        return a + b;
    }

    public Matrix Add(Matrix b) {
        // add b to this
        return this;    // for convenience
    }

Operators are considered “syntactic sugar”, and not every .NET language supports operator overloading, so they cannot be the only way to perform an operation: there should always be an equivalent named method.
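Putting the pieces together, a minimal self-contained sketch of the static/instance pair might look like this. The CheckSameSize helper and the element loops are my own filler for the parts elided above, not necessarily how the original library did it.

```csharp
using System;

class Matrix
{
    readonly double[,] _values;

    public int Rows { get; private set; }
    public int Columns { get; private set; }

    public Matrix(int rows, int cols)
    {
        Rows = rows; Columns = cols;
        _values = new double[rows, cols];
    }

    public double this[int row, int col]
    {
        get { return _values[row, col]; }
        set { _values[row, col] = value; }
    }

    // Hypothetical helper: rejects operands of different dimensions.
    static void CheckSameSize(Matrix a, Matrix b)
    {
        if (a.Rows != b.Rows || a.Columns != b.Columns)
            throw new ArgumentException("Matrices must have the same dimensions.");
    }

    // Non-mutating: allocates and returns a new matrix.
    public static Matrix operator +(Matrix a, Matrix b)
    {
        CheckSameSize(a, b);
        var result = new Matrix(a.Rows, a.Columns);
        for (int r = 0; r < a.Rows; r++)
            for (int c = 0; c < a.Columns; c++)
                result[r, c] = a[r, c] + b[r, c];
        return result;
    }

    // Named equivalent of the operator; also non-mutating.
    public static Matrix Add(Matrix a, Matrix b)
    {
        return a + b;
    }

    // Mutating: adds b into this matrix without allocating.
    public Matrix Add(Matrix b)
    {
        CheckSameSize(this, b);
        for (int r = 0; r < Rows; r++)
            for (int c = 0; c < Columns; c++)
                _values[r, c] += b[r, c];
        return this;    // for convenience, allows chaining
    }
}
```

With this in place, Matrix.Add(a, b) leaves a untouched and returns a fresh matrix, while a.Add(b) changes a in place and returns the same reference.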

More matrix operations

The arithmetic operations were solved as described, as well as other common matrix operations, such as Transpose and Inverse. I ended up creating two separate methods, one static and one instance. The instance one changes the contents of this, while the static one returns a new object.

Another thing I needed is the ability to solve a linear equation system. This is a well-known problem with well-known solutions and is not interesting for this discussion. The question is, where should I put this capability? Should it be part of the Matrix class, as a matrix is the primary component for this problem?

At first, the answer seems to be yes, but on second thought this seems awkward. Maybe in the future a better algorithm could be used? Should the Matrix class be changed because of that? Moreover, there are other algorithms involving matrices. Should every such algorithm be placed in Matrix?

A possible solution may be to derive a new class from Matrix and implement those algorithms there. But this is counter-productive. Why would anyone use the original Matrix, when the enhanced Matrix offers features that may be required later? Inheritance is not intended for such cases.

The solution I used is to utilize an extension method. That means, a separate static method is written, which can be “imported” by referencing the assembly it’s defined in and making the namespace visible. The compiler would allow using the algorithm as though it’s on the Matrix, but in actuality it’s not. It’s a static method after all. This is somewhere between an instance method and a normal static method:

    static class MatrixExtensions {
        public static Vector SolveLinearSystem(this Matrix m, Vector a) {
            Vector result = null;
            // solution omitted for brevity
            return result;
        }
    }

    Matrix m = …;
    Vector v = …;
    Vector solution = m.SolveLinearSystem(v);
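Since the solver itself is omitted, here is a runnable sketch of the same mechanism on a simpler problem. It uses a plain double[,] as a stand-in for Matrix, and the Trace extension and the SquareMatrixExtensions class are hypothetical names chosen for illustration.

```csharp
using System;

static class SquareMatrixExtensions
{
    // Hypothetical extension: sums the main diagonal of a square matrix.
    // The "this" modifier is what lets callers write m.Trace().
    public static double Trace(this double[,] m)
    {
        if (m.GetLength(0) != m.GetLength(1))
            throw new ArgumentException("Matrix must be square.", nameof(m));

        double sum = 0;
        for (int i = 0; i < m.GetLength(0); i++)
            sum += m[i, i];
        return sum;
    }
}
```

For a matrix m holding { { 1, 2 }, { 3, 4 } }, both m.Trace() and SquareMatrixExtensions.Trace(m) return 5; the compiler turns the first form into the second.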

One example of this in .NET is in the Managed Extensibility Framework (MEF). The usual way to get a part instance (MEF terminology for a satisfied object) is by calling the CompositionContainer.GetExportedValue<> instance method. However, sometimes you have an object that was created by other means, but its [Import]s aren’t satisfied. In this case you call ComposeParts on the container; it looks like an instance method, but it is actually an extension method. Someone thought it would be inappropriate for the container to expose that kind of functionality directly, but recognized the need; the result is an extension method (implemented by the AttributedModelServices class, which is filled with extension methods).

Another example of instance vs. static

One obvious example that comes to mind is AppDomain.Unload. This method is static, but accepts the AppDomain reference as its first argument. It would seem that this should have been an instance method: you have an AppDomain reference, and you instruct it to unload. Why is it static? One reason might be symmetry in respect to the creation of an AppDomain which is done via the static AppDomain.CreateDomain method (no way to create an AppDomain using plain new). This raises another question: why is an AppDomain not created with a simple new, but uses a factory instead? We’ll get back to that in a moment.

The other possible reason is to make the unloading of an AppDomain seem “distant”, perhaps accomplished by an unknown entity. It’s also possible for code calling Unload to unload the calling AppDomain.

Object creation

Let’s get back to the AppDomain case and creation. Why is AppDomain created by a factory method? With AppDomain, the reason is that what you get back is actually a proxy to an AppDomain object (AppDomain derives from MarshalByRefObject). It’s not possible to get that using plain new. new always creates an object in the calling AppDomain (no proxy).

The AppDomain case is rare. More commonly, a factory is used instead of a plain new for two reasons:

1. A constructor overload is required, but there is no good way to distinguish between those constructors. Consider DateTime. It has some reasonable constructors that take in things like year, month, day, hour, minute, etc. However, it also has some factory methods that take an argument of type long. Constructors can only be told apart by their parameter types, and the constructor taking a single long already means ticks, so you have DateTime.FromFileTime, DateTime.FromFileTimeUtc, DateTime.FromBinary, all taking a long as argument.

2. There is a need to get back a derived type and not the actual type. When we call “new Something”, we get back exactly a Something. Sometimes this is not desirable, and it is not even possible if that “Something” is abstract, for instance. This is one of the powers of a factory: you can get back a derived type, although most of the time you should not care, because the base type may expose enough functionality. The first example that comes to mind is the WebRequest class. Its constructors are protected. It has a main creation static method, Create. Create receives a URI and returns a derivative of WebRequest (WebRequest is abstract). For example, passing a URI starting with “http://” returns an HttpWebRequest instance. Passing a URI starting with “ftp://” returns an FtpWebRequest; and so on. It’s even possible to register new schemes to get a similar behavior.
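Both reasons can be sketched in a few lines. The first class uses real DateTime members; the Transfer hierarchy is entirely hypothetical and only imitates the shape of the WebRequest pattern described above, not its actual implementation.

```csharp
using System;

// Reason 1: named factories give different meanings to the same parameter type.
static class DateTimeFactories
{
    public static DateTime[] FromSameLong(long raw)
    {
        return new[]
        {
            new DateTime(raw),              // raw read as ticks since 0001-01-01
            DateTime.FromFileTimeUtc(raw),  // raw read as a Windows file time (since 1601-01-01 UTC)
        };
    }
}

// Reason 2: a factory can hand back a derived type chosen at run time.
abstract class Transfer   // hypothetical base type
{
    protected Transfer(Uri uri) { Uri = uri; }

    public Uri Uri { get; private set; }
    public abstract string Describe();

    // The caller asks for "a Transfer" and receives whichever subclass fits the scheme.
    public static Transfer Create(string uriText)
    {
        var uri = new Uri(uriText);
        switch (uri.Scheme)   // Uri.Scheme is reported in lower case
        {
            case "http": return new HttpTransfer(uri);
            case "ftp": return new FtpTransfer(uri);
            default: throw new NotSupportedException("No handler for scheme: " + uri.Scheme);
        }
    }
}

sealed class HttpTransfer : Transfer
{
    internal HttpTransfer(Uri uri) : base(uri) { }
    public override string Describe() { return "HTTP transfer"; }
}

sealed class FtpTransfer : Transfer
{
    internal FtpTransfer(Uri uri) : base(uri) { }
    public override string Describe() { return "FTP transfer"; }
}
```

Passing 0 to FromSameLong yields a date in year 1 from the constructor and a date in year 1601 from the factory, which shows why the second meaning needs a name of its own.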

Conclusion

Selecting a method to be static or instance is easy most of the time, but there are cases where it’s not so obvious. And sometimes there is no right or wrong: it may be a matter of taste, or of conventions employed by the development team. Extension methods can serve as a compromise that is still as easy to use as an instance method, but is implemented elsewhere as a static method.
