ValueType virtual methods and why to avoid them
I've just stumbled upon Marc Brooks' interesting post explaining and comparing value types and reference types.
One of the points Marc makes is that any value type you write (i.e. a C# struct) inherits from System.ValueType an implementation of the Equals and GetHashCode methods, with default behavior suited to value types. That is, behavior that uses the fields of the value type to determine equality or to calculate a hash code.
A critically important point here is that ValueType.Equals and ValueType.GetHashCode can come with an enormous performance penalty. Consider the following code:
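struct MyValueType {
    public int MyInt;
    public float MyFloat;
}
// at a later point:
// (locals of a struct type must be definitely assigned before a method is called on them)
MyValueType v1 = new MyValueType(), v2 = new MyValueType();
v1.Equals(v2);
// Signature of the inherited method: public override bool Equals(object obj)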
When we call the Equals method on the v1 instance, we are in fact dispatching a virtual method on a value type. For reasons beyond the scope of this post (but that are pretty much covered in a separate one), dispatching a virtual method requires the object's type object pointer (which leads to its method table) to find the method. Since an unboxed value type instance doesn't have a type object pointer, it must be "mini-boxed" to obtain that pointer so that the method can be dispatched (per the CLI specification, when the type doesn't override the method, the instance is actually boxed). Either way, this incurs the first performance hit (1).
When we call the Equals method and pass the v2 instance to it, we are passing a value type where an object is expected (note the signature of the Equals method). This means that the parameter is boxed, which incurs the second performance hit (2).
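To make the two hits concrete, here is a rough sketch of what the call amounts to when the struct does not override Equals. The Point type and the BoxingDemo class are made-up names for illustration only, and the explicit conversions merely spell out what the compiler and runtime otherwise do implicitly; the actual IL and JIT output may differ.

```csharp
// Hypothetical value type for illustration; it does NOT override Equals.
struct Point
{
    public int X;
    public int Y;
}

static class BoxingDemo
{
    // What we normally write. It looks cheap, but it resolves to
    // the inherited ValueType.Equals(object).
    public static bool Implicit(Point a, Point b)
    {
        return a.Equals(b);
    }

    // Roughly the same call with the conversions written out.
    public static bool Explicit(Point a, Point b)
    {
        object boxedThis = a; // (1) the receiver needs a type object pointer
        object boxedArg = b;  // (2) the argument must fit an object parameter
        return boxedThis.Equals(boxedArg);
    }
}
```

Both methods return the same result for the same inputs; the second one just makes the two heap allocations visible.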
Finally, and most importantly, the implementation of the ValueType.Equals method that our value type inherits from System.ValueType is the following (from Reflector):
public override bool Equals(object obj) {
    if (obj == null) return false;
    RuntimeType type = (RuntimeType) base.GetType();
    RuntimeType type2 = (RuntimeType) obj.GetType();
    if (type2 != type) return false;
    object a = this;
    if (CanCompareBits(this))
        return FastEqualsCheck(a, obj);
    FieldInfo[] fields = type.GetFields(BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance);
    for (int i = 0; i < fields.Length; i++) {
        object obj3 = ((RtFieldInfo) fields[i]).InternalGetValue(a, false);
        object obj4 = ((RtFieldInfo) fields[i]).InternalGetValue(obj, false);
        if (obj3 == null) {
            if (obj4 != null) return false;
        }
        else if (!obj3.Equals(obj4))
            return false;
    }
    return true;
}
The key point here is this: if the internal call CanCompareBits returns true, the internal call FastEqualsCheck is invoked. Judging by the names (or by attempting to interpret some Microsoft documentation), if the value types can be compared bitwise (e.g. they don't contain any non-primitives), they are compared bitwise using memcmp. This can be partially confirmed by looking at the Rotor (SSCLI) implementation of these methods, bearing in mind that the actual CLR implementation might be totally different.
However, if CanCompareBits returns false, Reflection is used to go over each field, fetch both values as objects, and call Equals on them (which repeats the whole process for any nested value types). This is the main point I wanted to make: if we reach this branch of the code, comparing our objects suddenly becomes very expensive! (I'll leave proving the same for GetHashCode as an exercise for the reader.)
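If you'd like to see the cost for yourself, here is a rough timing sketch. It assumes, as the Rotor sources suggest, that a reference-type field is enough to make CanCompareBits return false; the Named struct and the SlowPathDemo class are hypothetical names introduced only for this illustration, and a Stopwatch loop is a crude measurement rather than a proper benchmark.

```csharp
using System;
using System.Diagnostics;

// Hypothetical struct: the string field is a reference, so a plain bitwise
// comparison would not be valid and the Reflection branch is presumably taken.
struct Named
{
    public int Id;
    public string Name;
}

static class SlowPathDemo
{
    // Field-by-field comparison written by hand, as a baseline.
    public static bool ManualEquals(Named a, Named b)
    {
        return a.Id == b.Id && string.Equals(a.Name, b.Name);
    }

    // Times `iterations` calls to the inherited ValueType.Equals.
    public static TimeSpan TimeInherited(Named a, Named b, int iterations)
    {
        bool sink = false;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink ^= a.Equals(b);
        }
        watch.Stop();
        GC.KeepAlive(sink); // use the result so the loop is not discarded
        return watch.Elapsed;
    }

    // Times `iterations` calls to the hand-written comparison.
    public static TimeSpan TimeManual(Named a, Named b, int iterations)
    {
        bool sink = false;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink ^= ManualEquals(a, b);
        }
        watch.Stop();
        GC.KeepAlive(sink);
        return watch.Elapsed;
    }
}
```

Calling SlowPathDemo.TimeInherited and SlowPathDemo.TimeManual with the same two values and, say, a million iterations should give a feel for the gap. The exact numbers depend on the runtime version and the hardware, so I won't quote any here.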
How can you prevent all of this? You can, and should, override the ValueType.Equals behavior. But you should also implement a separate Equals method that is strongly-typed and non-virtual, to avoid the two boxing costs, (1) and (2), described above. I.e.:
public /* not virtual! */ bool Equals(MyValueType /* not object! */ o);
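Putting both pieces together for the MyValueType struct from earlier, a minimal sketch might look like the following. The way the hash code is combined (multiply by 397, then XOR) is just one common choice and an assumption made for illustration; comparing the float with its own Equals method rather than == is likewise a choice made here so that a NaN value compares equal to itself.

```csharp
struct MyValueType
{
    public int MyInt;
    public float MyFloat;

    // Strongly-typed and non-virtual: the argument is not boxed.
    public bool Equals(MyValueType other)
    {
        return MyInt == other.MyInt && MyFloat.Equals(other.MyFloat);
    }

    // Override so that the inherited ValueType.Equals is never used,
    // even when a caller only has an object reference.
    public override bool Equals(object obj)
    {
        return obj is MyValueType && Equals((MyValueType)obj);
    }

    // Must be consistent with Equals: equal values give equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            return (MyInt * 397) ^ MyFloat.GetHashCode();
        }
    }
}
```

With this in place, v1.Equals(v2) binds to the strongly-typed overload at compile time, while code that goes through object still gets a field-by-field comparison without Reflection.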
To keep this post short, let's summarize the take-aways:
- Never rely on the ValueType implementation of Equals and GetHashCode
- Always override them in your value type to avoid:
  - The chance of Reflection being used for your type
- Provide another implementation of Equals that is strongly-typed and non-virtual to avoid:
  - The boxing incurred on the parameter
  - The "mini-boxing" incurred to dispatch a virtual method call on your value type instance