This week I'll talk about what seems to me like the coolest features of C# 3.0: lambda expressions and expression trees.
Lambda Expressions
We'll start with the part that is easier to understand. Remember anonymous methods from C# 2.0? Useful little bastards they are, only sometimes not so pleasant on the eyes.
public void SortListIgnoreCase()
{
List<string> list = new List<string>{"abc", "ADE", "dol"};
list.Sort(delegate(string s1, string s2)
{
return s1.ToLower().CompareTo(s2.ToLower());
}
);
}
So here we are sorting a list by passing a delegate of type Comparison to the sort method, which compares the lowercase version of the strings. Thing is, the syntax is not the prettiest. Here's how you'll write in C# 3.0:
list.Sort((s1,s2) => s1.ToLower().CompareTo(s2.ToLower()));
The parameter here is called a lambda expression. We are doing the exact same thing as before, only now there are just less words. The compiler is now smart enough to infer the types for the parameters by itself. For anonymous methods that are short as this, I recommend that from now on you will use only this syntax.
How is it Implemented?
Well, as you would expect. Reflector shows:
1 public void SortListIgnoreCase()
2 {
3 List<string> <>g__initLocal1 = new List<string>();
4 <>g__initLocal1.Add("abc");
5 <>g__initLocal1.Add("ADE");
6 <>g__initLocal1.Add("dol");
7 List<string> list = <>g__initLocal1;
8 if (CS$<>9__CachedAnonymousMethodDelegate3 == null)
9 {
10 CS$<>9__CachedAnonymousMethodDelegate3 = delegate (string s1, string s2) {
11 return s1.ToLower().CompareTo(s2.ToLower());
12 };
13 }
14 list.Sort(CS$<>9__CachedAnonymousMethodDelegate3);
15 }
The lines of interest are 8-11. In line 10 we can see how our lambda expression is changed to a standard anonymous method. We can also note that the compiler adds caching to our delegate, so it won't be created twice on the two runs of the same method.
Expression Trees
Now, say we wanted the Sort method to print the comparison algorithm to the console.
Specifically, we want that running something like:
public void SortListIgnoreCase()
{
List<string> list = new List<string>{"abc", "ADE", "dol"};
list.SortWithLogging((s1,s2) => s1.ToLower().CompareTo(s2.ToLower()));
}
Will print this:
Sorting with the following comparison: (s1, s2) => s1.ToLower().CompareTo(s2.ToLower())
Ha? Print a method? Yeah, you heard right, and it's damn easy to do with Expression Trees. Let's look at the sort method we wrote last week, when we talked about extension methods:
public static IEnumerable<T> Sort<T>(this IEnumerable<T> collection)
{
List<T> toSort = new List<T>(collection);
toSort.Sort();
return toSort;
}
We will now change our method to something like this:
public static void SortWithLogging<T>(this IEnumerable<T> collection, Expression<Comparison<T>> comparison)
{
Console.WriteLine("Sorting with the following comparison: " + comparison.ToString());
Comparison<T> actualComparison = comparison.Compile();
List<T> toSort = new List<T>(collection);
toSort.Sort(actualComparison);
}
Now, let's see what we have here. We've added a parameter, Comparison<T> to do the objects comparison when sorting. But wait, the parameter is actually Expression<Comparison<T>>, what the hell? The Expression abstract class and its derivatives can represent, in a strongly type manner, any C# expression. Once we declare a parameter of type Expression<Comparison<T>> and someone passes us a Comparison<T> lambda, the compiler builds an expression for us. We can easily print the expression to the screen, by invoking its ToString method. We can then Compile the expression to recieve the actual comparison delegate and use it.
How is it implemented?
We call this feature expression trees, since every expression is in fact a tree of other Expressions. Let's see how the compiler translates our call to SortWithLogging:
ParameterExpression C1;
ParameterExpression C2;
list.SortWithLogging<string>
(
Expression.Lambda<Comparison<string>>
(
Expression.Call
(
Expression.Call
(
C1 = Expression.Parameter(typeof(string), "s1"),
(MethodInfo) methodof(string.ToLower),
new Expression[0]
),
(MethodInfo) methodof(string.CompareTo),
new Expression[]
{
Expression.Call
(
C2 = Expression.Parameter(typeof(string), "s2"),
(MethodInfo) methodof(string.ToLower), new Expression[0]
)
}
),
new ParameterExpression[] { C1, C2 }
)
);
Yes, WTF?! is supposed to be your reaction. But indeed, the compiler translates s1.ToLower().CompareTo(s2.ToLower()) to this entire tree of expressions, via repeating calls to static methods on the Expression class. I will not start to explain each and every call, but you should understand the concept: The compiler is creating an object representation of a method, including the calls it makes, its use of parameters, everything. The parameters C1 and C2 represent our own two parameters s1 and s2.
One thing that should be said is that the expression building is slow. Very slow. Running Sort 10000 times took 62.5 milliseconds. Running SortWithLogging, even without the Console.WriteLine call, took more than 8 seconds! That's a big difference right there. Also, note that the caching is gone once we pass our lambda to a method that accepts an expression. Apparently, this issue is affecting Linq to SQL in a bad way.
For more cool stuff you can do with expression trees, have a look at using it for strong-typing INotifyPropertyChanged, logging, and mocking. Just beware of that performance hit!