Entity Framework and Lazy Loading
I've received a couple of requests to write some of my previous posts in English, so that the other 99.7% of the world's developer community who might find these subjects interesting can understand what I'm writing.
The following post is a translation to English of this Hebrew post.
When Entity Framework (EF for short) was designed, Microsoft decided that entities would be loaded at run-time, in a JIT-like mechanism. They achieved this with the lazy loading technique: the database is accessed and the entity is loaded only when someone asks for it for the first time.
There is some logic to this design, namely the wish to avoid unnecessary work against the database. Still, it has some disadvantages:
- In order to perform the lazy load, you must explicitly invoke the Load method of the RelatedEnd object (either EntityCollection or EntityReference). This means you must remember to put a load command before accessing the navigation property; otherwise you will get a NullReferenceException or, worse, a bug in your application's logic.
- Every call to Load causes a call to the DB, so during code review you must check that each Load operation is preceded by an "if (!navig.IsLoaded)" condition (IsLoaded is a property of the RelatedEnd object).
- If you write code that iterates over a collection of entities and performs some operation on each of them, and those entities have navigation properties, you will find yourself calling Load on each navigation property of each entity in the collection. Your DB will be accessed a total of EntityCount x NavigablePropertiesCount times, which is not very wise performance-wise.
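The cost described in the last point can be illustrated without a database. The following is only a minimal sketch: FakeRelatedEnd and FakePerson are hypothetical stand-ins invented for this illustration (they are not EF classes), and Load merely increments a counter instead of querying anything. It shows the IsLoaded guard from the second point and counts how many simulated round trips a loop over 100 entities with 2 navigation properties makes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for EF's RelatedEnd: it only counts simulated DB round trips.
class FakeRelatedEnd
{
    public static int RoundTrips;              // total simulated DB calls
    public bool IsLoaded { get; private set; }

    public void Load()
    {
        RoundTrips++;                          // every Load() "hits the database"
        IsLoaded = true;
    }
}

// Hypothetical entity with two navigation properties.
class FakePerson
{
    public readonly FakeRelatedEnd Pets = new FakeRelatedEnd();
    public readonly FakeRelatedEnd Address = new FakeRelatedEnd();
}

static class Program
{
    // The guard from the second point: call Load only if nothing was loaded yet.
    static void EnsureLoaded(FakeRelatedEnd end)
    {
        if (!end.IsLoaded)
            end.Load();
    }

    public static int CountRoundTrips(int personCount)
    {
        FakeRelatedEnd.RoundTrips = 0;
        var people = new List<FakePerson>();
        for (int i = 0; i < personCount; i++)
            people.Add(new FakePerson());

        foreach (var person in people)
        {
            EnsureLoaded(person.Pets);
            EnsureLoaded(person.Address);
            EnsureLoaded(person.Pets);         // guarded, so no extra round trip
        }
        return FakeRelatedEnd.RoundTrips;
    }

    static void Main()
    {
        // 100 entities x 2 navigation properties
        Console.WriteLine(CountRoundTrips(100));
    }
}
```

Even with the guard in place, the number of round trips grows with the number of entities, which is the problem the third point describes.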
The first two points are quite annoying to address, because you have to repeat these two lines every time you want to navigate a property. Transparent Lazy Loading would solve this problem, but it's not part of EF yet (the guys at MS say they are considering it for V2).
As for the third point, EF does have a solution for the iterated load: the Include method, which you can call on your ObjectQuery (for each EntityType you want to query). You pass the method a string holding the name of the navigation property you want to pre-load, and you can call it several times with different navigation properties to pre-load several related entities.
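What goes on in the DB when you use the Include method? The query executed in the DB contains not only the "select" for the main entity but also additional queries that return the navigable ends. For example, given a model in which a Person entity has two navigation properties, Pets and Address (the model diagram is not reproduced here):
And the following DB table structure (diagram not reproduced here; column names as they appear in the generated query below): Person (Id, FirstName, LastName), Pets (Id, Name, Species, PersonId), Address (Id, City, Street, House, PersonId).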
And given a code that looks like this:
// Requires: using System; using System.Linq;
TestModel.TestEntities model = new TestModel.TestEntities();

// Include asks EF to pre-load both navigation properties with the main query.
var all = from a in model.Person.Include("Pets").Include("Address")
          select a;

foreach (var person in all.ToList())
{
    // With Include the related ends should already be loaded,
    // so these guards are expected to skip the extra DB calls.
    if (!person.Pets.IsLoaded)
        person.Pets.Load();
    if (!person.Address.IsLoaded)
        person.Address.Load();

    Console.WriteLine(
        string.Format("{0} {1}\n{2}\n{3}",
            person.FirstName,
            person.LastName,
            String.Join("\n", (from ad in person.Address
                               select string.Format("{0} {1} {2}", ad.City, ad.Street, ad.House)).ToArray()),
            String.Join("\n", (from p in person.Pets
                               select string.Format("{0} {1}", p.Name, p.Species)).ToArray())));
    Console.WriteLine();
}
When the ToList method is invoked, the following query is executed:
SELECT [UnionAll1].[Id] AS [C1],
       [UnionAll1].[FirstName] AS [C2],
       [UnionAll1].[LastName] AS [C3],
       [UnionAll1].[C2] AS [C4],
       [UnionAll1].[C1] AS [C5],
       [UnionAll1].[Id1] AS [C6],
       [UnionAll1].[Name] AS [C7],
       [UnionAll1].[Species] AS [C8],
       [UnionAll1].[PersonId] AS [C9],
       [UnionAll1].[C3] AS [C10],
       [UnionAll1].[C4] AS [C11],
       [UnionAll1].[C5] AS [C12],
       [UnionAll1].[C6] AS [C13],
       [UnionAll1].[C7] AS [C14]
FROM (SELECT
          CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1],
          [Extent1].[Id] AS [Id],
          [Extent1].[FirstName] AS [FirstName],
          [Extent1].[LastName] AS [LastName],
          1 AS [C2],
          [Extent2].[Id] AS [Id1],
          [Extent2].[Name] AS [Name],
          [Extent2].[Species] AS [Species],
          [Extent2].[PersonId] AS [PersonId],
          CAST(NULL AS int) AS [C3],
          CAST(NULL AS varchar(1)) AS [C4],
          CAST(NULL AS varchar(1)) AS [C5],
          CAST(NULL AS varchar(1)) AS [C6],
          CAST(NULL AS int) AS [C7]
      FROM [dbo].[Person] AS [Extent1]
      LEFT OUTER JOIN [dbo].[Pets] AS [Extent2]
          ON [Extent1].[Id] = [Extent2].[PersonId]
      UNION ALL
      SELECT
          2 AS [C1],
          [Extent3].[Id] AS [Id],
          [Extent3].[FirstName] AS [FirstName],
          [Extent3].[LastName] AS [LastName],
          1 AS [C2],
          CAST(NULL AS int) AS [C3],
          CAST(NULL AS varchar(1)) AS [C4],
          CAST(NULL AS varchar(1)) AS [C5],
          CAST(NULL AS int) AS [C6],
          [Extent4].[Id] AS [Id1],
          [Extent4].[City] AS [City],
          [Extent4].[Street] AS [Street],
          [Extent4].[House] AS [House],
          [Extent4].[PersonId] AS [PersonId]
      FROM [dbo].[Person] AS [Extent3]
      INNER JOIN [dbo].[Address] AS [Extent4]
          ON [Extent3].[Id] = [Extent4].[PersonId]) AS [UnionAll1]
Long, isn't it? But for this table structure, the query is quite efficient.
Warning: beware of the number of Includes you use and the number of levels you "walk" into your navigation tree. The more Include calls you make, the bigger your query gets and the less efficient it becomes (have you ever tried to execute a left join on 4 large tables? Not a pretty sight).
Another thing you should take into account: the Include method tries to identify the navigation property and its mapping (in order to build the union part of the query) before executing the query. This means that if your entity is polymorphic (for example, you want to load an EntitySet that holds both Person and Employee), you can only use Include for the navigation properties of the base class (say, Person). So we don't really have the option of a true eager load, that is, an implicit load of the entire entity structure including all navigation properties, whether they are declared on the entity itself or elsewhere in its inheritance hierarchy.
If you do wish to load a polymorphic entity, you'll need to execute several Include queries, each time with a different entity type from the hierarchy tree, and combine the results.
But it doesn't end there. What happens if both the entity and one of its navigated entities are polymorphic (for example, a Person that holds Cars, where Car is itself a polymorphic type)? You see where this is heading...
So, in conclusion: lazy loading is a bit problematic, and we can use Include instead, but it has problems of its own. True eager loading isn't supported yet, and even that wouldn't be so simple to use. You'll just have to find the solution that suits your situation.