Select N+1 Problem – How to Decrease Your ORM Performance

  • submit to reddit

Gil Fink is an expert in ASP.NET and Microsoft data platform and serves as a Senior .Net Consultant and Architect at Sela Group. He is a Microsoft data platform MVP and a certified MCPD Enterprise Application Developer. Gil has worked in the past in variety of positions and projects as a leading developer, team leader, consultant and more. His interests include Entity Framework, Enterprise Library, WCF, LINQ, ADO.NET and many other new technologies from Microsoft. You can find his Blog here: http://www.gilfink.net Gil is a DZone MVB and is not an employee of DZone and has posted 92 posts at DZone. You can read more from them at their website. View Full User Profile

Today one of the developers at my main customer showed me a code snippet he wrote against Entity Framework and made me very pale. The code included the horrible select N+1 problem. This post will introduce the select N+1 pitfall and will explain how avoid it in Entity Framework.

What is Select N+1 Problem?

ORMs can help you to address the impedance mismatch between relational databases and object oriented models and by that make your life simpler. But not knowing about some of their pitfalls can decrease your performance dramatically. One of those pitfalls is the select N+1 problem. This problem is being caused mainly because most of the ORMs out there are enabling lazy loading behavior by default. When we have a parent-children relation the problem can raise its ugly head. The problem is happening when we are executing a single query and then N following queries (N is the number of parent entities) in order to query for something. As you can expect doing N+1 queries instead of a single one will flood your database with queries that we can and should avoid. This is very unacceptable.

Select N+1 Example

To explain the problem more properly lets look at an example. Lets say that we have the following model:
[1]
    

A department can hold 0 or more courses (a typical parent-children relation). Since EF4 enables the lazy loading behavior by default then the following code will raise the select N+1 problem:

using (var context = new SchoolEntities())
{
foreach (var department in context.Departments)
{
foreach (var course in department.Courses)
{
Console.WriteLine("{0}: {1}", department.Name, course.Title);
}
}
}

And the result is:

[2] 
All I wanted to do is to write to the output the titles of the courses and attach to them their parent department name. In the database I got one query to retrieve all the departments and then N queries to retrieve each and every one of the courses for that department. Since in my database there are only 4 departments then I got 5 queries (1 for departments and 4 for all the courses for each department). Now in real world scenario when there are many parents… you can figure the amount of queries you’ll be generating without even knowing you did that.

How to Avoid the Problem in Entity Framework?

One of the main solutions to the select N+1 problem in Entity Framework is to use the Include method.
The Include method is making an eager load for the children that you indicate to it. You give the method a path of all the children you like to load in the query (as long as you have a relation between the entities) and one query will be generated to bring back all the relevant entities. This isn’t a bullet proof solution! There are serious implications that you should understand when you use the Include method. The main implication is that it is doing a join between all the tables that you want to return and the data is retrieved in a flatten manner in order to materialize all the entities from it. Also the materialization process when having a lot of included entities can cause a downgrade of performance. So you will have to weight the balance between using Include or lazy loading. The following code will generate the same results as in the above figure but with only one query:

using (var context = new SchoolEntities())
{
foreach (var department in context.Departments.Include("Courses"))
{
foreach (var course in department.Courses)
{
Console.WriteLine("{0}: {1}", department.Name, course.Title);
}
}
}

and take a look at the generated query:

SELECT   [Project1].[DepartmentID]  AS [DepartmentID],
[Project1].[Name] AS [Name],
[Project1].[Budget] AS [Budget],
[Project1].[StartDate] AS [StartDate],
[Project1].[Administrator] AS [Administrator],
[Project1].[C1] AS [C1],
[Project1].[CourseID] AS [CourseID],
[Project1].[Title] AS [Title],
[Project1].[Days] AS [Days],
[Project1].[Time] AS [Time],
[Project1].[Location] AS [Location],
[Project1].[Credits] AS [Credits],
[Project1].[DepartmentID1] AS [DepartmentID1]
FROM (SELECT [Extent1].[DepartmentID] AS [DepartmentID],
[Extent1].[Name] AS [Name],
[Extent1].[Budget] AS [Budget],
[Extent1].[StartDate] AS [StartDate],
[Extent1].[Administrator] AS [Administrator],
[Extent2].[CourseID] AS [CourseID],
[Extent2].[Title] AS [Title],
[Extent2].[Days] AS [Days],
[Extent2].[Time] AS [Time],
[Extent2].[Location] AS [Location],
[Extent2].[Credits] AS [Credits],
[Extent2].[DepartmentID] AS [DepartmentID1],
CASE
WHEN ([Extent2].[CourseID] IS NULL) THEN CAST(NULL AS int)
ELSE 1
END AS [C1]
FROM [dbo].[Department] AS [Extent1]
LEFT OUTER JOIN [dbo].[Course] AS [Extent2]
ON [Extent1].[DepartmentID] = [Extent2].[DepartmentID]) AS [Project1]
ORDER BY [Project1].[DepartmentID] ASC,
[Project1].[C1] ASC

Summary

There are pitfalls when we are using ORMs and one of them is the select N+1 problem. This isn’t a problem of Entity Framework only. This problem exists in other ORMs like NHibernate, LINQ to SQL and more. You should be aware of those problems when you develop with ORMs and avoid them whenever it is possible. One way to do that is the Include method in Entity Framework but this solution also can generate problems.

References
0
Average: 5 (1 vote)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Frederic Bellier replied on Mon, 2010/09/06 - 2:05pm

It would be best to teach young developers that ORM is not required. It has been tried forever and simply the idea cannot succeed. I don't mean it does not work - I mean it cannot work well. Good computer science curriculum teach this very well and much has been written about this already and I am not going to repeat the reasons here - just google the topic if you want more context. But in a nutshell it comes down to using qn OO language to do what a set language is suppose to do. When I ask developers why they use an ORM tool I always get the same answer: that way we don't have to write SQL. That is the issue - my advice to them: learn SQL. Writing SQL is easy and knowing it is a must. YOU WILL NEVER BEAT THE PERFORMANCE OF A SET LANGUAGE. I enjoy having my DB entities represented as object as any other developer. But I don't need inadequate ORM technology to do that: - Just use simple tools to map your objects and your DB entities (like iBatis). - Reduce the dependencies between your objects. Keep them to a minimum and make sure these relations are not going to hurt your performance. - keep your entity objects in a cache (again with the minimum relationship) having little relationship makes using the objects a little less fluent but it will make sure your app is simple ans still screaming fast. - Use SQL (dynamic and stored procedures) to perform set operations - the speed will be the best possible. - finally use your cache to retrieve the data you need once the set operation has returned. Your welcome.

Hassan Turhal replied on Sun, 2012/01/22 - 12:28pm

One of the items that scottgu recently tweeted about is that in upcoming versions of EF4 that you will be able to load items without doing the fuzzy name matching namely you will be able to write something like this.

northwind.Where(p=>p.UnitsInStock > 9).Include(p=>p.Category)

Using Include when needed can be very handy.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.