ORM Queries: Premature Materialization (Part 3)
Series on ORM Queries
-
ORM Queries: Too Many Includes
-
ORM Queries: Too Many Roundtrips
-
ORM Queries: Premature Materialization (Part 3)
-
ORM Queries: Premature Materialization (Part 2)
-
ORM Queries: Premature Materialization (Part 1)
-
ORM Queries: Problematic Lazy Loading
-
ORM Queries: SELECTing All Columns
-
ORM Queries: Too Much NULL Checking
Recap
If you just showed up, I’d recommend going back to part 1 to see what this is all about, and then check out part 2 about the infamous System.NotSupportedException.
Okay, on to the fun stuff.
Paging Through a List of Results
I’m creating a page that lets me view all the Stack Overflow users with more than 10,000 reputation points. In the 2010 Stack Overflow database, that’s a little over 8,000 users. I’ve been asked to implement paging, as end users only want to see around 20 users at a time. Here’s my code:
int page = 1;
int rowsPerPage = 20;
var users = db.StackOverflowUsers
.Where(u => u.Reputation > 10000)
.OrderByDescending(u => u.Reputation)
.ToList();
var pagedUsers = users
.Skip((page - 1) * rowsPerPage)
.Take(rowsPerPage);
foreach (var user in pagedUsers)
{
Console.WriteLine($"{user.DisplayName} - {user.Reputation}");
}
What’s the Damage
This has similar problems to the query in part 2, just on a smaller scale. The 8,000+ user objects are loaded into RAM on the app server, taking up 16 MB, but only 20 are displayed in the app.
Here’s the generated query and execution plan, as expected:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[AboutMe] AS [AboutMe],
/* all the columns */
[Extent1].[AccountId] AS [AccountId]
FROM [dbo].[StackOverflowUsers] AS [Extent1]
WHERE [Extent1].[Reputation] > 10000
ORDER BY [Extent1].[Reputation] DESC
In addition to scanning the whole clustered index, the query requires a 43.6 MB memory grant to sort the 8,000 rows. What’s worse is that, because of the specific data in this set, only a little over 6 MB of the memory is used - the rest is a waste.
Solving
The solution here is to know that the .Skip
and .Take
LINQ extension methods are supported in EF and NHibernate. Here’s the updated C# code:
var users = db.StackOverflowUsers
.Where(u => u.Reputation > 10000)
.OrderByDescending(u => u.Reputation)
.Skip((page - 1) * rowsPerPage)
.Take(rowsPerPage)
.Select(u => new { u.DisplayName, u.Reputation })
.ToList();
And it produces this lovely T-SQL:
SELECT
[Extent1].[Reputation] AS [Reputation],
[Extent1].[DisplayName] AS [DisplayName]
FROM [dbo].[StackOverflowUsers] AS [Extent1]
WHERE [Extent1].[Reputation] > 10000
ORDER BY row_number() OVER (ORDER BY [Extent1].[Reputation] DESC)
OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY
Now that’s much nicer! Unfortunately, the execution plan isn’t awesome. It gets a much smaller memory grant of 3.8 MB (because of the smaller select list, and the row goal).
But now it has a parallel zone now (it’s just barely over the default Cost Threshold for Parallelism value of 5), and does slightly more reads than the previous query. This could be fixed with an index on Reputation:
CREATE NONCLUSTERED INDEX IX_Reputation_Includes
ON dbo.StackOverflowUsers (Reputation DESC)
INCLUDE (DisplayName);
No sort, no memory grant, no parallelism - just 3 logical reads.
On Older Versions
You might be thinking “I don’t have that new-fangled OFFSET
/ FETCH
stuff on my database server.”
This approach also degrades gracefully if you are on an older version of SQL Server that doesn’t have OFFSET
/ FETCH
. If I point my query at a copy of the database on a 2008 instance, Entity Framework generates the T-SQL instead (without any code changes):
SELECT TOP (20)
[Filter1].[Reputation] AS [Reputation],
[Filter1].[DisplayName] AS [DisplayName]
FROM (
SELECT
[Extent1].[DisplayName] AS [DisplayName],
[Extent1].[Reputation] AS [Reputation],
row_number() OVER (ORDER BY [Extent1].[Reputation] DESC) AS [row_number]
FROM [dbo].[StackOverflowUsers] AS [Extent1]
WHERE [Extent1].[Reputation] > 10000
) AS [Filter1]
WHERE [Filter1].[row_number] > 0
ORDER BY [Filter1].[Reputation] DESC
Not that any of us are using soon-to-be-deprecated software…right? 😜
Not Next Up
Okay, I think that’s it for premature materialization. I hope this helps you to recognize these queries when you see them, and provides some concrete guidance on what to do to improve the situation.
Happy ORMing!