Tuesday, July 7, 2009

Optimizing LINQ Queries using DataLoadOptions


Thanks to everyone who has knowingly / unknowingly helped me to become a Microsoft MVP, hope I can live up to it.

In this section, we will understand the round trip issues of LINQ and how we can overcome the same using ‘DataLoadOptions’. One of the biggest issues with LINQ to SQL is that it fires SQL query for every object which has a huge impact on performance. In this article, we will see how we can get all data in one SQL query.

Catch my videos for WCF, WPF, WWF, LINQ, SilverLight, Design patterns, UML and lot on http://www.questpond.com.

LINQ Basics

This article assumes that you have basic knowledge of how entity objects can be flourished using LINQ. In case you are not aware of basics of LINQ to SQL mapping, you can read my article to understand the basic LINQ concepts.

Customer, Addresses and Phones LINQ Entities

First let’s try to understand how LINQ queries actually work and then we will see how round trips happen. Let’s consider the below database design where we have 3 tables -- customer, addresses and phone. There is one-many relationship between customer and addresses, while there is one-one relationship between address table and phones.

We have created three entities as per the table design:

  • ClsCustomerWithAddresses
  • ClsAddresses
  • ClsPhone

We have defined the relationships between them using ‘EntitySet’ and ‘EntityRef’.

Click to enlarge image

To fill the entity objects with data from table is a 5 step process. As a first step, the datacontext connection is created using the connection string, LINQ query is created and then we start browsing through customer, address and phones.

Click to enlarge image

Analyzing the LINQ SQL Round Trips

Ok, now that we have analyzed that it takes 5 steps to execute a LINQ query, let’s try to figure out on which step the LINQ query actually fires SQL to the database. So what we will do is run the above LINQ code and analyze the same using SQL profiler.

Just so that we do not catch a lot of SQL Server noise, we have only enabled RPC and SQL batch events.

Now when you run the query, you will find the below things:

  • The execution of actual SQL takes place when the foreach statement is iterated on the LINQ objects.
  • The second very stunning thing you will notice is that for every entity, a separate query is fired to SQL Server. For instance, for customer one query is fired and then separate queries for address and phones are fired to flourish the entity object. In other words, there are a lot of round trips.

Click to enlarge image

Avoiding Round Trips using DataLoadOptions

We can instruct LINQ engine to load all the objects using ‘DataLoadOptions’. Below are the steps involved to enable ‘DataLoadOptions’.

The first step is to create the data context class:

See full detail: http://www.codeproject.com/KB/linq/OptimizingLINQ.aspx

No comments: