LINQ in .NET returning duplicate rows from a working SQL view [SOLVED]

How to correct duplicate row data in Entity Framework and LINQ for .NET

By  

Twice in the past couple of months I’ve gotten tripped up by the same data issue when using LINQ to Entities in the .NET framework. I create a simple view in my database which performs some aggregation to return a resultset containing just two columns. Querying the view returns the expected results. Performing a LINQ query on the view returns incorrect and confusing results.

The Problem

For a little background, the view is performing a JOIN on another table, then grouping the results by an ID and summing another column to create and ID,SUM pair of columns. It looks something like the following:

SELECT b.Id, COUNT(DISTINCT a.UserId) AS users
FROM table1 AS a INNER JOIN
table2 AS b ON a.Id = b.Id
GROUP BY b.Id, a.OtherId
HAVING (COUNT(DISTINCT a.UserId) > 1)

Executing this query in SQL returns results such as:

b.ID SUM
2586 4
2586 3
2586 2
2586 3
2586 5
14395 5
14395 5
14395 3
14395 6
14395 5
14395 3
14395 2
14395 2
14395 2
14395 2
14395 3

In LINQ, I make a simple call to the same view:

var results = dataContainer.MyViewName.ToList();

The result set that is returned from that call looks like this:

b.ID SUM
2586 4
2586 4
2586 4
2586 4
2586 4
14395 5
14395 5
14395 5
14395 5
14395 5
14395 5
14395 5
14395 5
14395 5
14395 5
14395 5

As you can see, the correct number of records for each group is returned but the second column value is the same for each row.

The Cause

The reason this is happening is that the Entity Framework has attempted to identify a unique primary key on the view when you added the view object to your .edmx file. Since no primary key is set on a view, it simply took the first column in the table. The problem here should now be obvious, the first column is clearly not unique.

Since EF thinks the first column is unique, it does some internal data building on the result set to build out the entity object by pairing the rest of the data columns to its matching record based on this “primary” key value. The end result is that the same record is duplicated for every data row since the same value is always assigned to the identical primary key.

The Solution

To solve this problem there are two straightforward options.

The first is that you can simply add a column to your view select statement which is unique. You don’t have to use that column in your program but it will allow EF to properly build each object for you. Of course, base on your query and groupings this may not be possible.

The second option is to add a pseudo key to your view. This option should work in basically every case and is the option that we went with. The new column will create an incrementing integer which is basically a pointer to the row number of the results. To do this in MS SQL Server, add the following to the beginning of your column selection:

ROW_NUMBER() OVER( ORDER BY b.Id) as NID

which alters the example view to:

SELECT ROW_NUMBER() OVER( ORDER BY b.Id) as NID,
b.Id, COUNT(DISTINCT a.UserId) AS users
FROM table1 AS a INNER JOIN
table2 AS b ON a.Id = b.Id
GROUP BY b.Id, a.OtherId
HAVING (COUNT(DISTINCT a.UserId) > 1)

Executing this query in SQL returns results such as:

NID b.ID SUM
1 2586 4
2 2586 3
3 2586 2
4 2586 3
5 2586 5
6 14395 5
7 14395 5
8 14395 3
9 14395 6
10 14395 5
11 14395 3
12 14395 2
13 14395 2
14 14395 2
15 14395 2
16 14395 3

Now if you drop the view from your data model and re-add it, EF will pick up the real primary key column that is labeled NID and your LINQ statement will perform as expected.

Read more of Matthew Mombrea's ByteStream blog and follow Matt on Twitter (@mombrea) and Google+. For the latest IT news, analysis and how-tos, follow ITworld on Twitter and Facebook.

Don't miss...

Why you should jailbreak your Apple TV - and how to do it
Why you should jailbreak your Apple TV - and how to do it
Apple iOS app review - frustrating and bad for your health
Apple iOS app review - frustrating and bad for your health
The best home backup plan options
The best home backup plan options
Join us:
Facebook

Twitter

Pinterest

Tumblr

LinkedIn

Google+

Answers - Powered by ITworld

Join us:
Facebook

Twitter

Pinterest

Tumblr

LinkedIn

Google+

Ask a Question
randomness