Wednesday, January 28, 2015

Always define a Unique Grouping

I've written before about providing a unique ordering. The idea being that downstream systems and developers could begin to depend on records always coming back in the same order. Without having something truly unique in your ORDER BY clause (like a primary key), results could be coming back in different orders.

When using GROUP BY, its even more important to ensure that the grouping is always going to be unique.

Let's look at a fairly obvious example using the Products table from the Northwind database.


Consider this statement:

    SELECT   ProductName,
             AVG(unitPrice) AS [AvgUnitPrice]
    FROM     Products
    GROUP BY ProductName
    ORDER BY ProductName


This is perfectly valid syntax. The resulting records will look fine also.

However, there is a chance that two (or five) products will have the same name, and will get placed into the same group. In this case, the results of this statement are not correct.

The safest thing to do, for this query is to add the ProductID to the GROUP BY clause like this:

    SELECT   ProductName,
             AVG(unitPrice) AS [AvgUnitPrice]
    FROM     Products
    GROUP BY ProductName,
             ProductID
    ORDER BY ProductName


This ensures that the grouping is correct, and that the averages will be correct. However, it still has the problem that the ordering is now not necessarily unique. (As addressed here.)

To also provide a unique ordering, you'd add the ProductID to the ORDER BY also:

    SELECT   ProductName,
             AVG(unitPrice) AS [AvgUnitPrice]
    FROM     Products
    GROUP BY ProductName,
             ProductID
    ORDER BY ProductName,
             ProductID





No comments:

Post a Comment

Followers