A short video covering the GROUP BY clause in SQL.
Part 2: covering how NULL values are handled and Counts For beginners.Here is the SQL that goes with the demo.
-- Database by Doug -- Douglas Kline -- 2/4/2016 -- GROUP BY - Part 2 NULL Handling & Count USE Northwind -- see GROUP BY - Part 1 Introduction -- what about NULL values? How are they handled? SELECT SUM(unitprice)/COUNT(ProductID) AS [manual average unitprice], AVG(unitprice) AS [correct average unitprice], SUM(unitprice)/COUNT(ProductID)-AVG(unitprice) AS [difference] FROM products -- normally, you would expect these two averages to agree -- but I've added a product with a NULL unitprice: SELECT * FROM Products WHERE unitprice IS NULL -- so let's look at all the pieces to see what's happening SELECT SUM(unitprice) AS [total of all unitprices], COUNT(ProductID) AS [number of records], SUM(unitprice)/COUNT(ProductID) AS [manual average unitprice], AVG(unitprice) AS [correct average unitprice] FROM products -- the problem here is that there are 79 total records -- but only 78 non-NULL unitprices -- the AVG function handles the NULL appropriately -- and only divides by 78 -- in general, only non-NULL values are included in MIN, MAX, SUM, STDEV, VAR, AVG, etc. -- SO, depend on the functions, rather than writing your own -- they will appropriately handle NULL values SELECT AVG(unitprice), SUM(unitprice), STDEV(unitprice), VAR(unitprice) FROM Products -- COUNT needs more explanation -- note the difference in the two number returned below: SELECT COUNT(ProductID) AS [number of non-NULL productIDs], COUNT(unitprice) AS [number of non-NULL unitPrices] FROM Products -- COUNT requires a field to be specified -- it will count records that have a non-NULL value in that field -- but what about this? SELECT COUNT(*) AS [count of products] FROM Products -- recall that an asterisk in the SELECT clause is shorthand for 'all fields' -- so the above can be interpreted as -- "check all fields for non-NULL values, and count if any field has a non-NULL value" -- of course, if a table has a primary key, then the table can be guaranteed -- to have at least one field that is non-NULL (the primary key field) -- for code clarity, the asterisk should be avoided -- it is much clearer to count the primary key of the entity that you intend to count -- as an example: SELECT COUNT(*) AS [what is being counted here?] FROM Products INNER JOIN Categories ON Products.CategoryID = Categories.CategoryID -- this is much clearer -- we're counting products, because we are counting the primary key -- of the products table SELECT COUNT(Products.ProductID) AS [product count] FROM Products INNER JOIN Categories ON Products.CategoryID = Categories.CategoryID -- but why are their only 77 records, and not 79? -- because only 77 products satisfy the JOIN with Categories: SELECT ProductID, CategoryID FROM Products ORDER BY CategoryID -- Database by Doug -- Douglas Kline -- 2/4/2016 -- GROUP BY - Part 2 NULL Handling & Count
No comments:
Post a Comment