A short video covering the GROUP BY clause in SQL.
Part 2: covering how NULL values are handled and Counts For beginners.Here is the SQL that goes with the demo.
-- Database by Doug
-- Douglas Kline
-- 2/4/2016
-- GROUP BY - Part 2 NULL Handling & Count
USE Northwind
-- see GROUP BY - Part 1 Introduction
-- what about NULL values? How are they handled?
SELECT SUM(unitprice)/COUNT(ProductID) AS [manual average unitprice],
AVG(unitprice) AS [correct average unitprice],
SUM(unitprice)/COUNT(ProductID)-AVG(unitprice) AS [difference]
FROM products
-- normally, you would expect these two averages to agree
-- but I've added a product with a NULL unitprice:
SELECT *
FROM Products
WHERE unitprice IS NULL
-- so let's look at all the pieces to see what's happening
SELECT SUM(unitprice) AS [total of all unitprices],
COUNT(ProductID) AS [number of records],
SUM(unitprice)/COUNT(ProductID) AS [manual average unitprice],
AVG(unitprice) AS [correct average unitprice]
FROM products
-- the problem here is that there are 79 total records
-- but only 78 non-NULL unitprices
-- the AVG function handles the NULL appropriately
-- and only divides by 78
-- in general, only non-NULL values are included in MIN, MAX, SUM, STDEV, VAR, AVG, etc.
-- SO, depend on the functions, rather than writing your own
-- they will appropriately handle NULL values
SELECT AVG(unitprice),
SUM(unitprice),
STDEV(unitprice),
VAR(unitprice)
FROM Products
-- COUNT needs more explanation
-- note the difference in the two number returned below:
SELECT COUNT(ProductID) AS [number of non-NULL productIDs],
COUNT(unitprice) AS [number of non-NULL unitPrices]
FROM Products
-- COUNT requires a field to be specified
-- it will count records that have a non-NULL value in that field
-- but what about this?
SELECT COUNT(*) AS [count of products]
FROM Products
-- recall that an asterisk in the SELECT clause is shorthand for 'all fields'
-- so the above can be interpreted as
-- "check all fields for non-NULL values, and count if any field has a non-NULL value"
-- of course, if a table has a primary key, then the table can be guaranteed
-- to have at least one field that is non-NULL (the primary key field)
-- for code clarity, the asterisk should be avoided
-- it is much clearer to count the primary key of the entity that you intend to count
-- as an example:
SELECT COUNT(*) AS [what is being counted here?]
FROM Products
INNER JOIN Categories ON Products.CategoryID = Categories.CategoryID
-- this is much clearer
-- we're counting products, because we are counting the primary key
-- of the products table
SELECT COUNT(Products.ProductID) AS [product count]
FROM Products
INNER JOIN Categories ON Products.CategoryID = Categories.CategoryID
-- but why are their only 77 records, and not 79?
-- because only 77 products satisfy the JOIN with Categories:
SELECT ProductID,
CategoryID
FROM Products
ORDER BY CategoryID
-- Database by Doug
-- Douglas Kline
-- 2/4/2016
-- GROUP BY - Part 2 NULL Handling & Count
No comments:
Post a Comment