Monday, February 15, 2016

Simple Scalar Subqueries

A short video covering simple scalar subqueries in SQL.


Here is the SQL that goes with the demo.

-- Database by Doug
-- 2/15/2016
-- Intro to Simple Scalar Subqueries

use northwind

-- 
SELECT   ProductID,
         ProductName,
         UnitPrice
FROM     Products

-- suppose we want to show all products 
--  with above-average unit price

-- we can find the average unitprice

SELECT   AVG(unitprice)    AS [average unitprice]
FROM     Products

-- but which products are they?

-- this doesn't work:
SELECT   ProductID,
         AVG(unitprice)   AS [average unitprice]
FROM     Products

-- I'd have to group, which gives this:
SELECT   ProductID,
         AVG(unitprice)   AS [average unitprice]
FROM     Products
GROUP BY ProductID

-- then limit it? that doesn't work?
SELECT   ProductID,
         AVG(unitprice)   AS [average unitprice]
FROM     Products
WHERE    unitprice > AVG(unitprice)
GROUP BY ProductID

-- how about having? darn!
SELECT   ProductID,
         AVG(unitprice)   AS [average unitprice]
FROM     Products
GROUP BY ProductID
HAVING   unitprice > AVG(unitprice)

-- none of those things works
-- so let's look at the average again:
SELECT   AVG(unitprice)    AS [average unitprice]
FROM     Products

-- then use it in a second query
-- value = 28.6501

SELECT   ProductID,
         ProductName,
         UnitPrice
FROM     Products
WHERE    unitprice> 28.6501
ORDER BY unitprice

-- however, it took two steps
-- and in a dynamic environment
-- the average value would have changed
-- in the time between execution of the two SELECT statements

-- what we'd like to do is have these two queries 
-- executed as a single query 

-- well, consider this again
-- notice that it returns a 
-- 1-row, 1-column table

SELECT   AVG(unitprice)    AS [average unitprice]
FROM     Products

-- and convince yourself that it 
-- will *always* return a
-- 1-row, 1-column table

-- even this:
SELECT  AVG(unitprice)     AS [avg of a NULL]
FROM    Products
WHERE   unitprice IS NULL

SELECT  COUNT(productID)     AS [proof of a NULL unitprice]
FROM    Products
WHERE   unitprice IS NULL

-- and this:
SELECT   AVG(unitprice)    AS [average of no records]
FROM     Products
WHERE    1 = 0

-- in SQL, you can 
-- interchange a 1-row 1-col table for a scalar value

-- which is what we need in this query:

SELECT   ProductID,
         ProductName,
         UnitPrice
FROM     Products
WHERE    unitprice> (28.6501) -- scalar value in parens
ORDER BY unitprice


-- so here's the simple scalar subquery
SELECT   ProductID,
         ProductName,
         UnitPrice
FROM     Products
WHERE    unitprice > ( SELECT   AVG(unitprice) 
                       FROM     Products ) 
ORDER BY unitprice

-- notice that this is executed as a single query
-- no delay between calculating the average
-- and producing the record that has the average

-- this is a Simple Scalar Subquery

-- it's a subquery, because it is a fully-formed SELECT statement
-- it's scalar, because it is used in place of a scalar value
-- it's simple, because it can be executed independently of outer query
--  (it's *not* a correlated subquery - topic for another video)

-- a couple of caveats
-- * you can't put an ORDER BY in the subquery
-- * you can't rename columns in the subquery
-- these items don't make much sense int he context of a subquery

-- more examples
-- the idea with these Simple Scalar Subqueries,
-- is that you are SELECTing records that are special relative to the group

-- highest price product
SELECT   ProductID,
         ProductName,
         UnitPrice
FROM     Products
WHERE    unitprice = ( SELECT   MAX(unitprice) 
                       FROM     Products ) 
ORDER BY unitprice

-- lowest price product
SELECT   ProductID,
         ProductName,
         UnitPrice
FROM     Products
WHERE    unitprice = ( SELECT   MIN(unitprice) 
                       FROM     Products ) 
ORDER BY unitprice

-- below average price product
SELECT   ProductID,
         ProductName,
         UnitPrice
FROM     Products
WHERE    unitprice < ( SELECT   AVG(unitprice) 
                       FROM     Products ) 
ORDER BY unitprice

-- products more than two standard deviations 
-- above the average

SELECT   ProductID,
         ProductName,
         UnitPrice
FROM     Products
WHERE    unitprice > ( SELECT   AVG(unitprice)
                              + (2.0 * STDEV(unitprice))
                       FROM     Products ) 
ORDER BY unitprice

-- outlier products
-- products more than two standard deviations
-- away from the mean, above or below

SELECT   ProductID,
         ProductName,
         UnitPrice
FROM     Products
WHERE    unitprice > ( SELECT   AVG(unitprice)
                              + (2.0 * STDEV(unitprice))
                       FROM     Products ) 
    OR   unitprice < ( SELECT   AVG(unitprice)
                              - (2.0 * STDEV(unitprice))
                       FROM     Products ) 
ORDER BY unitprice

-- thanks for watching!

-- Database by Doug
-- 2/15/2016
-- Intro to Simple Scalar Subqueries

No comments:

Post a Comment

Followers