Sunday, January 24, 2016

Using the IN Operator in SQL

A short video covering the IN operator in the WHERE clause within a SQL SELECT statement.

For beginners. Simple WHERE clauses using IN.


Here is the SQL that goes with the demo.

-- Database by Doug
-- Douglas Kline
-- 1/20/2016
-- Using the IN operator

USE northwind

-- refresher

-- here's a simple SELECT statement with a simple WHERE clause

SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    UnitPrice > 10
ORDER BY ProductID

-- recall that the portion in the WHERE clause is a logical expression
-- the logical expression is evaluated for each record
-- records that evaluate (Unitprice > 10) as True (1) are included
-- records that evaluate (Unitprice > 10) as False (0) are not included

-- here's a simple IN example

SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    CategoryID     IN (2, 4)
ORDER BY ProductID

-- all records that have a CategoryID of either 2 or 4 are included in the results
-- so the above could be equivalently rephrased as

SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    CategoryID = 2
   OR    CategoryID = 4
ORDER BY ProductID

-- but this can be inconvenient and verbose if there are lots of desired values, like this:

SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    CategoryID = 2
   OR    CategoryID = 4
   OR    CategoryID = 5
   OR    CategoryID = 8
   OR    CategoryID = 67
   OR    CategoryID = 101
ORDER BY ProductID

-- which would be more succinctly phrased as

SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    CategoryID     IN (2, 4, 5, 8, 67, 101)
ORDER BY ProductID

-- here's another example that can lead to bugs:
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    CategoryID = 2
   OR    CategoryID = 4
   OR    CategoryID = 5
   OR    CategoryID = 8
   OR    CategoryID = 67
   OR    CategoryID = 101
   AND   UnitPrice > 10
ORDER BY ProductID

-- note that the record with ProductID 13, 'Konbu' has a unitprice < 6
-- since the ORs and ANDs are evaluated left-to-right, the Unitprice > 10 has no effect

-- so parentheses are needed to control the order of operations, like this:
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    (
         CategoryID = 2
   OR    CategoryID = 4
   OR    CategoryID = 5
   OR    CategoryID = 8
   OR    CategoryID = 67
   OR    CategoryID = 101
         )
   AND   UnitPrice > 10
ORDER BY ProductID

-- this is MUCH clearer, and less conducive to bugs, phrased with IN
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    CategoryID IN (2, 4, 5, 8, 67, 101)
   AND   UnitPrice   > 10
ORDER BY ProductID

-- here's another way that IN is very convenient
-- back to the long example of ORs
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    CategoryID = 2
   OR    CategoryID = 4
   OR    CategoryID = 5
   OR    CategoryID = 8
   OR    CategoryID = 67
   OR    CategoryID = 101
ORDER BY ProductID

-- what if we want CategoryIDs NOT equal to 2, 4, 5, etc
-- this doesn't work:
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    NOT
         CategoryID = 2
   OR    CategoryID = 4
   OR    CategoryID = 5
   OR    CategoryID = 8
   OR    CategoryID = 67
   OR    CategoryID = 101
ORDER BY ProductID

-- note the categoryID = 8 for productID=10
-- so we would need parentheses like this:
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    NOT
         (
         CategoryID = 2
   OR    CategoryID = 4
   OR    CategoryID = 5
   OR    CategoryID = 8
   OR    CategoryID = 67
   OR    CategoryID = 101
         )
ORDER BY ProductID

-- or more verbosely, but without parentheses
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    NOT CategoryID = 2
   AND   NOT CategoryID = 4
   AND   NOT CategoryID = 5
   AND   NOT CategoryID = 8
   AND   NOT CategoryID = 67
   AND   NOT CategoryID = 101        
ORDER BY ProductID

-- so, again, this is much clearer and less prone to bugs phrase with IN:
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    CategoryID NOT IN (2, 4, 5, 8, 67, 101)
ORDER BY ProductID

-- or equivalently
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    NOT CategoryID IN (2, 4, 5, 8, 67, 101)
ORDER BY ProductID

-- technically, 'IN' is comparison operator, like '=' or '<' 
-- a phrase like '7 > 8' compares the value on the left with the value on the right
-- and returns a true or false

-- a phrase like '7 IN (4, 5, 8) compares the value on the left with 
--  the list of values on the right, and returns a true or false

-- IN can be used with other data types:

SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    ProductName IN ('Chai', 'Tofu', 'Alice Mutton')
ORDER BY ProductID

-- for character based items, the above works, 
-- but you can't combine LIKE and IN, by including LIKE wildcards:
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    ProductName IN ('Chai%', 'Tofu', 'Alice Mutton')
ORDER BY ProductID

-- the above doesn't work - % is interpreted as a regular character, not a wildcard for matching

-- here's one using dates

SELECT   *
FROM     Employees
WHERE    BirthDate IN ('19481208', '19520219')

-- another very cool feature of the IN operator, 
-- is the ability to dynamically create the list of values
-- with a SELECT statement

SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    ProductName IN (  SELECT ProductName 
                           FROM   Products 
                           WHERE  unitprice > 10)
ORDER BY ProductID

-- of course, this is a very simple example,
-- and it would be much more clearer to phrase it like this:
SELECT   ProductID,
         CategoryID,
         SupplierID,
         ProductName, 
         UnitPrice
FROM     Products
WHERE    unitprice > 10
ORDER BY ProductID

-- but the idea is very powerful
-- and allows for greater flexibility for phrasing your SQL
-- in some cases, a sub query with IN is the clearest way to state what you intend

-- in summary
-- IN is a logical operator - it returns True/False (1/0)
-- IN is sometimes used to replace a series of logical expressions connected with ORs
-- IN operates on a reference value and a list of values
-- IN returns True(1) if the reference value is *in* the list of values
-- IN - the list of values can be the result of a SELECT statement

-- Database by Doug
-- Douglas Kline
-- 1/20/2016
-- Using the IN operator


No comments:

Post a Comment

Followers