-- Database by Doug
-- Douglas Kline
-- 10/19/2018
-- ANY and ALL
-- comparing a value to a column of values
-- (aka acomparing a scalar to a vector of scalars)
-- consider this setup for testing
-- if the WHERE condition is true or false
SELECT 1 AS [isTrue?]
WHERE 10 = 10 -- testing if this is true or false
SELECT 1 AS [isTrue?]
WHERE 10 <> 10 -- testing if this is true or false
-- we get a record if it is true, otherwise no record
-- also, see how I can create a literal table
SELECT tempField
FROM (VALUES(11),(12),(7)) tempTable(tempField)
-- note that this creates a single column of values
-- which could be used in something like IN
-- for example
SELECT 1
WHERE 12 IN ( SELECT tempField
FROM (VALUES(11),(12),(7)) tempTable(tempField))
-- I could rephrase this as:
SELECT 1
WHERE 12 = ANY ( SELECT tempField
FROM (VALUES(11),(12),(7)) tempTable(tempField))
-- back to the first example:
SELECT 1 AS [isTrue?]
WHERE 10 > 11
-- I'm comparing a single scalar value, 10, with another
-- single scalar value, 11
-- now consider if I want to compare the value 10
-- to multiple other values
SELECT 1
WHERE 10 > 11
OR 10 > 12
OR 10 > 7
-- this is three logical expressions
-- combined with OR into the whole logical expression
-- so if ANY of them are true
-- the whole logical expression is true
-- now let's say that the values 11, 12, 7 are in a column
-- since we're in an RDBMS, they are *likely* to be in a column
-- I can rephrase this with ANY like this:
SELECT 1
WHERE 10 > ANY (SELECT tempfield
FROM (VALUES
(11),
(12),
(7)
) AS tempTable(tempfield))
-- this is exactly equivalent to:
SELECT 1
WHERE 10 > 11
OR 10 > 12
OR 10 > 7
-- back to the first example again...
-- back to the first example:
SELECT 1 AS [isTrue?]
WHERE 10 > 11
-- let's say I want to do multiple comparisons again
-- but AND them together like this:
SELECT 1
WHERE 10 > 11 -- logical expression 1
AND 10 > 12 -- logical expression 2
AND 10 > 7 -- logical expression 3
-- this is three logical expressions
-- combined with ANDs into the whole WHERE
-- logical expression
-- so all three expressions must be true
-- for the WHERE clause to be true
-- I can make it true with different values:
SELECT 1
WHERE 10 > 9 -- logical expression 1
AND 10 > 8 -- logical expression 2
AND 10 > 7 -- logical expression 3
-- I can rephrase this with ALL
-- with the values in a column
SELECT 1
WHERE 10 > ALL (SELECT tempfield
FROM (VALUES
(11),
(12),
(7)
) AS tempTable(tempfield))
-- and get it to be true with a change to the values
SELECT 1
WHERE 10 > ALL (SELECT tempfield
FROM (VALUES
(9),
(8),
(7)
) AS tempTable(tempfield))
-- so you can think of ANY and ALL
-- as comparison operator modifiers
-- comparison operators usually take a scalar value on each side
-- scalar scalar
-- 3 < 7
-- 3 > 7
-- 3 = 7
-- 3 != 7
-- 3 <> 7
-- 3 <= 7
-- 3 >= 7
-- ANY and ALL take a scalar on the left, and a column on the right
-- scalar ANY column
-- scalar ALL column
-- 3 < ANY ((1),(2),(3))
-- 3 >= ALL ((1),(2),(3))
-- and the column is generally created with a SELECT statement
-- scalar ANY (SELECT FROM...)
-- scalar ALL (SELECT FROM...)
-- so here's a more concrete example using the Northwind database Orders table
-- suppose we want to know a list of customers who paid more than
-- $200 on freight on an order in 1996?
-- in other words $200 < ANY(orders in 1996)
SELECT CompanyName
FROM Customers
WHERE $200 < ANY (SELECT freight
FROM Orders
WHERE Orders.CustomerID = Customers.CustomerID
AND YEAR(Orderdate) = 1996)
ORDER BY CompanyName
-- notice that this is a correlated subquery
-- it refers to the outer query (Customers.customerID)
-- also notice that this could be rephrased
-- with a JOIN DISTINCT
SELECT DISTINCT Customers.CompanyName
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
WHERE Orders.Freight > $200
AND YEAR(Orders.OrderDate) = 1996
ORDER BY CompanyName
-- is there a difference? why one and not the other
-- it depends
-- depending on your situation, one SQL phrasing might
-- be clearer than another
-- depending on your data model, volume, statistics, indexes, etc.
-- one might be faster than the other
-- in other words, you might get different query plans
-- in general, I recommend writing your SQL in the clearest
-- manner possible
-- then carefully rephrasing to a better performing, but
-- perhaps less understandable, form if there is a performance issue
-- in summary,
-- ANY and ALL modify comparison operators
-- they succinctly compare a single scalar to
-- a column of scalars
-- thanks for watching!
-- Database by Doug
-- Douglas Kline
-- 10/19/2018
-- ANY and ALL
Saturday, October 20, 2018
Using ANY and ALL in SQL
Monday, October 15, 2018
Using the CAST() function in SQL
-- Database by Doug
-- Douglas Kline
-- 10/10/2018
-- CAST - converting to a new data type
-- sometimes you need to change one data type to another data type
-- consider this:
SELECT '4.0'
-- I might want to deal with this as a number
-- for example
SELECT '4.0' + 2.0
-- this works
-- even though they '4.0' is a varchar
-- and 2.0 is a floating point number
-- the db does an *implicit* conversion
-- of the '4.0' to a floating point
-- then does the addition
-- and returns a floating point
-- we hope the db "knows what I mean"
-- and are *assuming* it ends up doing the right thing
-- for simple things, this mostly works
-- the db is pretty smart
-- however, high quality code doesn't normally
-- make assumptions, so let's be *explicit*
SELECT CAST('4.0' AS float) + 2.0
-- here, I'm using the CAST function to
-- *explicitly* change a varchar to a float
-- I'm not relying on the db "knowing what I mean"
-- you can generally CAST between data types
-- fairly freely
-- see the full matrix of allowable conversions
-- for SQL Server 17
-- here: https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-conversion-database-engine?view=sql-server-2017
-- note that some to/from conversions are implicit / automatic
-- some are not allowed at all
-- and some require explicit CASTs
-- here are a few common conversions you might want to do
-- converting numeric and dates to varchars
-- especially when needing to concatenat
SELECT 'Doug' + 1 AS [trying for Doug1]
SELECT 1 + 'Doug' AS [trying for 1Doug]
-- and the fixes
SELECT 'Doug' + CAST(1 AS VARCHAR)
SELECT CAST(1 AS VARCHAR) + 'Doug'
SELECT GETDATE() + 'Doug' -- error
SELECT CAST(GETDATE() AS VARCHAR) + 'Doug'
-- forcing specific types of operations
-- consider this
SELECT 3 / 2
-- notice that I get *integer* division
-- because both operands are integers
-- but what if I want to see 1.5 as the result?
-- I can fix literals easy enough
SELECT 3.0 / 2.0
-- but what about this:
SELECT UnitsInStock / unitsonorder
FROM Products
WHERE unitsOnorder <> 0
-- I'm getting integer division
-- here's how to get floating point division:
SELECT CAST(UnitsInStock AS FLOAT) / CAST(unitsonorder AS FLOAT)
FROM Products
WHERE unitsOnorder <> 0
-- another common conversion is from varchar to date/time
-- see the differences here:
SELECT '4/8/2018'
SELECT CAST('4/8/2018' AS DATE)
SELECT CAST('4/8/2018' AS DATETIME2)
-- and here:
SELECT '20180408 11:00'
SELECT CAST('20180408 11:00' AS DATE)
SELECT CAST('20180408 11:00' AS DATETIME2)
-- summary
-- for 'throw-away' code, the db will implicitly convert
-- for high quality code in a system
-- you should not rely on implicit conversions
-- instead, use CAST
-- Database by Doug
-- Douglas Kline
-- 10/10/2018
-- CAST - converting to a new data type
Wednesday, October 10, 2018
Using DISTINCT in SQL
-- Database by Doug
-- Douglas Kline
-- 10/9/2018
-- DISTINCT - removing duplicates, but in a "dumb" way
-- consider this
SELECT ProductID,
ProductName,
SupplierID
FROM Products
-- this is a list of all products
-- note the repeats in the SupplierID
-- suppose I want top know the list of suppliers
-- in the Products table
SELECT SupplierID
FROM Products
-- again, notice the repeats
-- to remove the repeats, I can do this:
SELECT DISTINCT SupplierID
FROM Products
-- you might be saying, why not just do this:
SELECT SupplierID
FROM Suppliers
-- my response is: that's a different list
-- it's the list of all suppliers in the supplier table
-- What I'm looking for is the list of all suppliers
-- in the products table
-- in other words, all suppliers which we actually *use*
-- Let's look at another example, with a table I've created
SELECT firstName
FROM Person
ORDER BY firstName
-- this is a list of all people's first names
-- as we scroll down, we'll start to see repeats
-- in other words, multiple people have the same first name
-- now let's say we want a list of all first distinct firstnames
SELECT DISTINCT firstName
FROM Person
ORDER BY firstName
-- so now there isn't a record returned for every Person record
-- there's a record returned for every unique firstname
-- also note that NULL is considered to be a unique firstname
-- you might say, why not use Group By to do this?
-- like this:
SELECT firstName
FROM Person
GROUP BY firstName
ORDER BY firstName
-- logically, it returns the exact same records, and always will
-- however, GROUP BY does a lot more work
-- it actually sets up groups of records in preparation to
-- calculate aggregates like SUM, COUNT, AVG, etc.
-- DISTINCT is much faster
-- if it sees a value it has seen before, it just throws it out
-- in other words, it doesn't group the records
-- it just makes a list of unique values
-- so, don't use GROUP BY when what you really need is DISTINCT
-- alright, what if you want to count stuff?
SELECT COUNT(ID) AS [# of people],
COUNT(DISTINCT ID) AS [# of distinct primary key values],
COUNT(firstName) AS [# of people with non-NULL firstnames],
COUNT(DISTINCT firstName) AS [# of distinct firstNames]
FROM Person
-- note that the first two values are always the same,
-- since primary key values are distinct aka unique
-- and also notice that there are 599 distinct first names, but recall
SELECT DISTINCT firstName
FROM Person
ORDER BY firstName
-- so why is the COUNT(DISTINCT firstname) = 599
-- but DISTINCT firstname gives 600 records?
-- remember that COUNT counts non-NULL values
-- finally, DISTINCT is sort of "dumb", in that it doesn't
-- know anything about primary keys
-- or anything about the underlying table(s)
-- it only considers values from the fields you provide
-- consider this:
SELECT DISTINCT firstname
FROM Person
-- it doesn't give distinct Person records, just distinct firstnames
-- now this:
SELECT DISTINCT lastname
FROM Person
-- and the distinct applies to the *combination*
-- of all the fields in the SELECT clause
-- in this example,
-- all distinct *combinations* of gender and firstname are shown
SELECT DISTINCT gender,
firstname
FROM Person
ORDER BY gender,
firstname
-- in summary,
-- distinct removes duplicates
-- it removes duplicates based on all fields in the SELECT list
-- when used with COUNT, it will not count duplicate values
-- thanks for watching!
-- Database by Doug
-- Douglas Kline
-- 10/9/2018
-- DISTINCT - removing duplicates, but in a "dumb" way
Thursday, October 4, 2018
Numeric Expressions in SQL
-- Database by Doug
-- Douglas Kline
-- 8/28/2018
-- Numeric Expressions - expressions that return a numeric data type
-- beginner level
-- simple numeric expression examples
-- and things to keep in mind
-- I might say 'db'
-- which is shorthand for 'database engine'
-- this is the server software that interprets
-- your SQL and performs actions
-- you can ask your database engine (db) to be your calculator:
SELECT 1 + 2
-- if you are familar with spreadsheets
-- and how to type formulas
-- it is very similar
SELECT 1 + 2
-- an expression is something that the db
-- will read, translate, and perform
-- so in the expression above
-- it reads a 1, then a plus sign, then a 2
-- interprets it as addition
-- performs the addition
-- and returns the results
--------------------------------------------------
SELECT 1 + 2
-- there are certain items in the expression
-- the plus sign (+) is an *operator*
-- more specifically, it is a *binary* operator
-- it defines an operation on two items (thus binary)
-- more specifically, the items on its left and right
-- the plus sign says to the db:
-- take the items on my left and right and add them
-- it is an instruction to the db
SELECT 1 + 2
-- the 1 and 2 are the *operands* of the operator
-- they are the items that the operator applies to
-- the 1 and 2 are also examples of *literals*
-- in other words 1 literally means the number 1
-- in coding, it means: not code, a value
-- here is an example of a unary operator (one operand)
SELECT -5
-- here, the literal value 5
-- is modified to become negative
-- by the minus sign operator
------------------------------------------------------------
-- how does the db know what *is* an expression, and what *isn't*
-- in this example
SELECT 1 + 2
-- the db is *expecting* an expression to be in this position
-- (after the word SELECT)
-- so expressions are mainly known by their
-- location/position in your SELECT statement
---------------------------------------------------------
-- notice that usual order of operations is followed
-- * and / come before + and -
-- otherwise it is left-to-right,
-- unless you add parentheses
-- then sub-expressions in parentheses are evaluated first
SELECT 1 + 2 * 6
SELECT (1 + 2) * 6
SELECT (3 - 1) / 2
------------------------------------------------------
-- notice that operators sometimes change meaning
-- depending on their context
SELECT 1 / 3 -- does integer division
SELECT 1.0 / 3.0 -- floating point division
SELECT 1 / 3.0 -- floating point division
SELECT 1.0 / 3 -- floating point division
-- if both operands (items on either side of the operator)
-- are whole numbers, do integer division
-- otherwise do floating point division
-----------------------------------------------------------------------------
-- here's an operator that might be new to you: %
SELECT 5 % 3 -- modulus operator - gives the remainder after integer division
SELECT 8 % 3 -- same answer
SELECT (8 / 3) * 3 + 8 % 3 -- equals 8
-----------------------------------------------------------------------------
-- what if your data is not a literal?
-- in other words, it comes out of the db
SELECT unitsInstock -- this is an expression, just very simple
FROM Products
SELECT unitsInStock + unitsOnOrder -- effective inventory
FROM Products
-- plus is still the operator
-- the operands are UnitsInStock and unitsOnOrder
-- the expression is evaluated on each record, separately
-----------------------------------------------------
-- but what if the data type is not right for
-- the operations I want to do?
-- specifically, the data is whole numbers
-- but I want to do floating point division
SELECT unitsInStock,
reOrderLevel,
unitsinstock / reOrderLevel
FROM Products
WHERE reOrderLevel <> 0 -- to avoid division by zero
-- but I'd really like to see the results as a floating point/decimal
-- you have to convert *prior* to dividing
SELECT unitsInStock,
reOrderLevel,
CONVERT(real,unitsinstock) / reOrderLevel
FROM Products
WHERE reOrderLevel <> 0 -- to avoid division by zero
-- the unitsInStock is converted to a real (aka floating point number)
-- before it gets divided
-- since one of the items operands
-- is a floating point number, it does floating point division
-------------------------------------------------------------
-- perhaps a more realistic and useful expression
SELECT ProductID,
ProductName,
unitsInStock * unitprice AS [Dollar Value in Inventory]
FROM Products
ORDER BY ProductID
-- note that unitprice is a money data type,
-- and the unitsInstock is a whole number
-- so the result is shown as a money data type
-- (note the 2 digits to the right of the decimal)
-- we can show this explicitly:
SELECT SQL_VARIANT_PROPERTY(unitPrice, 'BaseType'),
SQL_VARIANT_PROPERTY(unitsInStock, 'BaseType'),
SQL_VARIANT_PROPERTY(unitsInStock * unitPrice, 'BaseType')
FROM Products
-----------------------------------------------------------------------
-- in summary
-- expressions are instructions for the db to create new values
-- expressions are known by their location in the statement
-- expressions have operators and operands
-- Database by Doug
-- Douglas Kline
-- 8/28/2018
-- Numeric Expressions
Subscribe to:
Posts (Atom)