Thursday, October 4, 2018

String Expressions in SQL

-- Database by Doug
-- Douglas Kline
-- 8/28/2018
-- Character Expressions - expressions that return character data

-- beginner level
-- might want to check out numeric expressions
------------------------------------------------------------------
-- consider this statement

SELECT   'Fred',
         'Flintstone'

-- note that the single quotes are not shown in
-- the table results
-- this is because that are not *part of* the data
-- they are used to mark the beginning and end of the data
-- and are called *delimiters*

-- so the statement has two pieces of data
-- delimited by single quotes

SELECT   'Fred',
         'Flintstone'

-- the *type* of data is *character* data
-- more specifically, a *sequence of characters*

-- in programming, they would be called *strings*

-- another term might be *string literals*

-- now look at this:
SELECT      123   AS [numeric data type],
            '123' AS [character data type]

-- the first column is a number
-- the second column is a sequence of characters
--    first the character '1', then the character '2', ...

-- now consider this

SELECT       1   + 5  AS [numeric],
            '1' + '5' AS [character]
             
-- see how the plus sign means something different
-- depending on what's on either side?

-- with character data, the plus sign means *concatenate*
-- or 'put together'

-- so the basic operator for character expressions
-- is the plus sign

-- we can prove this:
SELECT      SQL_VARIANT_PROPERTY(1 + 5, 'BaseType'),
            SQL_VARIANT_PROPERTY('1' + '5', 'BaseType')

-- here'a another example:
SELECT   'Fred' + ' ' + 'Flintstone' AS [Full Name]

-------------------------------------------------------------
-- there's lots more we want to do with
-- character data
-- but we need functions....
-- here are a few simple ones:

SELECT LEN  (' Fred'),     -- find how many chars - note the spaces
       UPPER(' Fred'),     -- convert all chars to upper case
       LOWER(' Fred'),     -- convert all chars to lower case
       LEFT (' Fred', 2),  -- return just the left-most 2 chars
       RIGHT(' Fred', 2),  -- return just the right-most 2 chars
       LTRIM(' Fred')      -- get rid of any space chars on left

-- you might also want to find the location of a certain character

SELECT   CHARINDEX(' ', 'Fred Flintstone') AS [location of space character] 

-- and also, you can combine these functions into a more complex expression

SELECT   'Fred Flintstone'       AS [original full name],
         RIGHT('Fred Flintstone',                    -- right-most 10 characters
                 LEN('Fred Flintstone')              -- the number 15
                 - CHARINDEX(' ', 'Fred Flintstone') -- the number 5
              )                  AS [last name only]

---------------------------------------------------------------------------------
-- what about data from a table - not literals

SELECT   ProductName,
         LEN(ProductName) AS [length of name],
         UPPER(ProductName) AS [uppercase],
         CHARINDEX(' ', ProductName),
         RIGHT(ProductName, LEN(ProductName) - CHARINDEX(' ', ProductName))
FROM     Products

---------------------------------------------------------------------------------
-- in summary
-- string literals are delimited by single quotes
-- plus sign means concatenate
-- functions help with other tasks

-- Database by Doug
-- Douglas Kline
-- 8/28/2018
-- Character Expressions


Primary Keys in Relational Databases

-- Database by Doug
-- Douglas Kline
-- 9/28/2018
-- Primary Keys

USE Northwind
-- note that I'm starting out with a 
-- modified version of the Northwind.Categories table
-- called DBBDCategories
-- for demo purposes

-- Primary Keys
-- a primary key has two purposes: 
---- to guarantee access to every record in a table
---- to uniquely distinguish a record from all other records in the table

-- this is done by requiring a primary key to be:
---- NOT NULL
---- UNIQUE across all records in that table

-- the NOT NULL makes sure that *at least one* column is NOT NULL
-- and that there is a WHERE clause that will match on every record
-- for example:
--  WHERE categoryID = 7

-- guarantees one of:
-- categories with an categoryID of 7
-- there is no record with an *unknown* ID that *might be* 7

-- the UNIQUE makes sure that you can make a request of records
-- that guarantees at most one record
-- for example:
--  WHERE categoryID = 7

-- guarantees one of:
-- a *single* record with the category with an categoryID of 7
-- no category - there is definitively no record with an categoryID of 7

--***************************************************
-- tables don't absolutely require a primary key
-- however, guaranteed, unique access is extremely helpful in many situations
-- (particularly Online Transaction Processing Systems /OLTPs)
-- AND, there can be significant performance benefits to having a PK 
-- (for certain operations)

-- Consider my somewhat modified DBBDCategories table from Northwind
SELECT *
FROM    DBBDCategories

-- I happen to know that CategoryID is the primary key of this table

-- but how would you know otherwise?

-- you could look in the object inspector on the left
-- expand the database, the tables, the Categories table, the Columns
-- note the key icon on the left of the CategoryID, and the 'PK' in parentheses
-- (demo)

-- another way would be to put the table into a diagram
-- note the yellow key icon next to the CategoryID
-- (demo)

-- Finally, a really definitive way would be to script out the table
-- and see the full definition of the table in SQL
-- (demo)

-- here is a slightly abbreviated/edited version of 
-- the SQL CREATE statment for the DBBDCategories table:

--CREATE TABLE [dbo].[DBBDCategories]
--(
-- [CategoryID] [int] NOT NULL,
-- [CategoryName] [nvarchar](15) NOT NULL,
-- [Description] [ntext] NULL,
-- [Picture] [image] NULL,
-- CONSTRAINT [PK_DBBDCategories] PRIMARY KEY CLUSTERED 
--   (
--    [CategoryID] ASC
--   )
--) 

-- note the NOT NULL for CategoryID 
-- this means that every insert and update will be checked to make sure 
-- that the CategoryID does not become NULL
-- let's try it
-- demo insert new NULL PK
-- demo update PK to a NULL

-- note the PRIMARY KEY specified as a CONSTRAINT
-- this means that every insert and update will be checked to make sure
-- that no two CategoryIDs, in the WHOLE TABLE, get duplicated
-- let's try it
-- demo insert duplicate PK
-- demo update PK to a duplicate value


-- let's review
-- we are guaranteed that no record in the categories table has an unknown CategoryID
-- we are guaranteed that no two records in the categories table have the same CategoryID

-- in other words
-- WHERE CategoryID = 8
-- guarantees to return either no record, or one record
-- and we are guaranteed that there is no record
-- with an *unknown* CategoryID which is actually 8

-- so what does a categoryID of 8 actually mean?
--....?
--....?





-- NOTHING!!

-- primary keys don't *need* to mean anything
-- they just have to be UNIQUE and NOT NULL

-- now let's look at the definition of 
-- the *actual* Categories table from the Northwind DB

--CREATE TABLE [dbo].[Categories]
--(
-- [CategoryID] [int] IDENTITY(1,1) NOT NULL,  -- ** note IDENTITY(1,1)
-- [CategoryName] [nvarchar](15) NOT NULL,
-- [Description] [ntext] NULL,
-- [Picture] [image] NULL,
-- CONSTRAINT [PK_Categories] PRIMARY KEY CLUSTERED -- ** note PRIMARY KEY
-- (
--  [CategoryID] ASC
-- )
--) ON [PRIMARY] 

-- note the PRIMARY KEY in place of UNIQUE
-- these are interchangeable, they mean the same thing

-- however, also note the IDENTITY(1,1)

-- this is an example of defining a *synthetic* primary key
-- we are relinquishing to the DB engine,
-- the task of providing  PK values for this column

-- specifically, the first PK value should be 1
-- the second PK value should be 1 + 1
-- the third PK value should be  2 + 1
-- the fourth PK value should be 3 + 1
-- in other words, begin with 1, and increment by 1 each time

-- this also means that NO ONE ELSE, besides the DB engine
-- can change the value of Categories.CategoryID

-- (demo) try to change a PK value
-- (demo) try to insert a PK value

-- also, IDENTITY values are never re-used
-- this guarantees uniqueness across the live table and all archives
-- for example, if we archive CategoryID = 7 (move to another place)
-- then end up reusing CategoryID = 7 for another record
-- those two records can be confused for each other

 -- (demo) create a new record, delete it, show next PK value, skipped

 -- why is this good?
 -- HUMANS might accidentally re-use archived PK values
 -- also, columns that *mean* something may need to be changed
 -- it does exactly what is needed, and no more
 -- internal to the db engine, this can significantly improve performance

-- in summary,
-- primary keys need to be UNIQUE and NOT NULL, and nothing more
-- this guarantees unique access to each record
-- *synthetic* keys can be helpful in keeping all key values UNIQUE

-- this is especially helpful in OLTP type systems

-- foreshadowing ...
-- primary keys also are referenced by FOREIGN KEYS
-- see the video on FOREIGN KEYS!

-- thanks for watching!

-- Database by Doug
-- Douglas Kline
-- 9/28/2018
-- Primary Keys

Foreign Keys in Relational Databases


-- Database by Doug
-- Douglas Kline
-- 10/1/2018
-- Foreign Keys

USE Northwind
-- Foreign Keys
-- a field in one table (table A)
-- that references a primary key value in another table (table B)

-- for example, the Categories and Products table from Northwind
-- (show diagram)

-- in this case, the foreign key is
-- Products.CategoryID, i.e., the CategoryID field in Products

-- from a practical usage standpoint
-- the foreign key is a "lookup"
-- consider this:

SELECT   ProductID,
         ProductName,
         CategoryID
FROM     Products

SELECT   CategoryID,
         CategoryName
FROM     Categories

-- note how I can use the Products.CategoryID
-- value to find the CategoryName of the right Category

-- note that Primary Keys are UNIQUE and NOT NULL
-- e.g., Products.ProductID, Categories.CategoryID
-- but Foreign Keys can be NOT UNIQUE and NULL
-- e.g., Products.CategoryID

-- in this situation, many products can be
-- in the same category
-- and a product could have an unknown category
-- (products can exist without a category)

-- in a sense, FKs define a "relationship"
-- between two records
-- in this case, each product record is related
-- to a category record
-- (or not, in the case of NULL FK)

-- to be able to "lookup" from one table to another
-- you just need two fields of similar data types

-- however, what if you want to make sure 
-- that the lookups values are always *good*
-- in other words, that values in
-- Products.CategoryID are actually over in
-- the Categories table?

-- this requirement, that FK values are valid
-- is called *referential integrity*

-- note that this is a constraint on
-- the Products.CategoryID field
-- it must be a valid value that
-- exists over in the Categories.CategoryID field

-- let's see it in the definition of Products

CREATE TABLE [dbo].[Products]
(
   [ProductID]    [int] IDENTITY(1,1)  NOT NULL,
 [ProductName]  [nvarchar](40)       NOT NULL,
 [CategoryID]   [int]                NULL
   -- other fields omitted for clarity
 CONSTRAINT [PK_Products] 
   PRIMARY KEY CLUSTERED 
   (
    [ProductID] ASC
   )
 CONSTRAINT [FK_Products_Categories]
   FOREIGN KEY
   (
      [CategoryID]
   )
   REFERENCES [dbo].[Categories] ([CategoryID])
) 

-- note the REFERENCES section
-- note the allowed NULL for CategoryID

-- this means that on any UPDATE or INSERT to Products
-- a Product record must always have a valid CategoryID
-- or a NULL value (unknown)

-- (demo this)

-- it also means that on any UPDATE or DELETE to Categories
-- all Product records must always have a valid CategoryID
-- or a NULL value (unknown)

-- (demo this)

-- If we have referential integrity for all Product records
-- then we can be confident that our JOINs between the two tables
-- provide valid results

SELECT   Products.ProductID,
         Products.ProductName,
         Products.CategoryID,
         Categories.CategoryID,
         Categories.CategoryName
FROM     Products 
   JOIN  Categories ON Products.CategoryID = Categories.CategoryID
ORDER BY Products.ProductID

-- JOINs are a way of doing multiple "lookups"
-- of FK values to PK values
-- but that's another video...

-- in summary,
-- foreign keys provide a "lookup" mechanism
-- for records in two different tables
-- this also defines how the two tables are "related"
-- *referential integrity* enforces "good" foreign keys
-- in other words, a FK should always have a valid value
-- FK values can repeat - be non-UNIQUE
-- FK values can be NULL - unknown

-- thanks for watching!

-- Database by Doug
-- Douglas Kline
-- 10/1/2018
-- Foreign Keys

Tuesday, August 21, 2018

The FROM clause in SQL





-- Database by Doug
-- Douglas Kline
-- 8/9/2018
-- FROM - setting the source of data

USE Northwind

-- might want to review the SELECT statement

/*************************************/

-- we can do SELECT statements like this to create a tabular result

SELECT      'Doug' AS [firstName],
            22     AS [age]

-- the data in this table is not stored anywhere
-- it was created in the SELECT statement
-- and placed in the tabular result
-- in the way we specified

-- but there's data in the database already
-- it's in a table

-- how do we get at it and display it?

-- we can do this

SELECT      ProductID,
            ProductName
FROM        Products      -- the important part!

-- the FROM statement sets the source of the data
-- Products is the name of a table
--   in the Northwind database 
--   (note the USE statement at the top of this code)

-- I also happen to know that there are fields
-- called ProductID and ProductName
-- in the Products table

-- notice what happens when I try to use
-- a column that doesn't exist in the table

SELECT      ProductID,
            ProductName,
            NonsenseField
FROM        Products  

-- intellisense underlines it in red
-- and when I run it, I get an "invalid column name" error

-- what do I do if I don't know what the columns are?

SELECT      *
FROM        Products

-- so what does the asterisk mean?

-- normally you put column names in the SELECT clause
-- when you put a * instead, it means "all columns"

-- **Important: SELECT * means ALL COLUMNS
-- ** it doesn't say anything about rows
-- ** which rows is defined by the WHERE clause
-- ** see the WHERE clause video

---------
-- so, let's type a new SELECT statement
/*
SELECT      ProductID,
            ProductName
FROM        Products  
*/
 -- writing SELECT first:
 -------------------------------
SELECT      ProductID,
            ProductName
FROM        Products
 ---------------------------------
 -- notice how the intellisense isn't very helpful

 -- now writing FROM first:
 -----------------------------------
 SELECT     ProductID,
            ProductName
 FROM       Products
 -----------------------------------
 -- notice how intellisense is helpful
 -- once I put the FROM statement in

-- remember how you can rename a column in the SELECT?
SELECT      ProductID   AS [ID],
            ProductName AS [Name]
FROM        Products 

-- you can also do this for the table names
SELECT      ProductID   AS [ID],
            ProductName AS [Name]
FROM        Products    AS [p]        -- important part

-- this is called an alias 
-- p is an alias for Products
-- you can use the AS statement
-- but the convention among professionals is normally to leave it out
-- like this:
SELECT      ProductID   AS [ID],
            ProductName AS [Name]
FROM        Products       [p]            -- Note no AS

-- note also that the square brackets are not necessary
-- you can do this:
SELECT      ProductID   AS ID,
            ProductName AS Name
FROM        Products       p

-- I like to use them to highlight the aliasing
-- and point out that [p] is a name that was made up
-- in the SELECT statement - it doesn't exist in the table

-- there are times that you MUST alias a table or column name
-- to achieve your objective 
-- (for example: same table in the FROM multiple times - more advanced)

-- it also has implications to "qualifying" a column name
-- here is the same query fully qualified

SELECT      Products.ProductID,
            Products.ProductName
FROM        Products  

-- the *not* fully qualified works
SELECT      ProductID,
            ProductName,
            Products.SupplierID
FROM        Products, Suppliers 

-- but relies on the db engine
-- to look in the Products table for the column names
-- if there are no ambiguities, everything is fine...
-- if the column names are not reserved words, everything is fine...

-- when you have multiple tables in the FROM (see JOIN video),
-- you'll NEED to fully qualify to resolve ambiguities
-- AND to write safe, high quality code

-- sometimes you'll see this:
SELECT      [p].ProductID,
            [p].ProductName
FROM        Products  [p]

-- which works fine
-- the products table is aliased to p
-- which forces me to use [p] to qualify the fields

-- this doesn't work:
SELECT      Products.ProductID,
            Products.ProductName
FROM        Products  [p]

-- once it is renamed, it doesn't exist anymore
-- in the context of this statement

-- also, note that many would write the above statement
-- without the square brackets:
SELECT      p.ProductID,
            p.ProductName
FROM        Products  p

-- this works fine
-- and it will sometimes save quite a bit of typing

-- however, it doesn't scale to many tables well
-- it becomes *very* difficult to understand

-- as your statements get bigger with more tables,
--  I think this is clearer
SELECT      Products.ProductID,
            Products.ProductName
FROM        Products

-- for me, code clarity trumps almost everything
-- I will type more so my code is very readable to others
-- (and myself, a month later!)

-- therefore,
-- I prefer to *avoid* aliasing tables
-- unless it is really necessary: 
-- for example, when there are multiple of the same table in a single

-- in summary,
-- FROM is to state the source table(s)
-- you can alias table names in the FROM
-- you can, and sometimes HAVE TO,
--   *fully qualify* your column names
--   by using this format: 
--   tableName.columnName 

-- thanks for watching!

-- Database by Doug
-- Douglas Kline
-- 8/9/2018
-- FROM - setting the source of data

Followers