Database by Doug

Thursday, October 4, 2018

String Expressions in SQL

-- Database by Doug
-- Douglas Kline
-- 8/28/2018
-- Character Expressions - expressions that return character data

-- beginner level
-- might want to check out numeric expressions
------------------------------------------------------------------
-- consider this statement

SELECT   'Fred',
         'Flintstone'

-- note that the single quotes are not shown in
-- the table results
-- this is because that are not *part of* the data
-- they are used to mark the beginning and end of the data
-- and are called *delimiters*

-- so the statement has two pieces of data
-- delimited by single quotes

SELECT   'Fred',
         'Flintstone'

-- the *type* of data is *character* data
-- more specifically, a *sequence of characters*

-- in programming, they would be called *strings*

-- another term might be *string literals*

-- now look at this:
SELECT      123   AS [numeric data type],
            '123' AS [character data type]

-- the first column is a number
-- the second column is a sequence of characters
--    first the character '1', then the character '2', ...

-- now consider this

SELECT       1   + 5  AS [numeric],
            '1' + '5' AS [character]
             
-- see how the plus sign means something different
-- depending on what's on either side?

-- with character data, the plus sign means *concatenate*
-- or 'put together'

-- so the basic operator for character expressions
-- is the plus sign

-- we can prove this:
SELECT      SQL_VARIANT_PROPERTY(1 + 5, 'BaseType'),
            SQL_VARIANT_PROPERTY('1' + '5', 'BaseType')

-- here'a another example:
SELECT   'Fred' + ' ' + 'Flintstone' AS [Full Name]

-------------------------------------------------------------
-- there's lots more we want to do with
-- character data
-- but we need functions....
-- here are a few simple ones:

SELECT LEN  (' Fred'),     -- find how many chars - note the spaces
       UPPER(' Fred'),     -- convert all chars to upper case
       LOWER(' Fred'),     -- convert all chars to lower case
       LEFT (' Fred', 2),  -- return just the left-most 2 chars
       RIGHT(' Fred', 2),  -- return just the right-most 2 chars
       LTRIM(' Fred')      -- get rid of any space chars on left

-- you might also want to find the location of a certain character

SELECT   CHARINDEX(' ', 'Fred Flintstone') AS [location of space character] 

-- and also, you can combine these functions into a more complex expression

SELECT   'Fred Flintstone'       AS [original full name],
         RIGHT('Fred Flintstone',                    -- right-most 10 characters
                 LEN('Fred Flintstone')              -- the number 15
                 - CHARINDEX(' ', 'Fred Flintstone') -- the number 5
              )                  AS [last name only]

---------------------------------------------------------------------------------
-- what about data from a table - not literals

SELECT   ProductName,
         LEN(ProductName) AS [length of name],
         UPPER(ProductName) AS [uppercase],
         CHARINDEX(' ', ProductName),
         RIGHT(ProductName, LEN(ProductName) - CHARINDEX(' ', ProductName))
FROM     Products

---------------------------------------------------------------------------------
-- in summary
-- string literals are delimited by single quotes
-- plus sign means concatenate
-- functions help with other tasks

-- Database by Doug
-- Douglas Kline
-- 8/28/2018
-- Character Expressions

Primary Keys in Relational Databases

-- Database by Doug
-- Douglas Kline
-- 9/28/2018
-- Primary Keys

USE Northwind
-- note that I'm starting out with a 
-- modified version of the Northwind.Categories table
-- called DBBDCategories
-- for demo purposes

-- Primary Keys
-- a primary key has two purposes: 
---- to guarantee access to every record in a table
---- to uniquely distinguish a record from all other records in the table

-- this is done by requiring a primary key to be:
---- NOT NULL
---- UNIQUE across all records in that table

-- the NOT NULL makes sure that *at least one* column is NOT NULL
-- and that there is a WHERE clause that will match on every record
-- for example:
--  WHERE categoryID = 7

-- guarantees one of:
-- categories with an categoryID of 7
-- there is no record with an *unknown* ID that *might be* 7

-- the UNIQUE makes sure that you can make a request of records
-- that guarantees at most one record
-- for example:
--  WHERE categoryID = 7

-- guarantees one of:
-- a *single* record with the category with an categoryID of 7
-- no category - there is definitively no record with an categoryID of 7

--***************************************************
-- tables don't absolutely require a primary key
-- however, guaranteed, unique access is extremely helpful in many situations
-- (particularly Online Transaction Processing Systems /OLTPs)
-- AND, there can be significant performance benefits to having a PK 
-- (for certain operations)

-- Consider my somewhat modified DBBDCategories table from Northwind
SELECT *
FROM    DBBDCategories

-- I happen to know that CategoryID is the primary key of this table

-- but how would you know otherwise?

-- you could look in the object inspector on the left
-- expand the database, the tables, the Categories table, the Columns
-- note the key icon on the left of the CategoryID, and the 'PK' in parentheses
-- (demo)

-- another way would be to put the table into a diagram
-- note the yellow key icon next to the CategoryID
-- (demo)

-- Finally, a really definitive way would be to script out the table
-- and see the full definition of the table in SQL
-- (demo)

-- here is a slightly abbreviated/edited version of 
-- the SQL CREATE statment for the DBBDCategories table:

--CREATE TABLE [dbo].[DBBDCategories]
--(
-- [CategoryID] [int] NOT NULL,
-- [CategoryName] [nvarchar](15) NOT NULL,
-- [Description] [ntext] NULL,
-- [Picture] [image] NULL,
-- CONSTRAINT [PK_DBBDCategories] PRIMARY KEY CLUSTERED 
--   (
--    [CategoryID] ASC
--   )
--) 

-- note the NOT NULL for CategoryID 
-- this means that every insert and update will be checked to make sure 
-- that the CategoryID does not become NULL
-- let's try it
-- demo insert new NULL PK
-- demo update PK to a NULL

-- note the PRIMARY KEY specified as a CONSTRAINT
-- this means that every insert and update will be checked to make sure
-- that no two CategoryIDs, in the WHOLE TABLE, get duplicated
-- let's try it
-- demo insert duplicate PK
-- demo update PK to a duplicate value


-- let's review
-- we are guaranteed that no record in the categories table has an unknown CategoryID
-- we are guaranteed that no two records in the categories table have the same CategoryID

-- in other words
-- WHERE CategoryID = 8
-- guarantees to return either no record, or one record
-- and we are guaranteed that there is no record
-- with an *unknown* CategoryID which is actually 8

-- so what does a categoryID of 8 actually mean?
--....?
--....?





-- NOTHING!!

-- primary keys don't *need* to mean anything
-- they just have to be UNIQUE and NOT NULL

-- now let's look at the definition of 
-- the *actual* Categories table from the Northwind DB

--CREATE TABLE [dbo].[Categories]
--(
-- [CategoryID] [int] IDENTITY(1,1) NOT NULL,  -- ** note IDENTITY(1,1)
-- [CategoryName] [nvarchar](15) NOT NULL,
-- [Description] [ntext] NULL,
-- [Picture] [image] NULL,
-- CONSTRAINT [PK_Categories] PRIMARY KEY CLUSTERED -- ** note PRIMARY KEY
-- (
--  [CategoryID] ASC
-- )
--) ON [PRIMARY] 

-- note the PRIMARY KEY in place of UNIQUE
-- these are interchangeable, they mean the same thing

-- however, also note the IDENTITY(1,1)

-- this is an example of defining a *synthetic* primary key
-- we are relinquishing to the DB engine,
-- the task of providing  PK values for this column

-- specifically, the first PK value should be 1
-- the second PK value should be 1 + 1
-- the third PK value should be  2 + 1
-- the fourth PK value should be 3 + 1
-- in other words, begin with 1, and increment by 1 each time

-- this also means that NO ONE ELSE, besides the DB engine
-- can change the value of Categories.CategoryID

-- (demo) try to change a PK value
-- (demo) try to insert a PK value

-- also, IDENTITY values are never re-used
-- this guarantees uniqueness across the live table and all archives
-- for example, if we archive CategoryID = 7 (move to another place)
-- then end up reusing CategoryID = 7 for another record
-- those two records can be confused for each other

 -- (demo) create a new record, delete it, show next PK value, skipped

 -- why is this good?
 -- HUMANS might accidentally re-use archived PK values
 -- also, columns that *mean* something may need to be changed
 -- it does exactly what is needed, and no more
 -- internal to the db engine, this can significantly improve performance

-- in summary,
-- primary keys need to be UNIQUE and NOT NULL, and nothing more
-- this guarantees unique access to each record
-- *synthetic* keys can be helpful in keeping all key values UNIQUE

-- this is especially helpful in OLTP type systems

-- foreshadowing ...
-- primary keys also are referenced by FOREIGN KEYS
-- see the video on FOREIGN KEYS!

-- thanks for watching!

-- Database by Doug
-- Douglas Kline
-- 9/28/2018
-- Primary Keys

Foreign Keys in Relational Databases

-- Database by Doug
-- Douglas Kline
-- 10/1/2018
-- Foreign Keys

USE Northwind
-- Foreign Keys
-- a field in one table (table A)
-- that references a primary key value in another table (table B)

-- for example, the Categories and Products table from Northwind
-- (show diagram)

-- in this case, the foreign key is
-- Products.CategoryID, i.e., the CategoryID field in Products

-- from a practical usage standpoint
-- the foreign key is a "lookup"
-- consider this:

SELECT   ProductID,
         ProductName,
         CategoryID
FROM     Products

SELECT   CategoryID,
         CategoryName
FROM     Categories

-- note how I can use the Products.CategoryID
-- value to find the CategoryName of the right Category

-- note that Primary Keys are UNIQUE and NOT NULL
-- e.g., Products.ProductID, Categories.CategoryID
-- but Foreign Keys can be NOT UNIQUE and NULL
-- e.g., Products.CategoryID

-- in this situation, many products can be
-- in the same category
-- and a product could have an unknown category
-- (products can exist without a category)

-- in a sense, FKs define a "relationship"
-- between two records
-- in this case, each product record is related
-- to a category record
-- (or not, in the case of NULL FK)

-- to be able to "lookup" from one table to another
-- you just need two fields of similar data types

-- however, what if you want to make sure 
-- that the lookups values are always *good*
-- in other words, that values in
-- Products.CategoryID are actually over in
-- the Categories table?

-- this requirement, that FK values are valid
-- is called *referential integrity*

-- note that this is a constraint on
-- the Products.CategoryID field
-- it must be a valid value that
-- exists over in the Categories.CategoryID field

-- let's see it in the definition of Products

CREATE TABLE [dbo].[Products]
(
   [ProductID]    [int] IDENTITY(1,1)  NOT NULL,
 [ProductName]  [nvarchar](40)       NOT NULL,
 [CategoryID]   [int]                NULL
   -- other fields omitted for clarity
 CONSTRAINT [PK_Products] 
   PRIMARY KEY CLUSTERED 
   (
    [ProductID] ASC
   )
 CONSTRAINT [FK_Products_Categories]
   FOREIGN KEY
   (
      [CategoryID]
   )
   REFERENCES [dbo].[Categories] ([CategoryID])
) 

-- note the REFERENCES section
-- note the allowed NULL for CategoryID

-- this means that on any UPDATE or INSERT to Products
-- a Product record must always have a valid CategoryID
-- or a NULL value (unknown)

-- (demo this)

-- it also means that on any UPDATE or DELETE to Categories
-- all Product records must always have a valid CategoryID
-- or a NULL value (unknown)

-- (demo this)

-- If we have referential integrity for all Product records
-- then we can be confident that our JOINs between the two tables
-- provide valid results

SELECT   Products.ProductID,
         Products.ProductName,
         Products.CategoryID,
         Categories.CategoryID,
         Categories.CategoryName
FROM     Products 
   JOIN  Categories ON Products.CategoryID = Categories.CategoryID
ORDER BY Products.ProductID

-- JOINs are a way of doing multiple "lookups"
-- of FK values to PK values
-- but that's another video...

-- in summary,
-- foreign keys provide a "lookup" mechanism
-- for records in two different tables
-- this also defines how the two tables are "related"
-- *referential integrity* enforces "good" foreign keys
-- in other words, a FK should always have a valid value
-- FK values can repeat - be non-UNIQUE
-- FK values can be NULL - unknown

-- thanks for watching!

-- Database by Doug
-- Douglas Kline
-- 10/1/2018
-- Foreign Keys

Tuesday, August 21, 2018

The FROM clause in SQL

-- Database by Doug
-- Douglas Kline
-- 8/9/2018
-- FROM - setting the source of data

USE Northwind

-- might want to review the SELECT statement

/*************************************/

-- we can do SELECT statements like this to create a tabular result

SELECT      'Doug' AS [firstName],
            22     AS [age]

-- the data in this table is not stored anywhere
-- it was created in the SELECT statement
-- and placed in the tabular result
-- in the way we specified

-- but there's data in the database already
-- it's in a table

-- how do we get at it and display it?

-- we can do this

SELECT      ProductID,
            ProductName
FROM        Products      -- the important part!

-- the FROM statement sets the source of the data
-- Products is the name of a table
--   in the Northwind database 
--   (note the USE statement at the top of this code)

-- I also happen to know that there are fields
-- called ProductID and ProductName
-- in the Products table

-- notice what happens when I try to use
-- a column that doesn't exist in the table

SELECT      ProductID,
            ProductName,
            NonsenseField
FROM        Products  

-- intellisense underlines it in red
-- and when I run it, I get an "invalid column name" error

-- what do I do if I don't know what the columns are?

SELECT      *
FROM        Products

-- so what does the asterisk mean?

-- normally you put column names in the SELECT clause
-- when you put a * instead, it means "all columns"

-- **Important: SELECT * means ALL COLUMNS
-- ** it doesn't say anything about rows
-- ** which rows is defined by the WHERE clause
-- ** see the WHERE clause video

---------
-- so, let's type a new SELECT statement
/*
SELECT      ProductID,
            ProductName
FROM        Products  
*/
 -- writing SELECT first:
 -------------------------------
SELECT      ProductID,
            ProductName
FROM        Products
 ---------------------------------
 -- notice how the intellisense isn't very helpful

 -- now writing FROM first:
 -----------------------------------
 SELECT     ProductID,
            ProductName
 FROM       Products
 -----------------------------------
 -- notice how intellisense is helpful
 -- once I put the FROM statement in

-- remember how you can rename a column in the SELECT?
SELECT      ProductID   AS [ID],
            ProductName AS [Name]
FROM        Products 

-- you can also do this for the table names
SELECT      ProductID   AS [ID],
            ProductName AS [Name]
FROM        Products    AS [p]        -- important part

-- this is called an alias 
-- p is an alias for Products
-- you can use the AS statement
-- but the convention among professionals is normally to leave it out
-- like this:
SELECT      ProductID   AS [ID],
            ProductName AS [Name]
FROM        Products       [p]            -- Note no AS

-- note also that the square brackets are not necessary
-- you can do this:
SELECT      ProductID   AS ID,
            ProductName AS Name
FROM        Products       p

-- I like to use them to highlight the aliasing
-- and point out that [p] is a name that was made up
-- in the SELECT statement - it doesn't exist in the table

-- there are times that you MUST alias a table or column name
-- to achieve your objective 
-- (for example: same table in the FROM multiple times - more advanced)

-- it also has implications to "qualifying" a column name
-- here is the same query fully qualified

SELECT      Products.ProductID,
            Products.ProductName
FROM        Products  

-- the *not* fully qualified works
SELECT      ProductID,
            ProductName,
            Products.SupplierID
FROM        Products, Suppliers 

-- but relies on the db engine
-- to look in the Products table for the column names
-- if there are no ambiguities, everything is fine...
-- if the column names are not reserved words, everything is fine...

-- when you have multiple tables in the FROM (see JOIN video),
-- you'll NEED to fully qualify to resolve ambiguities
-- AND to write safe, high quality code

-- sometimes you'll see this:
SELECT      [p].ProductID,
            [p].ProductName
FROM        Products  [p]

-- which works fine
-- the products table is aliased to p
-- which forces me to use [p] to qualify the fields

-- this doesn't work:
SELECT      Products.ProductID,
            Products.ProductName
FROM        Products  [p]

-- once it is renamed, it doesn't exist anymore
-- in the context of this statement

-- also, note that many would write the above statement
-- without the square brackets:
SELECT      p.ProductID,
            p.ProductName
FROM        Products  p

-- this works fine
-- and it will sometimes save quite a bit of typing

-- however, it doesn't scale to many tables well
-- it becomes *very* difficult to understand

-- as your statements get bigger with more tables,
--  I think this is clearer
SELECT      Products.ProductID,
            Products.ProductName
FROM        Products

-- for me, code clarity trumps almost everything
-- I will type more so my code is very readable to others
-- (and myself, a month later!)

-- therefore,
-- I prefer to *avoid* aliasing tables
-- unless it is really necessary: 
-- for example, when there are multiple of the same table in a single

-- in summary,
-- FROM is to state the source table(s)
-- you can alias table names in the FROM
-- you can, and sometimes HAVE TO,
--   *fully qualify* your column names
--   by using this format: 
--   tableName.columnName 

-- thanks for watching!

-- Database by Doug
-- Douglas Kline
-- 8/9/2018
-- FROM - setting the source of data

Database by Doug

Thursday, October 4, 2018

String Expressions in SQL

Primary Keys in Relational Databases

Foreign Keys in Relational Databases

Tuesday, August 21, 2018

The FROM clause in SQL

Blog Archive

Followers

About Me