A short video covering the JOINs within a SQL SELECT statement.
For beginners. How to show columns from more than one table in a single SELECT statement. Covers joining with the WHERE clause and the JOIN statement, and some best practices.Here is the SQL that goes with the demo.
-- Database by Doug -- Douglas Kline -- 10/1/2015 -- Introduction to JOINs USE Northwind -- suppose we want to look at our products -- and what categories they are in SELECT * FROM Products -- focus more on just a few columns SELECT ProductID, ProductName, CategoryID FROM Products ORDER BY ProductID -- so what category is CategoryID 1? 4? SELECT * FROM Categories -- focus in on the categoryName SELECT CategoryID, CategoryName FROM Categories -- great, categoryID=1 is 'Beverages' -- categoryID=4 is 'Dairy Products' -- we used the CategoryID field in the Products table -- to "look up" the name of the Category -- but we don't want to do that for lots of records SELECT ProductID, ProductName, CategoryID FROM Products ORDER BY ProductID -- why not just add the CategoryName field to the SELECT clause? SELECT ProductID, ProductName, CategoryID, CategoryName FROM Products ORDER BY ProductID -- CategoryName is not a field in the Products table -- OK, let's add the Categories table to the FROM statement SELECT ProductID, ProductName, CategoryID, CategoryName FROM Products, Categories ORDER BY ProductID -- that got rid of the syntax error on CategoryName -- but now CategoryID is ambiguous?? -- look at the columns list for each table -- there's a CategoryID in both the Products table -- and the Products table -- so, we need to be more specific - which one do we mean -- here's how SELECT ProductID, ProductName, Products.CategoryID, CategoryName FROM Products, Categories ORDER BY ProductID -- great, now we don't have any syntax errors -- but we have 702 rows? -- where is that coming from? SELECT COUNT(CategoryID) AS [CategoryCount] FROM Categories -- 9 categories SELECT COUNT(ProductID) AS [ProductCount] FROM Products -- 78 products SELECT 9 * 78 -- 702 - just like the number of records SELECT ProductID, ProductName, Products.CategoryID, CategoryName FROM Products, Categories ORDER BY ProductID -- let's look closer -- adding the categoryID from the Categories table SELECT ProductID, ProductName, Products.CategoryID, Categories.CategoryID, CategoryName FROM Products, Categories ORDER BY ProductID -- that's odd, why is chai being shown next to the wrong category? -- in fact, it's being shown next to *every* category -- all products are shown next to every category -- because we haven't stated how the products and categories should be related -- the database engine returns all possible combinations of products and categories -- this is called a CROSS JOIN -- this is sometime what we want, but not usually -- so , how do we limit the returned records below -- so that only the ones with matching categoryIDs are kept? -- using the WHERE clause -- SELECT ProductID, ProductName, Products.CategoryID, Categories.CategoryID, CategoryName FROM Products, Categories WHERE Products.CategoryID = Categories.CategoryID -- ******** ORDER BY ProductID -- this looks a lot better -- chai has a categoryID = 1, and it is shown next to the category with categoryID = 1 -- but the record count is 77? -- aren't there 78 products? -- looking at the products again... SELECT ProductID, ProductName, CategoryID FROM Products ORDER BY ProductID -- scrolling down, we can see that Dougs Mustard has a NULL categoryID -- and looking back our combined query SELECT ProductID, ProductName, Products.CategoryID, Categories.CategoryID, CategoryName FROM Products, Categories WHERE Products.CategoryID = Categories.CategoryID -- ******** ORDER BY ProductID -- scrolling down, note that Dougs Mustard doesn't appear -- because this Products.CategoryID = Categories.CategoryID -- is essentially this: NULL = Categories.CategoryID -- which is false (NULLs in another video) -- ok, looks like we have the logic right with this: SELECT ProductID, ProductName, Products.CategoryID, Categories.CategoryID, CategoryName FROM Products, Categories WHERE Products.CategoryID = Categories.CategoryID ORDER BY ProductID -- but humans don't really want to see the CAtegoryID -- so we can remove them SELECT ProductID, ProductName, CategoryName FROM Products, Categories WHERE Products.CategoryID = Categories.CategoryID ORDER BY ProductID -- the above is called a JOIN of the Products and Categories table -- the above is stated as a constraint in the WHERE clause -- but it is a special kind of a constraint -- between the Primary Key of one table, and the Foreign Key in another table -- so one way to look at a JOIN is as a constraint -- you make all possible combinations of the records from two tables -- then remove the ones that don't have equal PK--FK combinations -- another way to view a JOIN is as a specific combination of tables -- combine the tables in a specific way to create a specific set of records -- then show fields from the combined set of records -- this is done in the FROM clause SELECT ProductID, ProductName, CategoryName FROM Products JOIN Categories ON Products.CategoryID = Categories.CategoryID ORDER BY ProductID -- this is logically equivalent -- it will always return the same records as the WHERE clause version -- I like to format it like this: SELECT ProductID, ProductName, CategoryName FROM Products JOIN Categories ON Products.CategoryID = Categories.CategoryID ORDER BY ProductID -- I suggest you learn both ways -- so you can read either way when you encounter it -- also, sometimes one way is clearer than another in a certain context -- the JOIN syntax signals your *intention* better than the WHERE clause -- and generally organizes your statement better -- conceptually, it is an binary operator between two tables, -- with an ON clause to specify details that the operator will use -- it is *not* a list of tables (separated by commas), -- but an expression that states how to combine the tables -- one last thing -- safe SQL coding practice -- right now the statement below works fine -- it is syntactically correct SELECT ProductID, ProductName, CategoryName FROM Products JOIN Categories ON Products.CategoryID = Categories.CategoryID ORDER BY ProductID -- but what would happen if in the future -- someone put a CategoryName field in the Products table? -- then there would be a CategoryName field in both Products and Categories -- they *should* be able to do that without breaking code -- even if it doesn't make sense -- then categoryName would become ambiguous and the code would not execute -- so how do we prevent that possible future bug? -- by *qualifying* every field like this SELECT Products.ProductID, Products.ProductName, Categories.CategoryName FROM Products JOIN Categories ON Products.CategoryID = Categories.CategoryID ORDER BY Products.ProductID -- safe coding rule: -- whenever there is more than one table -- fully qualify every field in every clause: SELECT, FROM, WHERE, ORDER BY, etc. -- intro to JOIN...
No comments:
Post a Comment