It's incomplete, but covers basics, and gets a beginner started.
Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
~Martin Fowler
Programs must be written for people to read, and only incidentally for machines to execute.-- formatting SQL
~Harold Abelson
-- version 2, with fixed formatting
-- Any fool can write code that a computer can understand.
-- Good programmers write code that humans can understand.
-- ~Martin Fowler
-- Programs must be written for people to read, and only incidentally for machines to execute.
-- ~Harold Abelson
-- here's a slightly long rant/explanation
-- of my personal coding style
-- So here is something a company might want to know
-- how much of each product did we sell in 1997
-- plus which category and supplier
-- fairly simple and small relative to real-world queries
-- this is produced by this query, directly from the SSMS query designer
SELECT Products.ProductID, Products.ProductName, Categories.CategoryName, Suppliers.CompanyName, SUM(([Order Details].Quantity * [Order Details].UnitPrice)
* (1.0 - [Order Details].Discount)) AS totalRevenue
FROM [Order Details] INNER JOIN
Products ON [Order Details].ProductID = Products.ProductID INNER JOIN
Orders ON [Order Details].OrderID = Orders.OrderID LEFT OUTER JOIN
Categories ON Products.CategoryID = Categories.CategoryID LEFT OUTER JOIN
Suppliers ON Products.SupplierID = Suppliers.SupplierID
GROUP BY Products.ProductID, Products.ProductName, Categories.CategoryName, Suppliers.CompanyName, YEAR(Orders.OrderDate)
HAVING (YEAR(Orders.OrderDate) = 1997)
ORDER BY Products.ProductID
-- here are a few questions,
-- can you answer them with a quick scan of the code?
---- which tables are involved?
---- what are the foreign keys for the joins?
---- how many columns will come back?
---- which columns?
---- any outer joins?
---- how many joins?
-- all of the above questions should be *easily* readable from the code
-- but they aren't
-- so here's some formatting modifications to help
-----------------------
-- vertically align a single item from each list on its own line:
-- --lists are anything with commas. They are typically in the SELECT,
-- --GROUP BY, and ORDER BY
-- --this makes the elements in the list clear.
-- --I don't have to depend on spotting commas for list item separation
SELECT Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
SUM(([Order Details].Quantity * [Order Details].UnitPrice) * (1.0 - [Order Details].Discount)) AS totalRevenue
FROM [Order Details] INNER JOIN
Products ON [Order Details].ProductID = Products.ProductID INNER JOIN
Orders ON [Order Details].OrderID = Orders.OrderID LEFT OUTER JOIN
Categories ON Products.CategoryID = Categories.CategoryID LEFT OUTER JOIN
Suppliers ON Products.SupplierID = Suppliers.SupplierID
GROUP BY Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
YEAR(Orders.OrderDate)
HAVING (YEAR(Orders.OrderDate) = 1997)
ORDER BY Products.ProductID
-- vertically align the tables in the FROM clause, and lead each with its JOIN
---- for me, this brings the really critical information about the type of joins to the front
---- I won't miss an outer join by accident
SELECT Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
SUM(([Order Details].Quantity * [Order Details].UnitPrice) * (1.0 - [Order Details].Discount)) AS totalRevenue
FROM [Order Details]
INNER JOIN Products ON [Order Details].ProductID = Products.ProductID
INNER JOIN Orders ON [Order Details].OrderID = Orders.OrderID
LEFT OUTER JOIN Categories ON Products.CategoryID = Categories.CategoryID
LEFT OUTER JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
GROUP BY Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
YEAR(Orders.OrderDate)
HAVING (YEAR(Orders.OrderDate) = 1997)
ORDER BY Products.ProductID
-- remove verbose JOIN syntax
---- In general, extra words don't help readability
---- all important information about the type of join is retained
---- also vertically align components of each clause
---- with this, the overall structure/components
---- of the query is/are very apparent
SELECT Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
SUM(([Order Details].Quantity * [Order Details].UnitPrice) * (1.0 - [Order Details].Discount)) AS totalRevenue
FROM [Order Details]
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Orders ON [Order Details].OrderID = Orders.OrderID
LEFT JOIN Categories ON Products.CategoryID = Categories.CategoryID
LEFT JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
GROUP BY Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
YEAR(Orders.OrderDate)
HAVING (YEAR(Orders.OrderDate) = 1997)
ORDER BY Products.ProductID
-- align ON clause elements, and space to the right
---- the tables involved become more clear
---- the "details" of the join are less important at first glance,
---- but are clearly presented if needed
SELECT Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
SUM(([Order Details].Quantity * [Order Details].UnitPrice) * (1.0 - [Order Details].Discount)) AS totalRevenue
FROM [Order Details]
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Orders ON [Order Details].OrderID = Orders.OrderID
LEFT JOIN Categories ON Products.CategoryID = Categories.CategoryID
LEFT JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
GROUP BY Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
YEAR(Orders.OrderDate)
HAVING (YEAR(Orders.OrderDate) = 1997)
ORDER BY Products.ProductID
-- break long formulas by the operators
---- note also the separation of parentheses with a space
---- and alignment of the outer beginning and ending parentheses
---- this is judgement call material - every formula is different
---- you may choose to do this differently
---- but clearly, formatting of longer calculations helps readability
SELECT Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
SUM( ( [Order Details].Quantity
* [Order Details].UnitPrice)
* (1.0 - [Order Details].Discount)
) AS totalRevenue
FROM [Order Details]
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Orders ON [Order Details].OrderID = Orders.OrderID
LEFT JOIN Categories ON Products.CategoryID = Categories.CategoryID
LEFT JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
GROUP BY Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
YEAR(Orders.OrderDate)
HAVING (YEAR(Orders.OrderDate) = 1997)
ORDER BY Products.ProductID
-- move the condition from the HAVING to the WHERE clause
-- and remove verbose/unnecessary parentheses
---- condition in HAVING or WHERE is logically equivalent, but should absolutely be changed
---- this is subtly, but immensely stylistically better
---- because it conveys the fact that YEAR(orderdate) is
---- a property of each Order record, not of each product group
---- WHERE is for conditions on records
---- HAVING is for conditions on groups
---- It is also potentially performance-improving, because
---- records will be eliminated prior to grouping, not after
---- hopefully the query optimizer catches this,
---- but why not make sure instead of hope?
SELECT Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
SUM( ( [Order Details].Quantity
* [Order Details].UnitPrice)
* (1.0 - [Order Details].Discount)
) AS totalRevenue
FROM [Order Details]
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Orders ON [Order Details].OrderID = Orders.OrderID
LEFT JOIN Categories ON Products.CategoryID = Categories.CategoryID
LEFT JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
WHERE YEAR(Orders.OrderDate) = 1997
GROUP BY Products.ProductID,
Products.ProductName,
Categories.CategoryName,
Suppliers.CompanyName,
YEAR(Orders.OrderDate)
ORDER BY Products.ProductID
-- the above I would say is far superior to the original
-- I can, with a quick scan, answer all the important questions I had above
-- small/typo errors become apparent
-- people new to the data model and codebase can read it
-- I think the above is sufficient, but some would go further
-- these changes I think improve the readability, but may not be worth the time/effort
-- essentially, this is further "pretty-printing" of the code
-- I think it helps in some places, but hurts in others
-- to me, the code looks "choppy", and certain expression don't appear cohesive
-- so because I can't *consistently* do this, I choose not to
SELECT Products .ProductID,
Products .ProductName,
Categories .CategoryName,
Suppliers .CompanyName,
SUM( ( [Order Details].Quantity
* [Order Details].UnitPrice)
* (1.0 - [Order Details].Discount)
) AS totalRevenue
FROM [Order Details]
JOIN Products ON [Order Details].ProductID = Products .ProductID
JOIN Orders ON [Order Details].OrderID = Orders .OrderID
LEFT JOIN Categories ON Products .CategoryID = Categories .CategoryID
LEFT JOIN Suppliers ON Products .SupplierID = Suppliers .SupplierID
WHERE YEAR(Orders.OrderDate) = 1997
GROUP BY Products .ProductID,
Products .ProductName,
Categories .CategoryName,
Suppliers .CompanyName,
YEAR(Orders.OrderDate)
ORDER BY Products .ProductID
-- some people will also use table aliases,
-- effectively creating a short "variable name" for each table
-- notice in the example below that I've still been careful with alignment
-- all table aliases are aligned vertially, so I
-- have an easy reference for each "variable name"
SELECT p.ProductID,
p.ProductName,
c.CategoryName,
s.CompanyName,
SUM( ( od.Quantity
* od.UnitPrice)
* (1.0 - od.Discount)
) AS totalRevenue
FROM [Order Details] od
JOIN Products p ON od.ProductID = p.ProductID
JOIN Orders o ON od.OrderID = o.OrderID
LEFT JOIN Categories c ON p.CategoryID = c.CategoryID
LEFT JOIN Suppliers s ON p.SupplierID = s.SupplierID
WHERE YEAR(o.OrderDate) = 1997
GROUP BY p.ProductID,
p.ProductName,
c.CategoryName,
s.CompanyName,
YEAR(o.OrderDate)
ORDER BY p.ProductID
-- the main benefits in readability, to me,
-- is fewer characters
-- and some vertical alignments are easier,
-- since each "variable name" is a single letter
-- it also has benefits in coding, by making "auto-complete" quicker
-- however,
-- there is additional cognitive overhead in translating
-- the "variable name" to the table
-- this gets worse the more tables are involved
-- and if you have 6 tables that all begin with p, you have a problem
-- and if you are not intimately familiar with the database/codebase,
-- your cognitive overhead is very high
-- so, can other people read the code? (see quotes at top)
--
-- personally, I'm not a fan of this
-- even with many years (*gasp* decades) of coding
-- when I drop into a new environment/datamodel/codebase
-- it takes me too long to adapt to their "variable names"