Tuesday, May 28, 2024

Find NULL in any column




-- Database by Doug
-- Douglas Kline
-- May 23 2024
-- Find records that have a NULL in any field of the record

-- in other words, return all records that contain a NULL in any field

-- review:

-- NULL is an odd beast - it means "an unknown value"
-- unknown values are not equal to anything
-- unknown values are not "not equal" to anything
-- think of NULL as a system defined value that cannot be compared to anything

SELECT 1
WHERE '' = NULL
OR '' > NULL
OR '' < NULL
OR 0 = NULL
OR 0 < NULL
OR 0 > NULL
OR NULL = NULL
OR NULL <> NULL


-- let's try the special unary comparison operator, IS NULL

SELECT 1
WHERE (NULL) IS NULL

-- let's use the class Northwind Database
USE Northwind;

-- and see what's in the products table
SELECT *
FROM Products

-- notice that I've put a couple of records in so we have some
-- records with NULL values

SELECT *
FROM Products
WHERE ProductID > 77




SELECT *
FROM Products
WHERE CategoryID IS NULL
OR QuantityPerUnit IS NULL

-- results show Doug's Spicy Mustard, Doug's Bold Ketchup


-- but what if I wanted to test for a NULL in ANY column?

-- can I do this?

SELECT *
FROM Products
WHERE * IS NULL

-- kinda makes sense, but * can only be used in SELECT

-- sounds like maybe you could use the ANY operator modifier,
-- but unfortunately, ANY requires a *comparison*, e.g., =, <, >, <>, etc.
-- and NULL comparisons always return FALSE

-- and ANY operates across rows, on a single column
-- not across columns on a single row... :-(

-- here is how it would look for the Products table

SELECT *
FROM Products
WHERE ProductID IS NULL
OR ProductName IS NULL
OR SupplierID IS NULL
OR CategoryID IS NULL
OR QuantityPerUnit IS NULL
OR UnitPrice IS NULL
OR UnitsInStock IS NULL
OR UnitsOnOrder IS NULL
OR ReorderLevel IS NULL
OR Discontinued IS NULL

-- but this would be a bit tedious with hundreds of columns

-- but if my requirement is a straightforward static SQL statement,
-- I just need to find a good / fast way to build it

-- I can find a list of all columns from sys.columns

SELECT name, is_nullable
FROM sys.columns
WHERE object_id = OBJECT_ID('Products')
AND is_nullable = 1

-- this would be helpful in writing my WHERE clause with lots of ORs
-- adding some string concatenation

SELECT name
+ ' IS NULL'
FROM sys.columns
WHERE object_id = OBJECT_ID('Products')
AND is_nullable = 1

-- now let's put it all in one string using string_agg

SELECT STRING_AGG(
name
+ ' IS NULL',
' OR '
)
FROM sys.columns
WHERE object_id = OBJECT_ID('Products')
AND is_nullable = 1

-- and to be complete, let's add the rest of the SELECT

SELECT 'SELECT * FROM Products WHERE '
+ STRING_AGG(
name
+ ' IS NULL',
' OR '
)
FROM sys.columns
WHERE object_id = OBJECT_ID('Products')
AND is_nullable = 1

-- now copy and paste, and reformat
SELECT * FROM Products WHERE SupplierID IS NULL OR CategoryID IS NULL OR QuantityPerUnit IS NULL OR UnitPrice IS NULL OR UnitsInStock IS NULL OR UnitsOnOrder IS NULL OR ReorderLevel IS NULL
SELECT * FROM Products WHERE SupplierID IS NULL OR CategoryID IS NULL OR QuantityPerUnit IS NULL OR UnitPrice IS NULL OR UnitsInStock IS NULL OR UnitsOnOrder IS NULL OR ReorderLevel IS NULL

SELECT *
FROM Products
WHERE SupplierID IS NULL
OR CategoryID IS NULL
OR QuantityPerUnit IS NULL
OR UnitPrice IS NULL
OR UnitsInStock IS NULL
OR UnitsOnOrder IS NULL
OR ReorderLevel IS NULL

-- Summary
-- you need to find records that have a NULL in any column
-- you have lots of columns
-- you have to write a static SQL statement
-- use some SQL to write your statement
-- drawing the columns from sys.columns


-- Database by Doug
-- Douglas Kline
-- May 23 2024
-- Find records that have a NULL in any field of the record

Thursday, May 16, 2024

Azure SQL Edge Instance on a Macbook Pro M1 with Docker

This is an update to this post from 2020.



I work on a MacBook Pro M1 with a 16" display that has been a great workhorse for me for years.  Using a containerized image of Microsoft SQL Server makes it possible to run an entire SQL Server on my aging MacBook.

As in 2020, it's very handy to have a SQL instance on your local machine. I do this kind of stuff with a local sql database instance:

  • data modeling
  • demonstrations
  • development
  • quick proof of concept
  • sandbox development
  • disconnected work
The main difference in 2024 versus 2020, is that I'll be using Azure SQL Edge, rather than a full version of SQL Server. For most of my uses, this is sufficient. Besides, it requires less memory and disk space. 

The particular image we'll use is published by Microsoft, and can be found here on DockerHub. This capability to run SQL Server on Macbooks comes from Microsoft's SQL Server on Linux efforts. Thank you.

Be aware that you can get a free Azure SQL database in Microsoft's Azure cloud, which might server your needs better. For me, this has a slightly different set of capabilities, and would be inconvenient with a spotty wifi connection, or while completely disconnected. (me at a coffee shop)

I did this several years ago, and used my local SQL instance extensively. I was able to get a full version of SQL Server 2019 running in about 5 minutes. I followed Bob Ward's  Take the SQL Server Mac Challenge while I was listening to him speak at SQL Intersection in Las Vegas.

If you follow these instructions, it should be very easy. It wasn't as easy for me - I found out all the ways that don't work :-) I couldn't get a full SQL Server running, and I'm fairly certain it was because of Docker not fully supporting the M1 Apple Chip, even with latest version of Docker and Rosetta 2 emulation. 

Hopefully my experience will save you some time. At the end, I'll talk about the choice of Azure SQL Edge, and its limitations.

Instructions

Here's the overview:
  1. Get Docker
  2. Open Terminal
  3. Execute a long-ish docker command
  4. Verify via Docker Desktop
  5. Connect to the instance from a client in the host machine

1. Docker Installation

Head over to Docker and download the Docker Desktop for Mac. It offers both Intel and Mac chip versions of the product. I always have to think about this and look - my M1 has an Apple chip. I confirm this by clicking the Apple image in the upper left corner of my desktop and choosing About This Mac.

<brief gif demo here>

Get the stable version for whichever platform you are on. I used version 4.30.0.

Follow the straightforward instructions.

2. Open a terminal window

Open up a Mac terminal window. 

<brief gif demo here>

3. Run Docker Command

All the pieces of this command are necessary. This command fetches the container image and uses it to create a container, all in one command.

docker run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=ABCabc123" -p 1433:1433 --name sqledge --hostname sqledge -d mcr.microsoft.com/azure-sql-edge

To show what's going on, I'll break it down. Here's the basic command:

docker run mcr.microsoft.com/azure-sql-edge

This basic command will execute without errors, but will not do what we want without the other specified options. This basic command says "create a container from the image at this location". If you haven't already downloaded it, it will go fetch the image.

Here's the full command again, broken into multiple lines. Note the forward slash continuation character at the end of each line:

docker run /
   -e "ACCEPT_EULA=Y" /
   -e "MSSQL_SA_PASSWORD=ABCabc123" / 
   -p 1433:1433 /
   --name sqledge / 
   --hostname sqledge / 
   -d /
   mcr.microsoft.com/azure-sql-edge

You can see that the first and last line above contain the basic command. All the other lines are options. The "-e" options are options that are passed to the "environment", i.e., they are passed into the container. The other options are for Docker at the host level to say how we want the container set up.

-e "ACCEPT_EULA=Y"

This option indicates your acceptance of the End User License Agreement for Azure SQL Edge. You gotta accept to use it :-) Note that Docker ignores this option - it just passes it into the container.

-e "MSSQL_SA_PASSWORD=ABCabc123"

This option sets the password for the database's system administrator account. This will be how we log in to the database in the container. Choose whatever password you would like.

If you're not familiar with SQL Server, traditionally, a user with username "sa" was set up to be the "root" or "highly privileged" user of the database. In today's world with actual production servers, this is a sufficiently secure way to operate. But for a local non-production network-disconnected instance, this is sufficient. And we won't have to set up the tighter security mechanisms. Again, Docker ignores this option - it just passes it into the container.

-p 1433:1433

This option maps port 1433 inside the container to port 1433 in the host. 

Remember that there are two machines involved. The host (my Macbook running Docker), and the virtual machine (the container with Azure SQL Edge running in it.) Inside the container, Azure SQL Edge will be listening on port 1433 (that's the default port). Docker will connect the host's port 1433 to the client's port 1433. So if I connect a database client like Azure Data Studio to my Macbook's port 1433, I will be able to communicate between Azure Data Studio (in the host) and Azure SQL Edge (inside the container).

In a more complex setup, it is possible to set up multiple containers with SQL instances (or other servers) and map them at the host level with Docker to create a system of servers.

-name sqledge

This gives the container a Docker name. You can now refer to this container by name with Docker commands.

--hostname sqledge

This sets the hostname inside the container. Note that this does not change anything at the host level (in this case, my Macbook). 

You might ask, well isn't this an environment thing? Why doesn't this start with "-e"?

Well, sort of. All virtual machines will need a hostname, so Docker provides a standard way to set it. However, not all virtual machines will need an sa password. So items that are specific to a particular container image are passed as -e options.

-d

This tells Docker to run this as a "disconnected" virtual machine, which does not have a user interface or window. We won't see a window for this container on the host. It will be running in the background.

4. Verify via Docker Desktop

If you've run the command in the previous section, you should have a running container aka virtual machine based on the mcr.microsoft.com/azure-sql-edge image.

You should be able to see this listed in Docker Desktop.

And you should also be able to see it by running this command in the terminal.

docker ps

Which gives as output:

CONTAINER ID   IMAGE                              COMMAND                  CREATED STATUS       PORTS                              NAMES
79e76f16d903   mcr.microsoft.com/azure-sql-edge   "/opt/mssql/bin/perm…"   4 hours ago   Up 4 hours   1401/tcp, 0.0.0.0:1433->1433/tcp   sqledge

Which gives this info, reorganized:

CONTAINER ID 79e76f16d903
IMAGE        mcr.microsoft.com/azure-sql-edge
COMMAND      "/opt/mssql/bin/perm…" 
CREATED      4 hours ago
STATUS       Up 4 hours    
PORTS        1401/tcp, 0.0.0.0:1433->1433/tcp                         
NAMES        sqledge

5. Connect using a Database Client

I'll use Microsoft's Azure Data Studio (ADS), which runs on Windows, MacOS, and Linux. Here's the "new connection" dialog in ADS:
A couple of things to point out in this connection dialog:
  • the Server has a comma between the IP and the port, i.e., localhost, 1433
  • the Authentication type is SQL login, which would not be very secure for a production server
  • the User name is sa
  • the Password will need to be the password from the docker command. in our case "ABCabc123"
  • the Server Group is a convenience for ADS, which allows you to put connections in groups
  • the other entries have been left at the defaults
And here is a query that has been executed with ADS:

Just to be clear about what is going on:
  • Azure Data Studio (ADS) is running on the host, my Macbook
  • ADS is connected to port 1433 on the localhost IP address, localhost is my Macbook (localhost IP is typically 127.0.0.1, which works also)
  • Two-way communication is mapped from localhost:1433 to/from sqledge:1433
  • The SQL Server in the container is listening to/sending on port 1433

6. Shut it down

You can, of course, leave this container running in the background and use it whenever you need it. But it does continue to use memory and a bit of cpu, even when idle. 

You can stop and start the container using the Docker Desktop application. Or, you can issue these commands in a terminal window:

docker stop sqledge
docker stop sqledge

To completely remove the container from the host machine (it gets deleted from the hard drive) you can issue this command:

docker rm sqledge

Azure SQL Edge local vs SQL Server local vs Azure SQL Server 

I thought quite a bit about what I needed to do, and which setup(s) would do everything I need. Here's my thoughts on capabilities and effort with each of these setups.

My main concern with the free Azure SQL Server (managed) database is the security overhead and need for constant connectivity. It's just easier to use SQL Authentication than the authentication required by Azure/AWS. 

Here's what I like about a local Azure SQL Edge instance:
  • simple authentication
  • can create, start, stop, remove, clone containers easily
  • can do these things: 
    • all data definition language commands (create, alter, drop)
    • all data manipulation language commands (SELECT, INSERT, UPDATE, etc.)
    • all basic T-SQL commands
    • data stream
Here's what I can't do (from this source):
  • special data types such as HierarchyID, Spatial, Full Text, FileStreams, In Memory OLTP, JSON (limitations), etc.
  • special features such as Replication Polybase, Snapshots, Linked Servers, CLR, High Availability, etc.
  • auxiliary server product capabilities: Analysis Services, Reporting Services, etc.

Summary

I think this will serve my needs for doing basic demos, T-SQL programming, data modeling and proof-of-concept experimentation. I hope this is helpful to you!

Followers