Database by Doug: 2014

Wednesday, October 1, 2014

Self-Referential Join Example on Northwind Database

In the Microsoft Northwind database, there is an example of a self-referential table, Employee (see data model).

Let's look at some of the data with a simple query:

SELECT EmployeeID,
LastName,
ReportsTo
FROM Employees
ORDER BY EmployeeID

Which gives the data shown.

So you can see that EmployeeID=6 with the LastName=Suyama reports to EmployeeID=5 with the LastName=Buchanan.

So how do we get this all in one table?

The primary constraint here is that you cannot use the same table name twice in the same FROM statement. So, we need to use a table alias.

SELECT Employees.EmployeeID,
Employees.LastName,
Employees.ReportsTo,
Boss.EmployeeID,
Boss.LastName
FROM Employees
JOIN Employees Boss ON Employees.reportsTo = Boss.EmployeeID

ORDER BY Employees.EmployeeID

Which gives this:

The FROM statement uses the Employees table twice, but one is aliased to be Boss. Based on that alias, the correct join is to use Employees.reportsTo as the foreign key, and Boss.EmployeeID as the primary key; because Employees report to Bosses.

Here's how that join looks in the Query Designer.

Regardless of how the database engine actually accomplishes the results, it's good to view this as two complete copies of the (same) table. They both have exactly the same fields and all the same records.

Wednesday, July 30, 2014

Reorganizing a Large Index from the SQL Management Studio GUI

It takes a ...long... time.

Just tried reorganizing the clustered index on a table with about 5.5 GB of data in it. I'm at about 90 minutes and counting...

Rebuilding the clustered index and two non-clustered indexes (1.3GB and 2.5GB) through the command line took about 10 minutes.

Mine is sort of a diabolical situation, but still.

I have a table with lots of rows, and the average record is about 110 bytes. So there are about 73 (8060/110, where 8060 is the non-header portion of a data page) records on a data page. I removed an unused variable character field, which should save 2 bytes per record.

Strangely enough, a DBCC SHOWCONTIG before and after dropping this field shows exactly the same thing. I had to think about that for a minute, but it makes sense. SQL Server just removes the field from the schema, but makes no actual changes to the underlying data. So it happens very fast. And DBCC SHOWCONTIG is likely calculating its results from sys.allocation_units and sys.partitions, which would not necessarily get updated from a schema change.

By removing my one varchar field, the average record size is now 108 bytes, which would allow about 74 records on each data page. The table in question happens to have 700,000+ pages in it. And each page has very low fragmentation.

If I were doing a REBUILD, SQL Server would essentially copy the entire table into a new area, then swap it into the old table's place. By default, this is an offline operation.

However, a REORGANIZE is an online operation, and is done in-place. Each leaf-level page is defragmented, then more records are added, if possible. So, in my situation, it will do something like this:

go to first page
defrag it
go get some records from the next page to fill up this page
go to next page
if the page is empty, de-allocate it and go to next page
go to step 2

The good news about this, is that it is an online operation, and can be stopped at any time without a huge rollback.

The bad news is that in my situation, SQL Server has a lot of work to do because every record is fragmented - 2 bytes have been removed. Which means every page is fragmented. And every page will need some new records to fill it up.

Not sure how much of this is the GUI, and how much is the REORGANIZE. But unfortunately, I can't seem to stop the REORGANIZE from the GUI.

Also, it's nice to be able to see the elapsed seconds in the lower right corner of the screen when you do things from the command line. With the GUI, you can't see how long it is taking.

Hopefully, this will be done by tomorrow morning.

Monday, July 21, 2014

Doug's first mead part 4

See part 1, part 2, and part 3.

Doug's first mead Part 4

A couple of updates on the mead.

About two weeks ago, I racked the mead into another fermenter, added another pack of yeast, and the last of the nutrient/energizer combination.

Last night, I racked it again into a fermenter. Didn't add anything. Left a lot of yeast sediment at the bottom, and also left a lot of strawberry seeds in the bucket.

The mead looks fantastic. It's a pale pink with gold overtones.

It tastes funky. Somewhat sour. Definitely more of a champagne or wine taste rather than beer.

I think next time, I will not add another pack of yeast.

Sunday, June 29, 2014

Doug's first mead part 3

(see Part 1 and Part 2)

Yesterday

Yesterday at about 8pm, I started working a little on the mead.

Here was the status when I started:

honey and water mixture (about 2 gallons) was in the fermenter - it had been heated to 160F yesterday, and is now at room temperature
strawberry and water mixture (about 2 gallons)was in a second container - campden was added yesterday, and has had more than 24 hours to off-gas. It's also at room temperature.

I mixed a batch of Yeast Nutrient and Yeast Energizer in a 2-to-1 ratio. I used 3 tsp of Nutrient, and 1 & 1/2 tsp of Energizer. I mixed all 4 & 1/2 tsp in a little sterilized jar for use over the fermentation process.

I sterilized my thief, spoon, and hydrometer.

I mixed 2 & 1/2 tsp of pectic enzyme into the strawberry mixture, and let it sit for an hour. The directions said 1/2 tsp per gallon of must, added one hour before pitching the yeast. I figured I'd add it to just the strawberries and let it get going before I combined the honey and strawberries.

While the strawberries were resting, I took two whole dry vanilla beans, sliced them lengthwise, scraped out the seeds, and added them to about a cup of water in a small pot. I brought the water to just a simmer then let cool.

After the hour, I stirred up the strawberries and poured them into the fermenter with the honey. The strawberries still smell very fresh. I also noticed that a lot of the little seeds stayed in the bottom of the container, and didn't get added to the honey. I'm thinking that was a good thing, so maybe pre-processing the strawberries in a separate container was a good idea.

I had about 4 gallons of must at this point. I added some spring water to my vanilla pan to cool it more, then poured it into the must. I rinsed out the vanilla pan a little with some more water. Then I added water to bring it up to 5 gallons.

I'm using Harris-Teeter (our local grocery) brand Spring Water in gallon jugs. I did not sterilize or heat this water. I'm assuming it is sufficiently sterile out of the jug.

I stirred everything up real good, and took a hydrometer reading: 1.10.

I then added 1 packet of dry Red Star Wine Yeast, and 1 tsp of my yeast nutrient/energizer mixture directly to the must, and stirred well. Then I snapped on the fermenter bucket lid, added the air lock, and set in my dining room. The dining room is in the middle of the house, and probably has the most stable temperature in the house. We keep the house about 72F.

Today

Today at about noon, I added another dose (1 tsp) of the Yeast Nutrient/Energizer mixture and stirred vigorously.

I didn't see any bubbling through the air lock, but I definitely noticed fizzing as I mixed in the nutrient/energizer. Also, I noticed a difference in the odor when I opened the lid. Yesterday it was just strawberry and honey scent, today it smells more like sourdough.

Here's what it looks like now:

My friend did not give an initial gravity reading, but he did say he was shooting for 1.01 at the end. If I can get down to 1.01, then the final alcohol should be about 11% (13.0% - 2.0%).

Friday, June 27, 2014

Doug's first mead, part 2

(see part 1)

Today

I put the honey in the fermenter and prepped the strawberries today.

Trying to keep better notes on my processes and ingredients, so here's all the details.

The picture below shows 4 quarts of honey (1 gallon or about 12 pounds), along with a 6 pound bag of frozen strawberries from Costco.

Ingredients

I prepped the strawberries by blending them with water. They were still partially frozen, so they didn't blend easily. I filled up the blender, and then added water to about the level of the top of the strawberries, the whizzed them up. It took maybe 5 blender fulls to do all the strawberries.

Strawberries in blender

I was shooting for 2 gallons of strawberry puree. In the last, smaller batch, I added two campden tablets to the blender, i.e., 1 tablet per gallon. The puree and campden all went into a large food prep container I got from the kitchen supply store. Then I added spring water to bring it up to two gallons, and stirred it all up. Tin foil over the top, and need to let it sit for 24 hours.

Puree with campden

Got to talk quite a bit with John at Wilmington Homebrew Supply today about the mead. He suggested the campden for the fruit, rather than heating to retain the aroma.

After the puree was done, I put the 4 quarts of honey into a pot on the stove, and used some hot water to rinse out each quart jar. I added enough water to maybe double the volume - about 2 gallons. I stirred it very thoroughly, and heated the honey/water mixture to 160 degrees fahrenheit. The consistency was more like a thick beer wort rather than a thin syrup. I let the mixture sit at this temp for about 25 minutes.

The honey mixture went into my sterilized fermenting bucket. Snapped the lid on with the air lock. That will sit until the puree is ready.

Tomorrow

I plan to use pectic enzymes, so I did some research and found this interesting article (scroll down for English). Based on this article, I'm going to wait 24 hours for the campden to off-gas, then add the enzymes to the puree and let them work for 24 hours. Then I'll add them to the honey, top off to 5 gallons with spring water, then add the first yeast nutrient and pitch the yeast.

Also going to do a "staggered nutrient addition". Based on this article, I'm going to:

one dose with the yeast
one dose 24 hours after fermentation begins
one dose 48 hours after fermentation begins
one dose 7 days after fermentation begins ( or about 30% of fermentation left, based on gravity)

Doug's first Mead, part 1

Getting ready to make mean for the first time.

I'll be using a recipe from my friend Andrew Keener:

Recipe

Ingredients:

1 gallon (12 lbs) of honey.
6lbs of fresh strawberries.
Red star wine yeast.
Pectic Enzyme to treat 5 gallons.
Yeast nutrient if you want to... not required.

Steps:

Bring about 3 gallons of water up to 160 degrees Fahrenheit in a 6+ gallon pot. Dump all the honey in and cover. Let it hang out for 20 minutes or so. While waiting, puree the strawberries or chop them finely. Put them in a pot and bring it up to 156 degree Fahrenheit stirring occasionally. Cover and let sit for 20 minutes.

Dump honey and strawberries into a 5 gallon fermenter and top off with water to 5 gallons. Pitch yeast when it comes down to an appropriate temperature (for wine yeasts this is usually 80 degress fahrenheit).

Let it ferment for roughly 2 weeks or until it bubbles once every 90 seconds. Then transfer to secondary. The strawberries will have mostly disintegrated so I try to suck up as few of them as possible when transferring.

Then let it sit for another month. If you take gravity reading throughout I usually finish when the gravity gets down to 1.010.

When I bottle I add 3/4 cup of corn sugar. Alternatively you can add 1 cup of honey but I would steep the honey at 156 degrees fahrenheit for 20 minutes in a little hot water before adding it.

My Changes to the Recipe

I'm excited that i have some good Wilmington suburban honey from a friend. I has hoping to make this during strawberry season and use hand picked local strawberries, but I've missed the season.

I have a couple of changes I'm going to make to the process/recipe:

I'm going to use Campden on the strawberries, rather than heat them. This is on the recommendation from John at Wilmington Homebrew Supply. The idea is that the fruit is more sensitive to heat, and it will change the flavor.
I'm going to try "Staggered Nutrient Additions" as described in this BrewingTV Recipe by Curt Stock. The idea here is to make sure the yeast is very healthy and vigorous.
I may also try some additional yeast pitchings. Can't remember where I saw this, but it seems that the yeast can fall to the bottom, or become inactive. Re-pitching makes sure that the desired yeast remains dominant.
I'm going to add two vanilla beans to the honey at the very beginning when I'm heating the honey. I love a little vanilla sugar on strawberries. I'll split the beans, scrape the seeds in, and add the whole beans. I've used vanilla beans in other recipes - they tolerate heat well, and their flavor is alcohol-soluble.

Going to do most of this today, and will try to keep detailed notes as I go.

(see part 2)

Tuesday, February 11, 2014

Why shouldn't I defragment my SSD drive?

Many years ago, I read a book called Hard Disk Smarts: Everything You Need to Choose and Use Your Hard Disk.

Great book that helped me understand how a hard disk works, and how to organize my data in files and folders. Basic stuff that most kids know by age 10 now. But in 1988, people were still figuring that stuff out - this book was really helpful to me. It explained what was physically going on inside the hard drive.

It was the first time I'd heard about maintaining your Hard Drive (HD) performance by running a defragmentation utility.

Defragmentation (defragging) has become a standard maintenance operation on hard drives, and it's something I automatically set up on machines that I build. I'm sure I'm not alone.

But, I'm going to have to change.

I recently bought a Chromebook with 16GB of solid state (SS) storage, and no hard drive. (I love this machine, but that's another post). The operating system comes from SS, and I'm meant to keep my data files in the cloud. It boots extremely fast, and I am rarely waiting for data from the cloud. In other words, this Chromebook feels faster than any other machine I have.

The speedy feel is due to the solid state storage. It's not the CPU. It's not the amount of RAM. I have other machines that have more/better of both, and still feel slower. It's the hard drive.

Solid State Drives (SSDs) are generally faster than Hard Disk Drives (HDDs). But why not defragment?

Defragmentation does two main things:

It moves data to "fast" parts of the HDD
It consolidates file fragments

It turns out that neither of these things makes any sense on an SSD, and can actually be harmful.

To understand why, let's look at how HDDs work.

HDDs have spinning platters in them (see video below). In other words, there are moving parts.

Here's a video showing clearly the multiple platters on an old-school drive:

In order to read or write from a platter, the read/write head must move to the right position on the spinning disk (access time), then wait until the appropriate spot on the disk spins to the position under the read/write head (rotational latency time). Then, all the magnetic spots on the platters have to travel under the read/write head.

These two physical operations take time. And because these are physical operations we can affect how long each one takes.

So what do we mean by "fast" parts of the disk? In short, the outside edges of the platters, and/or near the head park location.

Each platter is "lined" from center to edge, creating circular regions called tracks. Then each track is sliced into pieces that contain data called sectors. Because the outside track is longer than the inside track, there are typically more sectors on the outside track. The actuator arms (the arms that move the read/write heads across the platters) move the read/write head from track to track, then sectors are read as they spin under the read/write head.

To reduce access time we could put data in a spot on the platter that does not require the actuator arm to move very far. In other words, put data on tracks close together. In the videos above, notice that read/write heads begin "parked" near the spindle at the center of the platter(s). To reduce access time, we could position data near the center of platters. Then the actuator arms do not have to move as far, so the access time is reduced.

Once the read/write head is in the right spot, it must read the magnetic sectors on the disk as they pass under the read/write head. Because there are more sectors on the outside tracks, more data can be read/written in one revolution of the platter. So, outside edges are "faster".

And what do we mean by "consolidate file fragments".

Files commonly get split into multiple pieces and stored in fragments on a drive. This usually happens when a file is too big to fit into an open space on a disk. Over time, this can happen to many files, and each file can be split up into many pieces (tens, hundreds, or even thousands of pieces.)

So, if a user want to open a file that is actually split into two pieces on the hard drive, the activities are:

access the track where fragment 1 is located (access time)
wait for platter to rotate the first sector of fragment 1 under the read/write head (rotational latency)
allow all sectors of fragment 1 to pass under the read/write head (read time)
access the track where fragment 2 is located (access time)
wait for platter to rotate the first sector of fragment 2 under the read/write head (rotational latency)
allow all sectors of fragment 2 to pass under the read/write head (read time)

If the file was all in one piece, items 4 and 5 would not have to be done! Imagine a file in a thousand pieces - that's a lot of extra work!

Alright, so fragmentation on a HDD can make it take longer to read/write data files. And to speed things up, we can de-fragment, which puts all the file fragments together, and puts them on fast parts of the platters.

But why not do that with SSDs?

Well, SSDs do not have spinning platters, or moving actuator arms - they have no moving parts. They consist of solid-state chips accessed electronically. This means that access times are the same for any storage location in the SSD. There are no fast locations and there are no slow locations.

Also, because the time to get to a piece of data does not depend on the location, having file fragments physically next to each other means nothing. Said another way, accessing sectors 1 and 2 is no faster or slower than accessing locations 1 and 50000. Proximity means nothing.

OK, so consolidating files doesn't help on an SSD. And there are no fast/slow sectors in an SSD. But why not defragment anyway?

Well, solid state memory chips actually have a limited life cycle, meaning that they wear out. Writing to the sector actually wears it out a little. So moving data around in a defragmentation operation actually reduces the life of your SSD.

So:

defragment your HDDs - it will help performance
DON'T defragement your SDDs - it won't help, and it actually hurts

Database by Doug

Wednesday, October 1, 2014

Self-Referential Join Example on Northwind Database

Self-Referential Join Example on Northwind Database

Wednesday, July 30, 2014

Reorganizing a Large Index from the SQL Management Studio GUI

Reorganizing a Large Index from the SQL Management Studio GUI

Monday, July 21, 2014

Doug's first mead part 4

Doug's first mead Part 4

Sunday, June 29, 2014

Doug's first mead part 3

Yesterday

Today

Friday, June 27, 2014

Doug's first mead, part 2

Today

Tomorrow

Doug's first Mead, part 1

Recipe

Ingredients:

Steps:

My Changes to the Recipe

Tuesday, February 11, 2014

Why shouldn't I defragment my SSD drive?

Blog Archive

Followers

About Me