Database by Doug: Obese Data

big data (noun) (Merriam-Webster): an accumulation of data that is too large and complex for processing by traditional database management tools

obese data (noun): an accumulation of data that is much larger and more complex than necessary

I made the second one up. After seeing lots of examples of obese databases.

In short, obese databases' effective size is much smaller than their actual size, e.g., a database that consumes 1TB of space, but has only 1GB of data. Further, the schema / architecture / model / framework / product / system / process is more complex than necessary.

Databases that have these problems:

they are slow
they could be a fraction of the size that they are
their growth rate is higher than it should be (GBs/day rather than Kbs/day)
their operations (backup, batch load, maintenance) take much longer than necessary
they are inflexible, e.g., not easily adaptable to solve a new problem, write a new report, add a new feature, etc.

All these problems above are leading people to think that they have big data, when they really have obese data.

Obese data might be viewed as a "database anti-pattern".

Things that contribute to obese data in the context of databases:

data types that are larger than they need to be

bigint that could be int
nchar that could be char
guid that could be int
datetime2 that could be date
char that could be varchar
char(50) that could be char(2)

unused columns
duplicate indexes
indexes that are never used
full-factorial indexing :-)

an index for every single column in the table
an index for every two-column combination in the table
etc.

data that should be archived
leftover staging data
redundant data
large numbers of NULL values
inappropriate schema

not "normalized" enough
too "normalized"

missing or inappropriate maintenance operations

hourly dbcc dbreindex to fillfactor 100% on a write-heavy table
fillfactor 10% on read-heavy table

Database by Doug

Thursday, January 28, 2016

Obese Data

No comments:

Post a Comment

Blog Archive

Followers

About Me