100% found this document useful (2 votes)
172 views13 pages

Crack-Smoking Data Models

The document discusses rules for data normalization and challenges that arise when applying theory to real-world systems. While normalized data models aim to avoid redundancy and inconsistencies, real systems cannot contain all data points and require some redundant or de-normalized data for performance, flexibility, or manageability. The document advocates practical approaches like modeling what customers actually have, handling exceptions manually, and redesigning models to match development platforms over strict adherence to theoretical normal forms.

Uploaded by

newmedio-devteam
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
100% found this document useful (2 votes)
172 views13 pages

Crack-Smoking Data Models

The document discusses rules for data normalization and challenges that arise when applying theory to real-world systems. While normalized data models aim to avoid redundancy and inconsistencies, real systems cannot contain all data points and require some redundant or de-normalized data for performance, flexibility, or manageability. The document advocates practical approaches like modeling what customers actually have, handling exceptions manually, and redesigning models to match development platforms over strict adherence to theoretical normal forms.

Uploaded by

newmedio-devteam
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 13

Crack-Smoking

Data Models
Data Models

Theory Meets Reality


Rules of Normalization
(Simplified)

(Simplified)
Make sure every table has a primary key (best to have a single value)

1NF - Put any repeating groups into their own table

• Don’t have a column with a comma-separated list of values or multiple


columns to represent the same value multiple times

• 2NF/3NF - Put any information that is not dependent on the primary key
in its own table

If you have a table of employees (id, position, salary, name) and their
salary is fully determined by their position, move that into a separate
table of position/salary.

• 4NF/5NF - Make sure all three-way and above joins are indeed valid for
3-way, and don’t need further separation.
Rules of Normalization
(Simplified)
(Simplified)
• ORNF - The data model contains only elemental facts

• DKNF - The data model fully defines all constraints and is free from
“update anomalies” (the data model prevents any logical inconsistencies)

• Key - uniquely defines a tuple

Constraint - rule governing values of attributes

• In DKNF keys fully define tuples, and constraints fully define logical
relationships / allowed values, including multi-table constraints
Rules of Normalization
(Simplified)

(Simplified)
JBNF - Don’t Smoke Crack While Doing Data Models!

JBNF2 - Don’t Do Data Models For Customers Who Smoke Crack!

• Okay, I break this one a lot

• But seriously, most business rules/decisions are not based on whether


or not it contributes to a logical data model

And why should it?

• On the other end, when making a data model, we need to realize the
flexibility that businesses require, so they don’t have to re-make the
data model after every business decision

Nor should they have to think too much about the data model when
making business decisions
The Goal
• Normalized Data Models

Everything can be managed via standard operations

• No Redundant Data / No Calculated Columns

• Strong locality -- don’t have to worry if you forgot to set something

Strict Constraints Enforced

• Pushes the business logic back to the data model so it can be easily
managed outside of the application logic - no “update anomalies”

• Data is modeled as data - fewer text fields

• All data can be managed and verified


The Reality
• No system can contain all data points
This by itself leads to conflicts with theory
• Not all information is available
• Summary columns have to be managed which
summarizes data inside and outside the
database
• Some business rules are too
complicated/flexible to be modeled, and must
be abbreviated with flags and/or text fields
The Reality

• System performance demands redundant data

Many summaries are too complicated to be


recalculated each time
• Some of this can be mitigated with functional
indices

Some summaries may need to be altered based


on data exterior to the database
The Reality
• Some data is best stored de-normalized
• Management Issues- Do we really need a table
for that?
Performance Issues- Do we really want to
query for that?
• Time Issues - Do we really want to build the
interface to manage that?
• In some cases, maybe we can de-normalize our
database to save some sanity.
The Reality

• Some data models look whacked, because the


data they are modeling is whacked.
• In an ideal world, we would encourage the
customer to come up with a more consistent way
to manage themselves.
• But usually we just model what they have
because it’s easier than changing 20 years of
tradition and infrastructure
Case Study -
Homebuilder
• Builder has several divisions, each division is
responsible for an area (kind of - some are on top
of each other)

Builder has several brands


• Builder categorizes houses and communities by
lifestyle

Builder also needs to track home plans and


inventory
JB’s Rules of
Practicalization
• 1PF - Design a well-normalized database (the
level is up to you) which describes their data as
you understand it.
• 2PF - If they way that the customer talks about
their data is inconsistent, develop a vocabulary to
use when talking to them about their project
which matches the data model. Be sure to clarify
any unclear statements they make using the new
vocabulary.
JB’s Rules of
Practicalization
• 3PF - Determine which business rules are too
fungible to be implemented well by the database,
and instead make manual processes for dealing
with exceptions using flags and text fields
• 4PF - Determine if some values may have
exceptions based on incomplete information in
the database, and create user-maintainable
columns or tables for them
JB’s Rules for
Practicalization
• 5PF - Rejigger your data model so that it matches
your development platform nicely.

6PF - Create calculated columns based on real or


anticipated performance problems. Be sure
there are application-level measures taken to
keep these mostly consistent.
• 7PF - ?

You might also like