Module 4 Tutorial:
Normalization
Normalization
Normalization is based on the observation
that relations with certain properties are
more effective in inserting, updating and
deleting data than other sets of relations
containing the same data
Normalization is a multi-step process of
decomposing unsatisfactory "bad" relations
by breaking up their attributes into smaller
relations
Source of the following hospital example: Atre, S. Data Base: Structured Techniques for Design,
Performance, and Management.
Unnormalized Relations
First step in normalization is to convert the data into a
two-dimensional table
In unnormalized relations data can repeat within a
column
Unnormalized Relation
Patient # Surgeon # Surg. date Patient Name Patient Addr Surgeon Surgery Postop drug
Drug side effects
Gallstone
s removal;
Jan 1, 15 New St. Beth Little Kidney
145 1995; June New York, Michael stones Penicillin, rash
1111 311 12, 1995 John White NY Diamond removal none- none
Eye
Charles Cataract
Apr 5, Field removal
243 1994 May 10 Main St. Patricia Thrombos Tetracyclin Fever
1234 467 10, 1995 Mary Jones Rye, NY Gold is removal e none none
Dogwood
Lane Open
Jan 8, Harrison, David Heart Cephalosp
2345 189 1996 Charles Brown NY Rosen Surgery orin none
55 Boston
Post Road,
Nov 5, Chester, Cholecyst
4876 145 1995 Hal Kane CN Beth Little ectomy Demicillin none
Blind Brook Gallstone
May 10, Mamaronec s
5123 145 1995 Paul Kosher k, NY Beth Little Removal none none
Eye
Cornea
Replacem
Apr 5, Hilton Road ent Eye
1994 Dec Larchmont, Charles cataract Tetracyclin
6845 243 15, 1984 Ann Hood NY Field removal e Fever
First Normal Form
To move to First Normal Form a relation must contain
only atomic values at each row and column.
No repeating groups
A column or set of columns is called a Candidate Key when
its values can uniquely identify the row in the relation.
First Normal Form
Patient # Surgeon # Surgery DatePatient Name Patient Addr Surgeon Name Surgery Drug adminSide Effects
15 New St.
New York, Gallstone
1111 145 01-Jan-95 John White NY Beth Little s removal Penicillin rash
15 New St. Kidney
New York, Michael stones
1111 311 12-Jun-95 John White NY Diamond removal none none
Eye
10 Main St. Cataract Tetracyclin
1234 243 05-Apr-94 Mary Jones Rye, NY Charles Field removal e Fever
10 Main St. Thrombos
1234 467 10-May-95 Mary Jones Rye, NY Patricia Gold is removal none none
Dogwood
Lane Open
Charles Harrison, Heart Cephalosp
2345 189 08-Jan-96 Brown NY David Rosen Surgery orin none
55 Boston
Post Road,
Chester, Cholecyst
4876 145 05-Nov-95 Hal Kane CN Beth Little ectomy Demicillin none
Blind Brook Gallstone
Mamaronec s
5123 145 10-May-95 Paul Kosher k, NY Beth Little Removal none none
Eye
Hilton Road Cornea
Larchmont, Replacem Tetracyclin
6845 243 05-Apr-94 Ann Hood NY Charles Field ent e Fever
Hilton Road Eye
Larchmont, cataract
6845 243 15-Dec-84 Ann Hood NY Charles Field removal none none
1NF Storage Anomalies
Insertion: A new patient has not yet undergone surgery --
hence no surgeon # -- Since surgeon # is part of the key, we
cannot insert.
Insertion: If a surgeon is newly hired and has not operated
yet -- there will be no way to include that person in the
database.
Update: If a patient comes in for a new procedure, and has
moved, we need to change multiple address entries.
Deletion (type 1): Deleting a patient record may also delete
all info about a surgeon.
Deletion (type 2): When there are functional dependencies
(like side effects and drug) changing one item eliminates
other information.
Second Normal Form
A relation is said to be in Second Normal Form when
every non-key attribute is fully functionally dependent
on the primary key.
That is, every non-key attribute needs the full primary
key for unique identification
Why is this not in 2NF?
Patient # Surgeon # Surgery DatePatient Name Patient Addr Surgeon Name Surgery Drug adminSide Effects
15 New St.
New York, Gallstone
1111 145 01-Jan-95 John White NY Beth Little s removal Penicillin rash
15 New St. Kidney
New York, Michael stones
1111 311 12-Jun-95 John White NY Diamond removal none none
Eye
10 Main St. Cataract Tetracyclin
1234 243 05-Apr-94 Mary Jones Rye, NY Charles Field removal e Fever
10 Main St. Thrombos
1234 467 10-May-95 Mary Jones Rye, NY Patricia Gold is removal none none
Dogwood
Lane Open
Charles Harrison, Heart Cephalosp
2345 189 08-Jan-96 Brown NY David Rosen Surgery orin none
55 Boston
Post Road,
Chester, Cholecyst
4876 145 05-Nov-95 Hal Kane CN Beth Little ectomy Demicillin none
Blind Brook Gallstone
Mamaronec s
5123 145 10-May-95 Paul Kosher k, NY Beth Little Removal none none
Eye
Hilton Road Cornea
Larchmont, Replacem Tetracyclin
6845 243 05-Apr-94 Ann Hood NY Charles Field ent e Fever
Hilton Road Eye
Larchmont, cataract
6845 243 15-Dec-84 Ann Hood NY Charles Field removal none none
Second Normal Form
Patient # Patient Name Patient Address
15 New St. New
1111 John White York, NY
10 Main St. Rye,
1234 Mary Jones NY
Charles Dogwood Lane
2345 Brown Harrison, NY
55 Boston Post
4876 Hal Kane Road, Chester,
Blind Brook
5123 Paul Kosher Mamaroneck, NY
Hilton Road
6845 Ann Hood Larchmont, NY
Second Normal Form
Surgeon # Surgeon Name
145 Beth Little
189 David Rosen
243 Charles Field
311 Michael Diamond
467 Patricia Gold
Second Normal Form
Patient # Surgeon # Surgery Date Surgery Drug Admin Side Effects
Gallstones
1111 145 01-Jan-95 removal
Kidney Penicillin rash
stones
1111 311 12-Jun-95 removal none none
Eye Cataract
1234 243 05-Apr-94 removal Tetracycline Fever
Thrombosis
1234 467 10-May-95 removal none none
Open Heart Cephalospori
2345 189 08-Jan-96 Surgery n none
Cholecystect
4876 145 05-Nov-95 omy Demicillin none
Gallstones
5123 145 10-May-95 Removal none none
Eye cataract
6845 243 15-Dec-84 removal none none
Eye Cornea
6845 243 05-Apr-94 Replacement Tetracycline Fever
1NF Storage Anomalies Removed
Insertion: Can now enter new patients
without surgery.
Insertion: Can now enter Surgeons who
have not operated.
Deletion (type 1): If Charles Brown
dies, the corresponding tuples from
Patient and Surgery tables can be
deleted without losing information on
David Rosen.
Update: If John White comes in for
third time, and has moved, we only
need to change the Patient table
2NF Storage Anomalies
Insertion: Cannot enter the fact that a
particular drug has a particular side
effect unless it is given to a patient.
Deletion: If John White receives some
other drug because of the penicillin
rash, and a new drug and side effect
are entered, we lose the information
that penicillin can cause a rash
Update: If drug side effects change (a
new formula) we have to update
multiple occurrences of side effects.
Third Normal Form
A relation is said to be in Third Normal
Form if there is no transitive functional
dependency between non-key attributes
When one non-key attribute can be determined
with one or more non-key attributes there is
said to be a transitive functional dependency.
The side effect column in the Surgery
table is determined by the drug
administered
Side effect is transitively functionally
dependent on drug so Surgery is not 3NF
Why is this not in 3NF?
Patient # Surgeon # Surgery Date Surgery Drug Admin Side Effects
Gallstones
1111 145 01-Jan-95 removal
Kidney Penicillin rash
stones
1111 311 12-Jun-95 removal none none
Eye Cataract
1234 243 05-Apr-94 removal Tetracycline Fever
Thrombosis
1234 467 10-May-95 removal none none
Open Heart Cephalospori
2345 189 08-Jan-96 Surgery n none
Cholecystect
4876 145 05-Nov-95 omy Demicillin none
Gallstones
5123 145 10-May-95 Removal none none
Eye cataract
6845 243 15-Dec-84 removal none none
Eye Cornea
6845 243 05-Apr-94 Replacement Tetracycline Fever
Third Normal Form
Patient # Surgeon # Surgery Date Surgery Drug Admin
1111 145 01-Jan-95 Gallstones removal Penicillin
Kidney stones
1111 311 12-Jun-95 removal none
1234 243 05-Apr-94 Eye Cataract removal Tetracycline
1234 467 10-May-95 Thrombosis removal none
2345 189 08-Jan-96 Open Heart Surgery Cephalosporin
4876 145 05-Nov-95 Cholecystectomy Demicillin
5123 145 10-May-95 Gallstones Removal none
6845 243 15-Dec-84 Eye cataract removal none
Eye Cornea
6845 243 05-Apr-94 Replacement Tetracycline
Third Normal Form
Drug Admin Side Effects
Cephalosporin none
Demicillin none
none none
Penicillin rash
Tetracycline Fever
2NF Storage Anomalies Removed
Insertion: We can now enter the fact that a particular
drug has a particular side effect in the Drug relation.
Deletion: If John White receives some other drug as a
result of the rash from penicillin, the information on
penicillin and rash is maintained.
Update: The side effects for each drug appear only
once.