0% found this document useful (0 votes)
52 views82 pages

DBMS 3.1

A **Database Management System (DBMS)** is software that provides an interface for interacting with databases. Its primary role is to store, retrieve, and manage data efficiently. The DBMS serves as an intermediary between the user and the database, allowing users to access and manipulate data using queries without needing to understand the underlying physical structure of the database. ### Key Components of a DBMS: 1. **Database Engine**: Responsible for data storage, retrieval, and updating.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
52 views82 pages

DBMS 3.1

A **Database Management System (DBMS)** is software that provides an interface for interacting with databases. Its primary role is to store, retrieve, and manage data efficiently. The DBMS serves as an intermediary between the user and the database, allowing users to access and manipulate data using queries without needing to understand the underlying physical structure of the database. ### Key Components of a DBMS: 1. **Database Engine**: Responsible for data storage, retrieval, and updating.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd

Module 3

First Normal Form


• Defi ned to disallow multivalued attributes,
composite attributes, a n d their combinations
 It states that the domain of a n attribute must
include only atomic (simple, indivisible) values a n d
that the value of any attribute in a tuple must b e
a single value from the domain of that attribute
 1NF disallows relations within relations or relations
as attribute values within tuples
 The only attribute values permitted by
1NF are single
• atomic (or indivisible) values.
• Consider the DEPARTMENT relation schema shown in
Figure below

 Primary key is Dnum ber


 We assume that e a c h department c a n h a v e a number of
locations

 The DEPARTMENT s ch em a a n d a sample relation state are


shown in Figure below

As w e c a n see, this is no t in 1NF b e c a use Dlo c a tio ns is no t


an atomi c attribute, as illustrated by the first tuple in Figure
There are two ways w e c a n look at the Dlocations attribute:

• The domain of Dlocations contains atomic values, but


some tuples c a n have a set of these values. In this case,
Dlocations is not functionally dependent on the
primary key Dnumber

• The domain of Dlocations contains sets of values a n d


h e n c e is nonatomic. In this case, Dnumber→Dlocations
because e a c h set is considered a single member of the
attribute domain
In either case, the DEPARTMENT relation is not in 1NF
There are three main techniques to ach i ev e first normal form for such a

relation:

1. Remove the attribute Dlocations that violates 1NF a n d p l a c e it in a

separate relation DEPT_LOCATIONS along with the primary key

Dnumber of DEPARTMENT. The primary key of this relation is the

combination {Dnumber, Dlocation}. A distinct tuple in

DEPT_LOCATIONS exists for each location of a department. This

decomposes the non-1NF relation into two 1NF relations.

2. Expand the key so that there will b e a separate tuple

in the original DEPARTMENT relation for e a c h location of a

DEPARTMENT. In this case, the primary key b e c o m e s the combination

{Dnumber, Dlocation}. This solution has the disadvantage of

introducing redundancy in the relation


3. If a maximum number of values is known for the attribute—for
example, if it is known that at most three locations c a n exist for a
department— r e p l a c e the Dlocations attribute b y three atomic
attributes: Dlocation1, Dlocation2, a n d Dlocation3. This solution has the
disad van ta ge of introducing NULL values if most departments h a v e
fewer than three locations. Querying on this attribute b e c o m es m o re
d iffic ult;
• Of the three solutions, the first is generally considered best b e c a u s e it
does not suffer from redundancy a n d it is completely general, having
no limit p l a c e d on a maximum number of values
Second Normal Form
 Second normal form (2NF) is b a s e d on the c o n c e p t of full
functional d e p e n d e n c y
 A functional d e p e n d e n c y X → Y is a full functional
dependency if removal of a n y attribute A from X means
that the d e p e n d e n c y does not hold a n y more; that is,
for an y attribute A ε X, (X – {A}) does not functionally
determine Y
 A functional d e p e n d e n c y X→Y is a partial dependency
if some attribute A ε X c a n b e removed from X a n d the
d e p e n d e n c y still holds; that is, for some A ε X, (X – {A}) → Y
 Inthe a b ove figure , {Ssn, Pnum ber} →
Hours is a full dependency (neither Ssn → Hours nor
Pnumber→Hours holds)
 {Ssn, Pnumber}→Ename is partial because Ssn→Ename holds

•Definition. A relation schema R is in 2NF if every nonprime


attribute A in R is fully functionally dependent on the primary
key of R
 The test for 2NF involves testing for functional dependencies

whose left-hand side attributes are part of the primary key


 If the primary key contains a single attribute, the test n e e d not b e

applied at all
 The EM P_PRO J rela tio n is in 1NF b ut is no t in 2NF.

 The nonprime attribute Ename violates 2NF bec a use of FD2, as d o

the nonprime attributes Pname a n d Plocation be c a use of FD3


 The func tio nal d e p enden c ies FD2 and
FD3 m a ke Enam e , Pname, and Plocation partially
dependent on the primary key

• {Ssn, Pnumber} of EMP_PROJ, thus violating the 2NF test.


 If a relation schema is not in 2NF, it c a n b e second normalized.
Relation R can be decomposed into 2NF
 Therefore, the functional dependencies FD1, FD2, a n d FD3
lead to the decomposition of EMP_PROJ into the three
relation schemas EP1,
• EP2, a n d EP3 shown in Figure below, e a c h of which is in
2NF.
Third Normal Form
Transitive functional dependency: A functional d e p e n d e n c y X→Y in a
relation schema R is a transitive dependency if there exists a set of
attribute Z that are neither a primary nor a subset of any key of
R(candidate key) a n d both X  Z a n d Z Y holds
Example:

🠶 SSN  DMGRSSN is a transitive FD sinc e


SSN  DNUMBER a n d DNUMBER  DMGRSSN hold Dnumber is
neither a key itself nor a subset of the key of EMP_DEPT
🠶 SSN  ENAME is non-transitive sinc e there is no set o f a ttrib utes X
where SSN  X a n d X  ENAME
Definition: A relation schema R is in third normal form (3NF) if it is in 2NF and
no non-prime attribute A in R is transitively dependent on the primary key.

 The relation sch em a EMP_DEPT is in 2NF, since no partial


d ep en d en cies on a key exist. However, EMP_DEPT is not in 3NF
b e c a u s e of the transitive d e p e n d e n c y of Dmgr_ssn (and also
Dname) on Ssn via Dnumber

 We c a n normalize EMP_DEPT by decomposing it into the two


3NF relation schemas ED1 a n d ED2
General Definition of Second and Third Normal Form
• Takes into a c c o u n t all c a n d i d a t e keys of a relation into
account
 Definition of 2NF: A relation s c h e m a R is in second normal form
(2NF) if every nonprime attribute A in R is not partially
d e p e n d e n t on any key of R
•Definition: A relation schema R is in 2NF if every nonprime attribute A
in R is fully functionally dependent on the primary key of R
• Consid er the relatio n sc hem a LOTS whic h desc rib es
parcels of land for sale in various counties of a state
• Suppose there are two candidate keys: Property_id# and
{country_name, Lot#}, that is Lot# are unique within each
county but property_id# are unique across countries for the
entire state
 Based on the two candidate keys Property_id#
a n d {County_name, Lot#}, the functional dependencies FD1
a n d FD2 hold
🠶 FD1: Prop erty_id → { C ounty_na me,Lot#,Area,Pric e ,Ta x_rate }
🠶 FD2:{County_name,Lot#} →{Property_id, Area,Price,Tax_rate}
🠶 FD3: County_name → Tax_rate
🠶 FD4: Area → Price
 We choose Property_id# as the primary key, but no special
consideration will b e given to this key over the other
candidate key
 FD3 says that the tax rate is fixed for a given county (does not
vary lot by lot within the same county)
 FD4 says tha t the p ric e of a lot is
d e termine d by its a rea regardless of whic h c ounty it is in.
 The LOTS relation schema violates the general definition of 2NF
b e c a use Ta x_rate is pa rtia lly d e p enden t on the c a ndid a te key
{County_name, Lot#}, due to FD3
 To normalize LOTS into 2NF, w e d e c o m p o s e it into the two relations
LOTS1 a n d LOTS2

 We construct LOTS1 by removing the attribute Tax_rate that violates


2NF from LOTS a n d placing it with County_name (the left- h an d side
of
FD3 that causes the partial dependency) into another relation
LOTS2.
 Both LOTS1 a n d LOTS2 are in 2NF.
Definition of 3NF: A relation s c h e m a R is in third normal form (3NF) if,
whenever a nontrivial functional d e p e n d e n c y X→A holds in R,
either (a) X is a superkey of R, or (b) A is a prime attribute of R
 A c c ording to this d e finition, LOTS2 is in 3NF
 FD4 in LOTS1 violates 3NF b e c a u s e Area is not a superkey a n d
Price is not a prime attribute in LOTS1
 To normalize LOTS1 into 3NF, w e d e c o m p o s e it into the
relation schemas LOTS1A a n d LOTS1B
 We construct LOTS1A by removing the attribute Price that
violates 3NF from LOTS1 a n d placing it with Area (the lefthand
side of FD4 that causes the transitive d e p e n d e n c y ) into another
relation LOTS1B.

 Both LOTS1A a n d LOTS1B are in 3NF


Boyce-Codd Normal Form
Boyce-Codd normal form (BCNF) was proposed as a simpler
form of 3NF, but it was found to b e stricter than 3NF
 Every relation in BCNF is also in 3NF; however, a relation in
3NF is not n e c essarily in BC NF
 Suppose that w e have thousands of lots in the relation but the
lots are from only two counties: DeKalb a n d Fulton
 Suppose also that lot sizes in DeKalb County are only 0.5,
0.6, 0.7, 0.8, 0.9, a n d 1.0 acres, whereas lot sizes in Fulton
County are restricted to 1.1, 1.2, ..., 1.9, a n d 2.0 acres
 In such a situation we would have the additional
functional d e p e n d e n c y FD5: Area→County_name
🠶 If we a d d this to the other dependencies, the relation
schema LOTS1A still is in 3NF because County_name is a

You might also like