Data Modeling
SYSTEMS ANALYSIS AND DESIGN, 6 T H EDITION
DENNIS, WIXOM, AND ROTH
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 1
Learning Objectives
Explain the rules and style guidelines for creating entity
relationship diagrams (ERDs).
Create an ERD.
Describe the use of a data dictionary and metadata.
Explain how to balance ERDs and data flow diagrams.
Describe the process of normalization.
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 2
Key Definitions
Data model
o A formal way of representing the data that are used and created
by a business system
o Shows the people, places and things about which data is captured
and the relationships among them.
o Logical data model shows the organization of data without indicating
how it is stored, created, or manipulated
o Physical data model shows how the data will actually be stored in
databases or files.
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 3
Key Definitions
Entity Relationship Diagram (ERD)
o A popular way to depict the data model
Normalization is the process analysts use to
validate data models.
Data models should balance with process models
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 4
Why Is Data Modeling Crucial?
Data is a resource to be shared by as many
processes as possible.
Data organization must be flexible and
adaptable to unanticipated business
requirements – and that is the purpose of
data modeling.
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 5
Other Data Modeling Issues…
Data structures and properties are reasonably permanent –
more stable than the processes that use the data.
Typically very similar to the existing system.
Data models are much smaller than process models and are
constructed more rapidly.
Constructing the data model helps analysts and users quickly
reach consensus on business terminology and rules.
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 6
Entity Relationship
Diagrams
WHAT DO ERDS TELL US?
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 7
Reading an ERD
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 8
Using the ERD to Show Business Rules
Business rules are constraints that are followed when the
system is in operation.
ERD symbols can show when one instance of an entity must
exist for an instance of another to exist
o The chemical must exist before chemical requests can be created for
that chemical
o The lawn chemical applicator must exist before chemical requests
can be created by that lawn chemical applicator
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 9
Using the ERD to Show Business Rules,
con’t.
ERD symbols can show when one instance of
an entity can be related to only one or to
many instances of another entity
o One LCA can create many chemical requests; each
chemical request is created by only one LCA
o A chemical can be included on many chemical
requests; a chemical request is for only one
chemical
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 10
Using the ERD to Show Business Rules,
con’t.
ERD symbols show when the existence of an
entity instance is optional for a related entity
instance
o A chemical may exist without being included on
any chemical requests
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 11
An ERD Example
Entities
CUSTOMER has placed ORDER
Customer ID Order ID
Name Order Date
Address Order Total Cost
Telephone
Relationship
Attributes
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 12
Entity
A person, place, event, or thing about which
data is collected
Must be multiple occurrences to be an entity
Person Place Object Event Concept
STUDENT
STUDENT DORM BOOK
BOOK LIBRARY
LIBRARY COURSE
CHECK
CHECK-
OUT
OUT
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 13
CASE Entry for
Entity
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 14
Attributes
Information captured about an entity
Only those used by the organization should be included
in the model
Attribute names are nouns
Sometimes entity name is added at the beginning of the
attribute name for clarity
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 15
CASE Entry for
Attribute
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 16
Attributes – Understand Them!
Composite Attribute
Student ID Address:
STUDENT
Student Last Name Street
City
Student First Name State
Address Zip
Date of Birth
Multi-valued attribute:
Major
May be more than one
major allowed Gender
Grade Point Derived Attribute:
Calculated from other
values
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 17
Identifier Types
One or more attributes can serve as the entity identifier,
uniquely identifying each entity instance
Concatenated identifier consists of several attributes
An identifier may be ‘artificial,’ such as creating an ID
number
Final decision on identifiers may postponed to the Design
Phase
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 18
Identifier Types
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 19
Relationships
Associations between entities
The first entity in the relationship is the parent entity; the
second entity in the relationship is the child entity
Relationships should have active verb names
Relationships go in both directions
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 20
Cardinality
Refers to the number of times instances in one entity can
be related to instances in another entity
o One instance in an entity refers to one and only one instance in
the related entity (1:1)
o One instance in an entity refers to one or more instances in the
related entity (1:N)
o One or more instances in an entity refer to one or more
instances in the related entity (M:N)
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 21
Modality
Refers to whether or not an instance of a child entity can
exist without a related instance in the parent entity
o Not Null - an instance in the related entity must exist for an
instance in another entity to be valid
o Null - no instance in the related entity is necessary for an
instance in another entity to be valid
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 22
CASE Entry for
Relationship
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 23
Binary Relationships
EMPLOYEE is assigned PARKING
PLACE
is assigned to
one-to-one
PRODUCT includes PRODUCT
LINE
is included in
one-to-many
STUDENT registers for COURSE
registers
many-to-many
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 24
Foreign Keys
A relationship implies that instances of one entity are related to
instances of another entity
The primary key of one entity is migrated into the other entity as a
foreign key.
A foreign key is a primary key of one entity that is contributed to
(duplicated in) another entity for the purpose of identifying
instances of a relationship. A foreign key (always in a child entity)
always matches the primary key (in a parent entity).
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 25
Foreign Keys
Parking ID - PK
Emp ID - PK is assigned
EMPLOYEE PARKING Emp ID - FK
Emp Name
is assigned to PLACE Location
Emp Address
one-to-one
Prod Line ID - PK Product ID - PK
Prod Line Descrip PRODUCT
PRODUCT includes PRODUCT Prod Line ID - FK
PRODUCT
LINE
LINE is included in Prod Description
one-to-many
Student ID – PK Course ID – PK
?? FK ?? STUDENT
STUDENT registers for COURSE
COURSE ?? FK ??
Student Name registers Course Name
Student Address Course Descrip
many-to-many
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 26
Creating an ERD
HOW ERDS ARE DEVELOPED
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 27
Overview
Drawing the ERD is an iterative process of trial and
revision
ERDs can become quite complex
Steps in building ERDs…
o Identify the entities
o Add appropriate attributes for each entity
o Draw the relationships that connect associated entities
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 28
Identify the Entities
Identify major categories of information
o If available, check the process models for data stores, external
entities, and data flows
o Check the major inputs and outputs from the use cases
Verify that there is more than one instance of the entity
that occurs in the system
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 29
Add Appropriate Attributes
Identify attributes of the entity that are relevant to the system
under development
o Check the process model repository entries for details on data flows and
data stores
o Check the data requirements of the requirements definition
o Interview knowledgeable users
o Perform document analysis on existing forms and reports
Select the entity’s candidate identifier (final decision may be
postponed until Design phase)
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 30
Draw the Relationships
Start with an entity and identify all entities with which it
shares relationships
Describe the relationship with the appropriate verb
phrase
Determine the cardinality and modality by discussing the
business rules with knowledgeable users
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 31
ERD Building Tips
Data stores of the DFD generally correspond to entities
Only include entities with more than one instance
Don’t include entities associated with implementation of
the system (e.g., archive files of older data). They will be
added later.
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 32
Advanced Syntax - Intersection Entities
A new entity is created to store information about two
entities sharing an M:N relationship
o Remove the M:N relationship between two entities and insert new
entity between them
o Create two 1:N relationships: original entities are parents to the
new child intersection entity
o Name the intersection entity
o Migrate parent entity primary keys to new entity as foreign keys
(possibly also concatenated primary key)
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 33
Resolving M:N with an Intersection Entity
registers for COURSE registers
STUDENT REGIS- COURSE
TRATION
Student ID (PK FK) Course ID - PK
Student ID - PK
Course ID (PK FK) Course Name
Student Name
Semester (PK) Course Descrip
Student Address
Final Grade
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 34
Resolving M:N
Relationship
Additional Example
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 35
Validating an ERD
ENSURING A QUALITY DATA MODEL
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 36
Design Guidelines
Best practices rather than rigid rules
Entities should have many occurrences
Avoid unnecessary attributes
Clearly label all components
Apply correct cardinality and modality
Break attributes into lowest level needed
Labels should reflect common business terms
Assumptions should be clearly stated
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 37
Balancing ERDs with DFDs
All analysis activities are interrelated
Process models contain two data components
o Data flows and data stores
The DFD data components need to balance the ERD’s data stores
(entities) and data elements (attributes)
Many CASE tools provide features to check for imbalance
Check that all data stores and elements correspond between models
o Data that is not used is unnecessary
o Data that has been omitted results in an incomplete system
Do not follow thoughtlessly -- check that the models make sense!
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 38
Use of a CRUD
Matrix
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 39
Normalization
Technique used to validate data models
Series of rules applied to logical data model to improve its
organization
Three normalization rules are common
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 40
Example 1: Unnormalized Entity
ORDER
Begin with an entity from OrderNumber
the logical data model OrderDate
CustomerName
CustomerAddress consisting of:
Street
City
Do any attributes (or groups of attributes) occur State
more than once for a single occurrence of the ZipCode
CustomerType
entity? Initials
District Number
Region Number
1 to 22 Occurrences of:
Yes Item Number
Item Name
Quantity Ordered
Item Unit
Quantity Shipped
Item Price
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 41
Example 1: 1st Normal Form
Do any attributes (or groups of attributes) occur more than once for a single
occurrence of the entity?
If yes, remove the attributes (or groups) into separate entities.
ORDER ORDERED ITEM
OrderNumber Item Number
OrderNumber
includes
OrderDate
is included in
CustomerName Item Name
CustomerAddress consisting of: Quantity Ordered
Street Item Unit
City Quantity Shipped
State Item Price
ZipCode
CustomerType
Initials
District Number
Region Number
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 42
Potential Anomalies with 1st Normal Form
ORDER ORDERED ITEM
OrderNumber Item Number Insert anomaly:
OrderNumber
OrderDate
includes Insert a new Item?
is included in
CustomerName Item Name
CustomerAddress consisting of: Quantity Ordered Can’t do without OrderNumber
Street Item Unit
City Quantity Shipped
State Item Price
ZipCode
CustomerType
Initials
District Number
Deletion anomaly:
Region Number Assume only one order has been made for Item #456. What happens
if that order is cancelled?
Will lose all information about Item # 456
Update anomaly:
The price of Item # 789 should be changed from $7.99 to $8.95. What problem occurs?
Need to search entire database for all occurrences of Item # 789
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 43
Potential Anomalies with 1st Normal Form
ORDER ORDERED ITEM
OrderNumber Item Number
OrderNumber Reason anomalies exist?
includes
OrderDate
is included in
CustomerName Item Name
CustomerAddress consisting of: Quantity Ordered Several non-key attributes depend only on
Street Item Unit ItemNumber and NOT on the full primary
City Quantity Shipped
State Item Price key (ItemNumber + OrderNumber)
ZipCode
CustomerType
Initials
District Number
Region Number
ItemNumber Quantity
ItemName Ordered
Quantity
ItemPrice OrderNumber Shipped
ItemUnit
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 44
Example 1: 2nd Normal Form
For entities with concatenated keys…
Do any attributes depend on just part of the key rather than the entire key?
If yes, move partially-dependent attributes to a new entity…
ORDER ORDERED ITEM ITEM
OrderNumber includes/ Item Number Is included in/ Item Number
is included in OrderNumber includes
OrderDate Item Name
CustomerName Quantity Ordered Item Unit
CustomerAddress consisting of: Quantity Shipped Item Price
Street
City
State
ZipCode
CustomerType
Initials
District Number
Region Number
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 45
Potential Anomalies with 2nd Normal Form
ORDER ORDERED ITEM ITEM
OrderNumber includes/ Item Number Is included in/ Item Number
is included in OrderNumber includes
OrderDate Item Name
CustomerName Quantity Ordered Item Unit
CustomerAddress consisting of: Quantity Shipped Item Price
Street
City
State Reason Anomalies Exist:
ZipCode
CustomerType
Initials The ORDER entity contains transitive dependencies
District Number
Region Number
This means that several non-key attributes depend on another non-key attribute, and NOT
on the Primary Key
CustomerAddress, CustomerType, DistrictNumber, and RegionNumber depend on the
CustomerName, NOT the OrderNumber
Solution: remove these attributes to a separate entity
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 46
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 46
Example 1: 3rd Normal Form
Do any attribute values depend on an attribute that is not the entity’s key?
If yes, move these attributes to a new entity.
CUSTOMER ORDER ORDERED ITEM ITEM
CustomerNumber places/ OrderNumber includes/ Item Number is included in/ Item Number
is placed by is included in OrderNumber includes
CustomerName OrderDate Item Name
CustomerAddress consisting of: CustomerNumber Quantity Ordered Item Unit
Street Initials Quantity Shipped Item Price
City
State
ZipCode
CustomerType
District Number
Region Number
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 47
Summary of
Normalization
Steps
© 2015 JOHN WILEY & SONS. ALL RIGHTS RESERVED. 48