CMSC 424 Database design Lecture 3: Entity-Relationship Model Book: Chap.
1 and 6 Mihai Pop
Database Design Steps
Entity-relationship Model Typically used for conceptual database design
info
Conceptual DB design
Three Levels of Modeling
Conceptual Data Model
Logical DB design
Logical Data Model
Relational Model Typically used for logical database design
Physical DB design
Physical Data Model
Problems in our First Design
does not capture the fact that a library carries books of a specific author does not capture the fact that a library carries a specific book does not capture the fact that an author has written a specific book does not store which edition of the book the library has, how many copies, etc.
SS# NAME TEL BDATE AUTHOR
WROTE-ON
SNAME
ISBN SUBJECT INDEX BOOK TITLE
CARRY
LNAME
LIBRARY
2nd Attempt to the Library Design
SS# NAME TEL BDATE AUTHOR
WROTE
ISBN TITLE BOOK
INDEX
SUBJECT
SNAME
QUANTITY IN-STOCK EDITION
LNAME
LIBRARY
Much better
Next: Types of Attributes
Simple vs Composite
Single value per attribute ? E.g. Phone numbers are multi-valued If date-of-birth is present, age can be derived Can help in avoiding redundancy, enforcing constraints etc
Single-valued vs Multi-valued
Derived
Types of Attributes
access-date cust-name number
cust-id
customer
cust-street
has
account
cust-city
balance
Types of Attributes
age
multi-valued (double ellipse) (one can have many phone numbers) access-date derived (dashed ellipse) number cust-name (age can be computed from birth date)
cust-id date-of-birth cust-street
customer
has
account
cust-city
phone no.
balance
Types of Attributes
age cust-name
access-date number
cust-id date-of-birth cust-street phone no.
customer
has
account
month
day
cust-city year
balance
Composite Attribute
Next: Keys
Key = set of attributes that uniquely identifies an entity or a relationship
Entity Keys
date-of-birth cust-name
{cust-id} is a natural primary key Typically, SSN forms a good primary key Try to use a candidate key that rarely changes
e.g. something involving address not a great idea
cust-id age cust-street
customer
cust-city
phone no.
Entity Keys
Superkey
any set of attributes that can distinguish entities a minimal superkey
Candidate key
Cant remove any attribute and preserve key-ness
{cust-id, age} not a candidate key {cust-name, cust-city, cust-street} is assuming cust-name is not unique
Primary key
Candidate key chosen as the key by DBA Underlined in the ER Diagram
Entity Keys
date-of-birth cust-name
{cust-id} is a natural primary key Typically, SSN forms a good primary key Try to use a candidate key that rarely changes
e.g. something involving address not a great idea
cust-id age cust-street
customer
cust-city
phone no.
Relationship Set Keys
What attributes are needed to represent a relationship completely and uniquely ?
Union of primary keys of the entities involved, and relationship attributes
access-date cust-id number
customer
has
account
{cust-id, access-date, account number} describes a relationship completely
Relationship Set Keys
Is {cust-id, access-date, account number} a candidate key ?
No. Attribute access-date can be removed from this set without losing key-ness In fact, union of primary keys of associated entities is always a superkey
access-date cust-id number
customer
has
account
Relationship Set Keys
Is {cust-id, account-number} a candidate key ? Depends
access-date cust-id number
customer
has
account
Relationship Set Keys
Is {cust-id, account-number} a candidate key ? Depends
access-date cust-id number
customer
has
account
If one-to-one relationship, either {cust-id} or {account-number} sufficient
Since a given customer can only have one account, she can only participate in one relationship Ditto account
Relationship Set Keys
Is {cust-id, account-number} a candidate key ? Depends
access-date cust-id number
customer
has
account
If one-to-many relationship (as shown), {account-number} is a candidate key
A given customer can have many accounts, but at most one account holder per account allowed
Relationship Set Keys
General rule for binary relationships
one-to-one: primary key of either entity set one-to-many: primary key of the entity set on the many side many-to-many: union of primary keys of the associate entity sets More complicated rules
n-ary relationships
What have we been doing Why ? Understanding this is important
Rest are details !! Thats what books/manuals are for.
Next: Recursive Relationships
Sometimes a relationship associates an entity set to itself
Recursive Relationships
emp-name
emp-id
manager
employee
worker emp-street
works-for
emp-city
Must be declared with roles
Next: Weak Entity Sets
An entity set without enough attributes to have a primary key E.g. Transaction Entity
Attributes:
transaction-number, transaction-date, transaction-amount, transaction-type transaction-number: may not be unique across accounts
Weak Entity Sets
A weak entity set must be associated with an identifying or owner entity set Account is the owner entity set for Transaction
Still need to be able to distinguish between different weak entities associated with the same strong entity
number trans-date trans-number
Weak Entity Sets
account
has
Transaction
trans-type balance trans-amt
Weak Entity Sets
Discriminator: A set of attributes that can be used for that
number
trans-date trans-number
account
has
Transaction
trans-type balance trans-amt
Weak Entity Sets
Primary key:
Primary key of the associated strong entity discriminator attribute set For Transaction:
{account-number, transaction-number}
Example Design
We will model a university database
Main entities:
Professor Projects Departments Graduate students etc
SSN name
proj-number sponsor
professor
area rank
project
start budget
dept-no name
SSN name
dept
office homepage
grad
age degree
SSN name
proj-number sponsor
professor
area rank
project
start budget
dept-no name
SSN
name
dept
office homepage
grad
age degree
SSN name area rank
proj-number
PI
professor
sponsor
project
start budget
Co-PI
Chair
Appt
Supervises
RA
Time (%)
dept-no name
SSN name
dept
office homepage
Major
advisee
grad
age
advisor
degree
Mentor
SSN name area rank
proj-number
PI
professor
sponsor
project
start budget
Co-PI
Chair
Appt
Supervises
RA
Time (%)
dept-no name
SSN name
dept
office homepage
Major
advisee
grad
age
advisor
degree
Mentor
SSN name area rank
proj-number
PI
professor
sponsor
project
start budget
Co-PI
Chair
Appt
Supervises
RA
Time (%)
dept-no name
SSN name
dept
office homepage
Major
advisee
grad
age
advisor
degree
Mentor
Thoughts
Nothing about actual data
How is it stored ?
No talk about the query languages
How do we access the data ?
Semantic vs Syntactic Data Models
Remember: E/R Model is used for conceptual modeling Many conceptual models have the same properties
They are much more about representing the knowledge than about database storage/querying
Thoughts
Basic design principles
Faithful
Must make sense
Satisfies the application requirements Models the requisite domain knowledge
If not modeled, lost afterwards
Avoid redundancy
Potential for inconsistencies
Go for simplicity
Typically an iterative process that goes back and forth
Design Issues
Entity sets vs attributes
Depends on the semantics of the application Consider telephone (entity: customer, attribute: telephone) or (entity: telephone, entity: customer, relationship: customer_phone)
Entity sets vs Relationship sets
Consider loan (entity or relationship between customer and bank?)
N-ary vs binary relationships
Possible to avoid n-ary relationships, but there are some cases where it is advantageous to use them
It is not an exact science !!