0% found this document useful (0 votes)
31 views65 pages

Chapter 3

Uploaded by

fakrudinqanani
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
31 views65 pages

Chapter 3

Uploaded by

fakrudinqanani
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 65

Fundamentals of Database Systems

(CoSc2041)

Chapter Three

Database Models
Chapter Content
 What is model
 Flat-file database vs Relational database
 Hierarchical database model
 Network database model
 Relational database model
 Object Database Model
 Database Development Life Cycle

2
what is a database models?
 A model is a representation of real world objects
and events and their associations.
 It’s a data model that describes in an abstract
way how data is represented in an information
system or a database management system.
 the term model is ambiguous, since it can
cover:
 How the data is organized logically
 How the data is organized, physically
 How data is accessed by the end user

3
Flat-file database Vs Relational database
 There are two main types of database:
 Flat-file database and
 Relational database
 Flat-file database
 are ideal for small amounts of data that needs to be human readable or
edited by hand.
 Essentially all they are made up of is a set of strings in one or more files
that can be parsed to get the information they store great for storing
simple lists and data values, but can get complicated when you try to
replicate more complex data structures.
 One of the main problems with using flat files for even a semi-active
database is the fact that it is very prone to corruption.
 There is no inherent locking mechanism that detects when a file is being
used or modified, and so this has to be done on the script level.

4
Relational Database
 Have a much more logical structure in the way that it stores data
 Tables can be used to represent real world objects, with each fields acting
like an attribute.
 For example, a table called books could have the columns title, author and
ISBN, which describe the details of each book where each row in the table
is a new book.
 One major advantage of the relational model is that, if a database is
designed efficiently, there should be no duplication of any data; helping to
maintain database integrity.
 This can also represent a huge saving in file size, which is important when
dealing with large volumes of data.

5
Database Model
 Database model
 Collection of concepts that describe the structure of a database
 Provides means to achieve data abstraction
 Specify retrievals and updates on the database
 Allows the database designer to specify a set of valid operations
allowed on database objects
 Database model can be divided into four:
 Hierarchical database model
 Network database model
 Relational database model
 Object-oriented database model

6
Hierarchical database model
 Consists of an ordered set of trees in a parent child mode.
 Records are represented by rectangles.
 Allows a node to have only one parent.
 It has two data structure concepts:
 Records and PCR (parent child relationship).
 1-1 or 1-M link is allowed.
 Connection between child and its parent is called a Link.

7
Hierarchical database model…
 Advantages
 The hierarchical model is a relatively simple one – which is its main
strength.
 It is therefore particularly adapted to large, simple databases.
 E.g. Family Tree Problem
 Disadvantage
 We must write a program
 Addition, deletion, and search operations are very difficult.
 There is duplication of data.
 Records are ordered.
 Complex programming is required.

8
CSIT Department

Marc Sarma

Binyam Zecharias Binyam Zecharias

9
Network database model

 It is an extension of the hierarchical model


 To solve the main problem of hierarchical data models, namely the inability
to store N:N relationships, the next generation of data model was developed
and formalized in the late 1960’s
 How does this allow N:N relationships to be stored? Think back to our
example above, with the student that was taught by two teachers – in a
network data model, the data would be stored in the following way, which
avoids the duplication of the “Binyam” and “Zecharias” data:

10
CSIT Department

Marc Sarma

Binyam Zecharias

11
Network database model…
 As in the example above, the graphical representation of a network is done
using a mathematical tool called a direct graph this is constituted of two
types of components:
 Nodes (graphically, circles or boxes), which represent records.
 Edges (graphically, arrows), which represent relationships between the
nodes.
 The network data model is reasonably straightforward once you have
understood the hierarchical one, since it is merely an extension of it. It has,
however, its own specific nomenclature that is different:
 Note that the relationships in this model are still, strictly speaking, 1:N
relationships, not directly N:N. What allows N:N relationships to be stored
all the same is the fact that we allow records to be the members of several
record sets – so, in our example above, there will be two record sets “Sarma
teaches” and “Marc teaches” of which “Binyam” and “Zecharias” will both
be members.

12
Network database model…
 Advantages
 Easy to show the connection of items.
 Good for network type problem.
 Duplication of data is reduced as compared to hierarchical
model.
 Allows M-M relationships.
 Disadvantages
 Complexity problem.
 Addition, deletion, search operations are difficult.
 Programming is required.
13
Relational database model
 The main difference with the previous two models is that this one gets rid (free) of
physical pointers and of all the ensuing limitations.
 This makes the use of an RDBMS (Relational DBMS) much simpler – a user or an
application programmer only needs to understand the logical structure of data, not
how it is physically stored.
 The relational data model, as its predecessors, has its own associated terminology,
which is inherited from the mathematical theory on which it is based.
 The basic guiding principles in a relational data model are:
 Relation: a table with columns and rows.
 Attribute: it is a named column of a relation.
 Domain: is the set of allowable values for one or more attributes.
 Tuple: it is a row of a relation.
 Degree: is the number of attributes in a relation.
 Cardinality: is the number of row in a relation.
 Relational Database: is a collection of normalized relations with distinct
relation name.
 All of this might seem a bit abstract – let’s look at the following concrete example
to make things clearer:

14
Father- Birth- Academic-
Name
Name Year Status

Nebiyu Mulugeta 1982 Warning

Sewit Getachu 1984 Dean's List

Abdellah Oumer 1981 Promoted

Fasika Girma 1982 Promoted

15
Relational database model…
 The spreadsheet data above can be linked to the relational data model
concepts in the following way:
 The entire spreadsheet above is a relation, which we could call
“Student-Status”.
 (Nebiyu, Mulugeta, 1982, Warning), and (Sewit, Getachu, 1984,
Dean’s List) are the first two tuples of this relation. fulfil
 “Name”, “Father-Name”, “Birth-Year”, and “Academic-Status” are
the attributes of the relation.

16
Relational database model…
 This relation complies (fulfil) with the basic rules that govern the relational
data model:
 The order of the rows does not matter – the relation would be the same
if we re-ordered the records in alphabetical order of the name.
 The order of the columns does not matter – if “academic-status” came
before “birth-year”, the relation would still be the same.
 Each row is unique (we will come back to that later on).
 Values in one column are all of the same kind.
 The domains (or data types) for the values in the tuples must be
atomic, which means that:
 Allowed are: elementary data types such as text strings, integers,
dates, etc…
 Not allowed are: tuples, lists, sets, etc…

17
Relational database model…
 Note that modern RDBMS allow some flexibility with this last rule –
however, it is safer to keep it in mind as such, since you cannot commit any
mistakes by applying it.
 Relations in an RDBMS are described using what is called a schema – the
name of the relation, a tuple with its attributes and their respective data
types.
 The schema for our example above could be: Student-Status (Name text
string, Father-Name text string, Birth-Year integer, Academic-Status text
string)
 Strengths & Weaknesses
 It is simple to understand.
 It is highly flexible and easy to use search and access times are relatively slow
– slower, in any case, than in the previous two data models.

18
Object Database Model
 A data model is a logic organization of the real world objects (entities),
constraints on them, and the relationships among objects.
 A core object-oriented data model consists of the following basic object-
oriented concepts:
 Object and object identifier: Any real world entity is uniformly
modeled as an object
 Attributes and methods: every object has a state and a behavior (the
set of methods - program code - which operate on the state of the
object). The state and behavior encapsulated in an object are accessed
or invoked from outside the object only through explicit message
passing.

19
Relational Data Model

20
Relational Data Model Concepts
 It represents the database as a collection of relations.
 Informally, each relation resembles a table of values or, to some extent, a
"flat" file of records.
 When a relation is thought of as a table of values, each row in the table
represents a collection of related data values.
 We will introduce entity types and relationship types as concepts for
modeling real-world data in Chapter 5.
 In the relational model, each row in the table represents a fact that typically
corresponds to a real-world entity or relationship.
 The table name and column names are used to help in interpreting the
meaning of the values in each row.

21
Relational Data Model Concepts …

 The above table is called STUDENT because each row represents facts about a
particular student entity.
 The column names—Name, SSN, Homephone, address, etc.—specify how to
interpret the data values in each row, based on the column each value is in.

22
Relational Data Model Concepts …

 All values in a column are of the same data type.


 In the formal relational model terminology, a row is called
a tuple, a column header is called an attribute, and the
table is called a relation.
 The data type describing the types of values that can
appear in each column is called a domain. now we define
these terms; domain, tuple, attribute, and relation more
precisely.

23
Domains of attributes
 A domain D is a set of atomic values.
 By atomic we mean that each value in the domain is indivisible as far as the relational
model is concerned.
 A common method of specifying a domain is to specify a data type from which the data
values forming the domain are drawn.
 It is also useful to specify a name for the domain, to help in interpreting its values. Some
examples of domains follow:
 Ethiopia_Phone_Numbers: The set of 10-digit phone numbers valid in the
Ethiopia
 Names: The set of Names of persons.
 Employee_age: Possible ages of employees of a campany; each must be a value
between 15 and 80.

24
Logical definitions of domains
 A data type or format is also specified for each domain.
 For example, the data type for the domain Ethiopia_phone_numbers can be
declared as a character string of the form (ddd)ddd-dddd, where each d is a
numeric (decimal) digit and the first three digits form a valid telephone area
code.
 The data type for Employee_ages is an integer number between 15 and 80.
 A domain is thus given a name, data type, and format.
 Additional information for interpreting the values of a domain can also be
given;
 For example, a numeric domain such as Person_weights should have the
units of measurement pounds or kilograms.

25
Logical definitions of domains …
 A relation schema R, denoted by R(A1, A2, . . ., An), is made up of a relation name
R and a list of attributes A1, A2, . . ., An.
 Each attribute Ai is the name of a role played by some domain D in the relation
schema R.
 D is called the domain of Ai and is denoted by dom(Ai). A relation schema is used
to describe a relation; R is called the name of this relation.
 The degree of a relation is the number of attributes n of its relation schema.
 An example of a relation schema for a relation of degree 7, which describes
university students, is the following:
 STUDENT(Name, SSN, HomePhone, Address, OfficePhone, Age, GPA)
 For this relation schema, STUDENT is the name of the relation, which has seven
attributes.
26
Logical definitions of domains …
 We can specify the following previously defined domains for some of the
attributes of the STUDENT relation: dom(Name) = Names; dom(SSN) =
Social_security_numbers; dom(HomePhone) = Local_phone_numbers,
dom(OfficePhone) = Local_phone_numbers, and dom(GPA) =
Grade_point_averages

27
28
Mathematical Definition of Relation
 Consider two sets, D1 & D2, where D1 = {2, 4} and D2 = {1, 3, 5}.
 Cartesian product, D1 ´ D2, is set of all ordered pairs, where first element is
member of D1 and second element is member of D2.
 D1 XD2 = {(2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)}
 Alternative way is to find all combinations of elements with first from D1 and
second from D2.
 Any subset of Cartesian product is a relation; e.g. R = {(2, 1), (4, 1)}
 Consider three sets D1, D2, D3 with Cartesian Product D1 X D2 ´XD3; e.g.
 D1 = {1, 3} D2 = {2, 4} D3 = {5, 6}
 D1 ´XD2 ´XD3 = {(1,2,5), (1,2,6), (1,4,5), (1,4,6), (3,2,5), (3,2,6), (3,4,5),
(3,4,6)}

29
Mathematical Definition of Relation…
 Any subset of these ordered triples is a relation.

 The Cartesian product of n sets (D1, D2, . . ., Dn) is:

 D1 XD2 ´X . . . ´X Dn = {(d1, d2, . . ., dn) | d1 ∑D1, d2 ∑D2, . . ., dn∑ Dn}


usually written as:
 n

 XDi

 i=1

 Any set of n-tuples from this Cartesian product is a relation on the n


sets

30
Characteristics of Relations
 The earlier definition of relations implies certain characteristics that make a
relation different from a file.
 Ordering of Tuples in a Relation
 Tuples in a relation don’t have any particular order
 But when we display a relation as a table, the rows are displayed in
certain order.
 Tuple ordering is not part of a relation definition, because a relation
attempts to represent facts at a logical or abstract level and many
logical orders can be specified on a relation. (Eg. Tuples in a student
table can be ordered by Name, SSN or by AGE or by some other
attributes )
 The definition of a relation does not specify any order, and there is
no preference for one logical ordering over another.

31
Characteristics of Relations…
 Ordering of Values within a Tuple
 An n-tuple is an ordered list of n values, so the ordering of values in a
tuple- and hence of attributes in a relation schema- is important.
 Actually at logical level, the order of attributes and their values is not
that important as long as the correspondence between attributes and
values is maintained.(alternative definition if possible)
 Values and Nulls in the tuples
 Each value in a tuple is an atomic value; that is, it is not divisible within the
framework of the basic relational model.
 Composite and multi valued attributes are not allowed.
 Multivalued attributes are represented by separate relations, and composite
attributes are represented only by their simple component attributes in the basic
relational model.
 Null is used to represent the values of attributes that may be unknown or may not
apply to a tuple.

32
Interpretation (Meaning) of a Relation
 The relational schema can be interpreted as a declaration of assertion.
 Example
 The schema of STUDENT relation asserts that, in general, a student
entity has a Name, SSN, HomePhone, Address, Age, and GPA.
 Each tuple in the relation can then be interpreted as a fact or a particular
instance of the assertion.
 Example
 The first tuple in the relation STUDENT asserts the fact that there is a
student whose name is Benjamin Bayer, SSN is 305-61-2435, Age is 19
and so on.
 Some relations may represent facts about entities, whereas other relations
may represent facts about relationships. The relational model represents
facts about both entities and relationships uniformly as relations.

33
Relation Schema
 Consists of relation name, and a set of attributes or field names or
column names. Each attribute has an associated domain.

34
Keys for a Relation
 Key : A set of attributes K, whose values uniquely identify a tuple in any
instance. And none of the proper subsets of K has this property
 A key can not be determined from any particular instance data
 it is an basic property of a scheme
 it can only be determined from the meaning of attributes
 A relation can have more than one key.
 Each of the keys is called a candidate key
 Example: book (isbnNo, authorName, title, publisher, year )
 (Assumption : books have only one author )
 Keys : {isbnNo }, {authorName, title }
 Primary Key : One of the candidate keys chosen for indexing purposes

35
Relational Model Constraints
 Describes the various restrictions on data that can be specified on a
relational database schema in the form of constraints.
 These include domain constraints, key constraints, entity integrity, and
referential integrity constraints.
 Domain constraint
 Domain constraints specify that the value of each attribute A must be an
atomic value from the domain dom(A).
 The data types associated with domains typically include standard numeric
data types for integers (such as short-integer, integer, long-integer) and
real numbers (float and double-precision float).
 Characters, fixed-length strings, and variable-length strings are also
available, as are date, time, timestamp, and money data types.
 Other possible domains may be described by a sub range of values from a
data type or as an enumerated data type where all possible values are
explicitly listed. Rather than describe these in detail.

36
Relational Model Constraints…
 Key Constraints
 A relation is defined as a set of tuples.
 By definition, all elements of a set are distinct; hence, all tuples in a
relation must also be distinct.
 This means that no two tuples can have the same combination of values
for all their attributes.
 Usually, there are other subsets of attributes of a relation schema R
with the property that no two tuples in any relation state r of R should
have the same combination of values for these attributes.
 Suppose that we denote one such subset of attributes by PK.

37
Relational Model Constraints…
 Entity Integrity, Referential Integrity, and Foreign Keys Constraints
 The entity integrity constraint states that no primary key value can be null.
 This is because the primary key value is used to identify individual tuples in a
relation; having null values for the primary key implies that we cannot identify
some tuples.
 For example, if two or more tuples had null for their primary keys, we might not
be able to distinguish them.

 The referential integrity constraint is specified between two relations and is


used to maintain the consistency among tuples of the two relations.
 Informally, states that a tuple in one relation that refers to another relation must
refer to an existing tuple in that relation.
 For example, the attribute DNO of EMPLOYEE gives the department number
for which each employee works; hence, its value in every EMPLOYEE tuple
must match the DNUMBER value of some tuple in the DEPARTMENT relation.

38
Relational Model Constraints ….
 To define referential integrity more formally, we first define the concept of
a foreign key.
 The conditions for a foreign key, given below, specify a referential integrity
constraint between the two relation schemas R1 and R2.
 A set of attributes FK in relation schema R1 is a foreign key of R1 that
references relation R2 if it satisfies the following two rules:
 The attributes in FK have the same domain(s) as the primary key
attributes PK of R2; the attributes FK are said to reference or refer to
the relation R2.
 A value of FK in a tuple t1 of the current state r1(R1) either occurs as a
value of PK for some tuple t2 in the current state r2(R2) or is null. In the
former case, we have t1[FK] = t2[PK], and we say that the tuple t1
references or refers to the tuple t2. R1 is called the referencing
relation and R2 is the referenced relation.
39
Conceptual Database Design, E-R
Modeling

40
Overview
Database Design is the process of coming up with different kinds of
specification for the data to be stored in the database.
Is one of the middle phases we have in information systems
development where the system uses a database approach.
Design is the part on which we would be engaged to describe how
the data should be perceived at different levels and finally how it is
going to be stored in a computer system.

41
Levels of Database Design
The prime interest of a database development system will be the
Design part which is again sub divided into other three sub-phases.

42
Conceptual Design
 The process of constructing a model of the information used in an
enterprise, independent of all physical considerations.
 It is the source of information for the logical design phase.
 The important activities are to identify
 Entities
 Attributes
 Relationships
 Constraints
 And based on these components develop the ER model using ER
diagrams to describe the data

43
Logical Design
 The process of constructing a model of the information used in
an enterprise based on a specific data model (e.g. relational),
but independent of a particular DBMS and other physical
considerations.
 Activities are:
 Normalization process
 Collection of Rules to be maintained
 Discover new entities in the process
 Revise attributes based on the rules and the discovered Entities

44
Physical Design
 The process of producing a description of the implementation of the database on
secondary storage.
 Describes the storage structures and access methods used to achieve efficient
access to the data.
 Tailored to a specific DBMS system- Characteristics are function of DBMS and
operating systems
 Includes estimate of storage space
 Activities are specifications of:
 internal storage structure
 Indexes
 Access paths
 File organizations of the databases
45
Entity-Relationship (ER) Model

The entity-relationship (ER) data model allows us to describe the data involved in
a real-world enterprise in terms of objects and their relationships and is widely used
to develop an initial database design.
The ER model is important primarily for its role in database design.
It provides useful concepts that allow us to move from an informal description of
what users want from their database to a more detailed and precise, description that
can be implemented in a DBMS.

46
Main Components of ER
 Entity Type or Set:
 An entity type is any type of object that we wish to store data about
 They are specific objects or things in the mini-world that are represented
in the database
 Attribute:
 is some quality about the entities that we are interested in and want to
hold on the database
 These are behaviors used to describe an entity
 Relationships:
 are communications among entities
 Constraints

47
An Entity type
 Is a person, place, thing, event, or concept about which data is collected
 Each entity type shown in the conceptual model represents the entire class
for that entity
 Corresponds to entire table, not row
 The name given to an entity type should always be singular noun and
descriptive
 An entity is one individual/ member/ instance within an entity type
 For example, within the entity type STUDENT, Abebe might be one entity

48
Attributes
 Represents the property used to describe an entity or a
relationship
 An attribute is a descriptor whose values are associated with
individual entities of a specific type
 Each entity within the entity type will have the same set and
number of attributes, but in general different attribute values.
 We store the same type of facts (attributes) about every entity
within the entity type

49
Types of Attributes

 Simple
 Each entity has a single atomic value for the attribute. For example, ID,
Gender, etc.
 Composite
 The attribute may be composed of several components. For example:
 Address(Apt#, House#, Street, City, State, ZipCode, Country), or
 Name(FirstName, MiddleName, LastName).
 Telephone (home, mobile, office)
 Composition may form a hierarchy where some components are
themselves composite.
 Multi-valued
 An entity may have multiple values for that attribute. For example, Color
of a CAR or PreviousDegrees of a STUDENT or DependentName.
 Denoted as {Color} or {PreviousDegrees} or {DependentName}.

50
 Derived
 The value is derived (computed) from the values of other
attributes.
 Age (current year – year of birth)
 Profit (earning-cost)
 GPA (grade point/credit hours)
 Null Values
 attributes which are not applicable or which do not have
values.

51
 In general, composite and multi-valued attributes may be
nested arbitrarily to any number of levels, although this is
rare.
 For example, PreviousDegrees of a STUDENT is a composite
multi-valued attribute denoted by {PreviousDegrees (College, Year,
Degree, Field)}
 Multiple PreviousDegrees values can exist
 Each has four subcomponent attributes:
 College, Year, Degree, Field

52
Example of a composite attribute

53
Relationships
In any business processing one object may be associated with another object due
to some event. Such kind of association is what we call a RELATIONSHIP between
entity objects.
• One external event or process may affect several related entities.
• Related entities require setting of LINKS from one part of the database to
another.
• A relationship should be named by a word or phrase which explains its
function.
• Role names are different from the names of entities forming the relationship:
one entity may take on many roles, the same role may be played by different
entities.
• For each RELATIONSHIP, one can talk about the Number of Entities and
the Number of Tuples participating in the association. These two concepts
are called DEGREE and CARDINALITY of a relationship respectively.

54
Degree of Relationships
An important point about a relationship is how many entities participate in
it. The number of entities participating in a relationship is called the
DEGREE of the relationship.
Among the Degrees of relationship, the following are the basic:
 UNARY/RECURSIVE RELATIONSHIP: Tuples/records of a Single
entity are related with each other. E.g. prerequisite
 BINARY RELATIONSHIPS: Tuples/records of two entities are
associated in a relationship. E.g. Employee and Department
 TERNARY RELATIONSHIP: Tuples/records of three different entities
are associated. E.g. Customer, Branch and Loan
 N-NARY RELATIONSHIP: Tuples from arbitrary number of entity sets
55
are participating in a relationship.
Cardinality of Relationship
Another important concept about relationship is the number of
instances/tuples that can be associated with a single instance from one entity
in a single relationship.
The number of instances participating or associated with a single instance
from an entity in a relationship is called the CARDINALITY of the
relationship.
The major cardinalities of a relationship are:
 ONE-TO-ONE: one tuple is associated with only one other tuple.
 E.g. Building – Location as a single building will be located in a
single location and as a single location will only accommodate a

56
single Building.
ONE-TO-MANY: one tuple can be associated with many other tuples.
 E.g. Department-Student: as one department can have multiple
students.
MANY-TO-ONE: many tuples are associated with one tuple.
 E.g. Employee – Department: as many employees belong to a single
department.
MANY-TO-MANY: one tuple is associated with many other tuples and
from the other side, with a different role name one tuple will be associated
with many tuples.
 E.g. Student – Course: as a student can take many courses and a single

57 course can be attended by many students.


Entity-Relationship Diagrams
 A good entity-relationship design does not contain
redundant attributes.
 Symbols to represent ER components

58
Entity-Relationship Diagrams…
 Total and partial participation
 The participation of an entity set E in a relationship set R is
said to be total if every entity in E participates in at least
one relationship in R.
 If only some entities in E participate in relationships in R,
the participation of entity set E in relationship R is said to
be partial.
 Weak Entity
 An entity set that does not have sufficient attributes to
form a primary key is termed a weak entity set.
 An entity set that has a primary key is termed a strong
entity set.
 For a weak entity set to be meaningful, it must be
associated with another entity set, called the identifying
or owner entity set.
59
Example ER Diagram
 Consider our university example, we list the entity
sets and their attributes below, with primary keys
underlined:

60
Example ER Diagram…

61
Example ER Diagram…
 E-R diagram can express the overall logical structure of a
database graphically.
 E-R diagrams are simple and clear qualities that may well
account in large part for the widespread use of the E-R
model.
 E-R diagram corresponding to instructors and students.

 E-R diagram with an attribute attached to a relationship


set.

62
Example ER Diagram…
 To distinguish among these types, we draw either a
directed line (→) or an undirected line ( —) between
the relationship set and the entity set.
 Relationships. (a) One-to-one. (b) One-to-many. (c)
Many-to-many.

63
Example ER Diagram…
 The discriminator of a weak entity is underlined with a
dashed, rather than a solid, line.
 The relationship set connecting the weak entity set to
the identifying strong entity set is depicted by a double
diamond.

E-R diagram with a weak entity set

64
E-R diagram for a university enterprise

65

You might also like