ĐẠI HỌC FPT CẦN THƠ
Chapter 2
Relational Data Model - Part 1
Objectives
1 Understand what is a Data Model
2 Understand what belong to a Data Model
3 Understand what are Relations, Attributes, Tuples, Domains
4 Understand what are Relation instances, Schema, DB schema
55 Understand how to use SQL to define relation schemas
Contents
1 An Overview of Data Models
2 Basics of the Relational Model
3 Defining a Relation Schema in SQL
2.1. An Overview of Data Models
2.1.1. What is a Data Model?
A Data Model is a notation for describing data,
including 3 parts:
Structure of the data: Arrays or objects, for
example
Operations on data: queries, modifications
Constraints: Limitations on the data
2.1.2.Important Data Models
Today, two important data models are:
The relational model, including object-relational
extensions
The semi-structured data model, including XML
and related standards
2.1.3. Overview of
Relational Model
2.1.4. Overview of
Semi-structured Data
Another data model, based on trees rather than
tables or arrays.
The point-of-view today is XML: a way to
represent data by hierarchically nested tagged
elements.
XML file
2.2. Basics of Relational Model
2.2.1. Attributes
The relational model represents data as a
2-dimensional table (called a relation)
For example: CUSTOMERS relation
2.2.1. Attributes
Each row represents a CUSTOMER
Each column represent a property of
CUSTOMER and also called a “attribute”
How many Attributes in the table? What are they?
2.2.1. Attributes
4 Attributes
2.2.2. Schemas
The name of a relation and the set of attributes for
that relation is called the schema for that relation.
What is the schema in the table?
2.2.2. Schemas
The schema for the relation:
CUSTOMERS (Customer ID, Tax ID, Name, Address)
2.2.3. Tuples
A row of a relation is called a tuple (or record)
When we want to write a tuple in isolation, not as
part of a relation, we normally use commas to
separate components
2.2.3. Tuples
A row of a relation is called a tuple (or record)
When we want to write a tuple in isolation, not
as part of a relation, we normally use commas to
separate components
Exp: (1234567890,555-5512222, Munmun, 323
Broadway)
2.2.3. Tuples
A row of a relation is called a tuple (or record)
When we want to write a tuple in isolation, not as
part of a relation, we normally use commas to
separate components
How many Tuples in the table?
What are they?
2.2.4. Domains
The relational model requires that each component of
each tuple must be atomic, that is, it must be of
some elementary type such as INTEGER or STRING
It is not permitted for a value to be a record structure,
set, list, array or any type that can have its values
broken into smaller components
A domain is a particular elementary type of a
attribute
2.2.5. Equivalent representation
of a relation
Relations are sets of tuples, not lists of tuples
So, the order in which the tuples of a relation are
presented is …
A. important
B. not important
2.2.5. Equivalent representation
of a relation
Relations are sets of tuples, not lists of tuples
So, the order in which the tuples of a relation are
presented is not important
2.2.6. Relation instances
A relation about CUSTOMERS is not static but
changing over time:
We want to insert tuples for new CUSTOMER as
these appear
We want to edit existing tuples if we get corrected
information about a CUSTOMER
We want to delete a tuple from the database
A set of tuples for a given relation is called an
instance of that relation
2.2.7. Keys of relations
A set of attributes forms a key for a relation if we
don’t allow 2 tuples in a relation instance to have the
same values in all the attributes of the key
What is the key of relation
CUSTOMERS (Customer ID, Tax ID, Name, Address)?
2.2.7. Keys of relations
The key of relation
CUSTOMER (Customer ID, Tax ID, Name, Address)
is CustomerID or TaxID
2.3. Defining a Relation Schema in SQL
2.3. Defining
a Relation Schema in SQL
2.3.1. Relations in SQL
2.3.2. Data Types
2.3.3. Simple Table Declarations
2.3.4. Modifying Relation Schemas
2.3.5. Default Values
2.3.6. Declaring Keys
Summary 1: Relational Model
? = relation.
? = attributes.
? = ?= tuple
? = name (attributes) + other structure info.,
e.g., keys, other constraints. Example: Beers(name,
manf)
Order of attributes is important or not?
? = set of rows for a relation schema.
? = collection of relation schemas.
Summary 1: Relational Model
Table = relation.
Column headers = attributes.
Row = record = tuple
Relation schema = name (attributes) + other structure info.,
e.g., keys, other constraints. Example: Beers(name,
manf)
Order of attributes is arbitrary, but in practice we need to
assume the order given in the relation schema.
Relation instance is current set of rows for a relation
schema.
Database schema = collection of relation schemas.
Summary 2: Why Relations?
Summary 2: Why Relations?
Very simple model.
Often a good match for the way we think about
our data.
Abstract model that underlies SQL, the most
important language in DBMS’s today.
But SQL uses “bags” while the abstract relational model
is set-oriented.
Exercises
Write SQL queries to create the following
Schemas in SQL Server:
Address(House#, StreetName, City, Province, Country)
Salaries(ID, IDPerson, PersonName, SalaryValue,
Month, Year)
Contacts(ID, ContactName, Address, Tel, eMail)
Write SQL queries to drop the above tables
Tools
MS SQL Server
SQL Server Management Studio
https://www.c-sharpcorner.com/article/step-
by-step-installation-of-microsoft-sql-server-
on-windows-system/
ĐẠI HỌC FPT CẦN THƠ
2.3. Defining
a Relation Schema
in SQL
Objectives
1 Understand relations in SQL
2 Can declare a Simple Table in SQL
Contents
1 Relations in SQL
2 Data types
3 Simple table declarations
4 Modifying relation schemas
“sequel”
2.3.1. Relations in SQL
Stored relations: tables
Views: (not stored but constructed): relations
defined by a computation
Temporary tables: relations are constructed
when executing queries/data modifications, then
thrown away
2.3.2. Data types
Data type Description
CHAR(size) A FIXED length string (can contain letters,
numbers, and special characters).
The size parameter specifies the column
length in characters - can be from 0 to 255.
Default is 1
VARCHAR(size) A VARIABLE length string (can contain letters,
numbers, and special characters).
The size parameter specifies the maximum
column length in characters - can be from 0 to
65535
BINARY(size) Equal to CHAR(), but stores binary byte strings.
The size parameter specifies the column
length in bytes. Default is 1
……
https://www.w3schools.com/sql/sql_datatypes.asp
2.3.2. Data types
Data type Description
INTEGER(size) Equal to INT(size)
FLOAT(size, d) A floating point number. The total number of digits is
specified in size. The number of digits after the
decimal point is specified in the d parameter. This
syntax is deprecated in MySQL 8.0.17, and it will be
removed in future MySQL versions
DOUBLE(size, d) A normal-size floating point number. The total
number of digits is specified in size. The number of
digits after the decimal point is specified in
the d parameter
DECIMAL(size, d) An exact fixed-point number. The total number of
digits is specified in size. The number of digits after
the decimal point is specified in the d parameter. The
maximum number for size is 65. The maximum
number for d is 30. The default value for size is 10.
The default value for d is 0.
… https://www.w3schools.com/sql/sql_datatypes.asp
2.3.3. Simple table declarations
CREATE TABLE MovieStar (
name CHAR(30) PRIMARY KEY,
address VARCHAR(255),
gender CHAR (1) ,
birthdate DATE
Can the key
);
have more
CREATE TABLE MovieStar (
name CHAR(30), than one
address VARCHAR(255), attribute?
gender CHAR(1),
birthdate DATE,
PRIMARY KEY (name)
);
2.3.3. Simple table declarations
CREATE TABLE MovieStar ( The key has
name CHAR(30) PRIMARY KEY, one attribute
address VARCHAR(255),
gender CHAR (1) ,
birthdate DATE
);
CREATE TABLE MovieStar (
name CHAR(30), The key has one or
address VARCHAR(255), more than one
attribute
gender CHAR(1),
birthdate DATE,
PRIMARY KEY (name)
);
ĐẠI HỌC FPT CẦN THƠ
2.4. An Algebraic Query
Language
Objectives
1 Understand why do we need Algebraic Query Language
2 Understand how Algebraic Query Language work
Contents
1 Why do we need Algebraic Query Language
2 An Algebraic Query Language
2.4.1. Why do we need a special
Query Language?
One should ask why we need a new kind of
programming languages for databases?
Won’t conventional languages like C or Java suffice
to ask and answer any computable question about
relations? Because we can represent a tuple of a
relation by struct (in C) or an object (in Java) and
we can represent relations by arrays of these
elements.
The surprising answer is that RA is useful
because it is less powerful than C or Java
2.4.2. What is an Algebra?
An algebra consists of operators and atomic
operands
(x + y).z or ((x + 7)/(y - 3)) + x
Operators? operands? Variables?
Constants?
Relational algebra is another example of an
algebra.
2.4.3. Relational algebra definition
Relational algebra:
Atomic operands:
Variables: relations
Constants: finite relations
Operators:
union, intersection, and difference
remove parts (rows/columns) of a relation
combine the tuples of two relation (with
conditions)
Rename (attributes/relation)
2.4.4. Set Operations
on Relations - Set Union
Sells1 removed the duplicate
bar beer price
Joe's Bud 2.50 Sells1 UNION Sells2?
Joe's Miller 2.75
Sue's Bud 2.50
Sells2
bar beer price
Joe's Bud 2.50
Joe's Miller 2.75
Sue's Miller 3.00
2.4.4. Set Operations
on Relations - Set Union
Sells1 removed the duplicate
bar beer price
Joe's Bud 2.50 Sells1 UNION Sells2
Joe's Miller 2.75 bar beer price
Sue's Bud 2.50 Joe's Bud 2.50
Joe's Miller 2.75
Sells2
Sue's Bud 2.50
bar beer price
Sue's Miller 3.00
Joe's Bud 2.50
Joe's Miller 2.75
Sue's Miller 3.00
2.4.4. Set Operations
on Relations - Set Union
Sells1 Retains the duplicate
bar beer price
Joe's Bud 2.50 Sells1 UNION ALL Sells2?
Joe's Miller 2.75
Sue's Bud 2.50
Sells2
bar beer price
Joe's Bud 2.50
Joe's Miller 2.75
Sue's Miller 3.00
2.4.4. Set Operations
on Relations - Set Union
Sells1 Retains the duplicate
bar beer price
Joe's Bud 2.50 Sells1 UNION ALL Sells2
Joe's Miller 2.75 bar beer price
Sue's Bud 2.50 Joe's Bud 2.50
Joe's Miller 2.75
Sells2
Sue's Bud 2.50
bar beer price
Joe's Bud 2.50
Joe's Bud 2.50
Joe's Miller 2.75
Joe's Miller 2.75
Sue's Miller 3.00 Sue's Miller 3.00
2.4.4. Set Operations
on Relations - Set Intersect
Sells1
bar beer price
Joe's Bud 2.50 Sells1 INTERSET Sells2?
Joe's Miller 2.75
Sue's Bud 2.50
Sells2
bar beer price
Joe's Bud 2.50
Joe's Miller 2.75
Sue's Miller 3.00
2.4.4. Set Operations
on Relations - Set Intersect
Sells1
bar beer price
Joe's Bud 2.50 Sells1 INTERSECT Sells2
Joe's Miller 2.75 bar beer price
Sue's Bud 2.50 Joe's Bud 2.50
Joe's Miller 2.75
Sells2
bar beer price
Joe's Bud 2.50
Joe's Miller 2.75
Sue's Miller 3.00
2.4.4. Set Operations on Relations
- Set Diference/Minus
Sells1 Sells2
bar beer price bar beer price
Joe's Bud 2.50 Joe's Bud 2.50
Joe's Miller 2.75 Joe's Miller 2.75
Sue's Bud 2.50 Sue's Miller 3.00
Sells1 MINUS Sells2
2.4.4. Set Operations on Relations
- Set Diference/Minus
Sells1 Sells2
bar beer price bar beer price
Joe's Bud 2.50 Joe's Bud 2.50
Joe's Miller 2.75 Joe's Miller 2.75
Sue's Bud 2.50 Sue's Miller 3.00
Sells1 MINUS Sells2
bar beer price
Sue's Bud 2.50
2.4.5. Projection
R1 := πL (R2)
L is a list of attributes from the schema of R2.
R1 is constructed by looking at each tuple of R2,
extracting the attributes on list L, in the order
specified, and creating from those components a
tuple for R1.
Eliminates duplicate tuples.
2.4.5. Projection
Relation Sells:
bar beer price
Joe’s Bud 2.50
Joe’s Miller 2.75
Sue’s Bud 2.50
Sue’s Miller 3.00
Prices := πbeer, price(Sells):
2.4.5. Projection
Relation Sells:
bar beer price
Joe’s Bud 2.50
Joe’s Miller 2.75
Sue’s Bud 2.50
Sue’s Miller 3.00
Prices := πbeer, price(Sells):
beer price
Bud 2.50
Miller 2.75
Miller 3.00
Extended Projection
Using the same πL operator, in Extended Projection,
we allow the list L to contain arbitrary expressions
involving attributes:
1. Arithmetic on attributes,
e.g., A+B->C.
1. Duplicate occurrences of the same
attribute.
Extended Projection
R= (A B)
1 2
3 4
πA+B->C, A, A (R) ?
Extended Projection
R= (A B)
1 2
3 4
πA+B->C, A, A (R) = C A1 A2
3 1 1
7 3 3
2.4.6. Selection
❖ R1 := σC (R2)
▪ C is a condition (as in “if” statements) that
refers to attributes of R2.
▪ R1 is all those tuples of R2 that satisfy C.
2.4.6. Selection
Relation Sells:
bar beer price
Joe’s Bud 2.50
Joe’s Miller 2.75
Sue’s Bud 2.50
Sue’s Miller 3.00
JoeMenu := σbar=“Joe’s”(Sells):
2.4.6. Selection
Relation Sells:
bar beer price
Joe’s Bud 2.50
Joe’s Miller 2.75
Sue’s Bud 2.50
Sue’s Miller 3.00
JoeMenu := σbar=“Joe’s”(Sells):
bar beer price
Joe’s Bud 2.50
Joe’s Miller 2.75
2.4.7. Product
or Cartesian product or cross-product
R3 := R1 Χ R2
Pair each tuple t1 of R1 with each tuple t2 of R2.
Concatenation (t1, t2) is a tuple of R3.
Schema of R3 is the attributes of R1 and then R2,
in order.
Beware: attribute A of the same name in R1 and
R2: use R1.A and R2.A.
2.4.7. Product
R3 := R1 Χ R2 ?
R1( A B)
1 2
3 4 1. How many Columns
in the result?
R2( B C) 2. How many Rows in
5 6 the result?
7 8
9 10
2.4.7. Product
R1( A B) R3( A R1.B R2.B C )
1 2 1 2 5 6
3 4 1 2 7 8
1 2 9 10
R2( B C) 3 4 5 6
5 6 3 4 7 8
7 8 3 4 9 10
9 10
2.4.8. Natural Join
R3 := R1 ⋈ R2.
A useful join variant (natural join) connects two
relations by:
Equating attributes of the same name
Projecting out one copy of each pair of equated
attributes.
2.4.8. Natural Join
Sells( bar, beer, price ) Bars( bar, addr )
Joe’s Bud 2.50 Joe’s Maple St.
Joe’s Miller 2.75 Sue’s River Rd.
Sue’s Bud 2.50
Sue’s Coors 3.00
BarInfo := Sells ⋈ Bars
Which attributes have the same name?
2.4.8. Natural Join
Sells( bar, beer, price ) Bars( bar, addr )
Joe’s Bud 2.50 Joe’s Maple St.
Joe’s Miller 2.75 Sue’s River Rd.
Sue’s Bud 2.50
Sue’s Coors 3.00
BarInfo := Sells ⋈ Bars
1. How many Columns in the result?
2. How many Rows in the result?
2.4.8. Natural Join
Sells( bar, beer, price ) Bars( bar, addr )
Joe’s Bud 2.50 Joe’s Maple St.
Joe’s Miller 2.75 Sue’s River Rd.
Sue’s Bud 2.50
Sue’s Coors 3.00
BarInfo := Sells ⋈ Bars
BarInfo( bar, beer, price, addr )
Joe’s Bud 2.50 Maple St.
Joe’s Milller 2.75 Maple St.
Sue’s Bud 2.50 River Rd.
Sue’s Coors 3.00 River Rd.
2.4.9. Theta Join
2.4.9. Theta Join
R S
A B C D
1 1 2 2
1 2 3 2
2 3 4 1
R⋈B>=CS
R S 2.4.9. Theta Join
A B C D
1 1 2 2
1 2 3 2 A B C D
2 3 4 1 1 1 2 2
1 2 2 2
2 3 2 2
R⋈B>=CS 1 1 3 2
1 2 3 2
2 3 3 2
1 1 4 1
1 2 4 1
2 3 4 1
R S 2.4.9. Theta Join
A B C D
1 1 2 2
1 2 3 2 A B C D
2 3 4 1 1 1 2 2
1 2 2 2
2 3 2 2
R⋈B>=CS 1 1 3 2
1 2 3 2
2 3 3 2
1 1 4 1
1 2 4 1
2 3 4 1
R S 2.4.9. Theta Join
A B C D
1 1 2 2
1 2 3 2 A B C D
2 3 4 1 1 1 2 2
1 2 2 2
2 3 2 2
R⋈B>=CS 1 1 3 2
1 2 3 2
2 3 3 2
1 1 4 1
R⋈B>=CS
1 2 4 1
A B C D 2 3 4 1
1 2 2 2
2 3 2 2
2 3 3 2
2.4.10. Renaming
R1 := ρR1(A1,…,An)(R2) makes R1 be a relation
with attributes name A1,…,An and the same
tuples as R2.
Simplified notation: R1(A1,…,An) := R2.
2.4.10. Renaming
Bars( name, addr )
Joe’s Maple St.
Sue’s River Rd.
R(bar, addr) := Bars
2.4.10. Renaming
Bars( name, addr )
Joe’s Maple St.
Sue’s River Rd.
R(bar, addr) := Bars
R( bar, addr )
Joe’s Maple St.
Sue’s River Rd.
2.4.11. Combining Operations
to Form Queries
relational algebra, like all algebras, allows us to form
expressions of arbitrary complexity by applying
operations to the result of other operations (using
parentheses).
It is possible to represent expressions as expression
trees.
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox
that are at least 100 minutes long?”
Answer?
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox
that are at least 100 minutes long?”
Answer?
1. Select those Movies tuples that have length >= 100.
formula?
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox
that are at least 100 minutes long?”
Answer?
1. Select those Movies tuples that have length >= 100.
formula? σlength >= 100
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox that are
at least 100 minutes long?”
Answer?
1. Select those Movies tuples that have length >= 100.
σlength >= 100
2. Select those Movies tuples that have studioName = 'Fox'.
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox that
are at least 100 minutes long?”
Answer?
1. Select those Movies tuples that have length >= 100.
σlength >= 100
2. Select those Movies tuples that have studioName =
'Fox'. formula?
σstudioName = ‘Fox’
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox that
are at least 100 minutes long?”
Answer?
1. σlength >= 100
2. σstudioName = ‘Fox’
3. Compute the intersection of (1) and (2).
formula?
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox that
are at least 100 minutes long?”
Answer?
1. σlength >= 100
2. σstudioName = ‘Fox’
3. Compute the intersection of (1) and (2).
formula? ∩ or INTERSECT
σlength >= 100 ∩ σstudioName = ‘Fox’
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox that
are at least 100 minutes long?”
Answer?
1. Select those Movies tuples that have length >= 100.
2. Select those Movies tuples that have studioName =
'Fox'.
3. Compute the intersection of (1) and (2).
4. Project the relation from (3) onto attributes title and
year. formula?
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox that
are at least 100 minutes long?”
Answer?
1. Select those Movies tuples that have length >= 100.
2. Select those Movies tuples that have studioName =
'Fox'.
3. Compute the intersection of (1) and (2).
4. Project the relation from (3) onto attributes title and
year. formula?
πtitle, year(σlength >= 100 ∩ σstudioName = ‘Fox’) (Movies)
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox
that are at least 100 minutes long?”
Answer represented as an expression tree?
2.4.11. Combining Operations
to Form Queries
Movies (title, year, length, genre, studioName,
producerC#)
“What are titles and years of movies made by Fox
that are at least 100 minutes long?”
Answer represented as an expression tree?
πtitle, year
∩
σlength >= 100 σstudioName = ‘Fox’
Movies Movies
Summary
❖RA is more useful than C or Java because it is
less powerful.
❖RA is an algebra: its atomic operands are:
▪ Variables that stand for relations
▪ Constants, which are finite relations
❖The six primitive operators of RA are: Selection,
Projection, Product, Union, Difference and
Rename
❖Other operators of RA are: Natural Join, Theta
Join, …
?
62
ĐẠI HỌC FPT CẦN THƠ
2.5. Constraints on Relations
How to express constraints via Relation Algebra
Introduction
We now take up the third important aspect of a data
model: the ability to restrict the data that may be
stored in a database
In this section, we show how to express both key
constraints and “referential-integrity” constraints
A constraint?
Introduction
A constraint is a mechanism that may be used
to limit the values entered into a column.
CONSTRAINTS
NULL CHECK FOREIGN PRIMARY
Attribute-based check
Tuple-based check
Introduction
2 ways to express constraints with RA
Express Foreign Key Constraints
Express Primary Key Constraints
Express Check Constraint
2.5.1. Relation Algebra
as a constraint Language
2.5.2. Referential integrity Constraints
2.5.2. Referential integrity Constraints
2.5.2. Referential integrity Constraints
2.5.2. Referential integrity Constraints
2.5.3. Key Constraints
2.5.3. Key Constraints
Exp: Foreign Key Constraints
Sells( bar, beer, price ) Bars( bar, addr )
Joe’s Bud 2.50 Joe’s Maple St.
Joe’s Miller 2.75 Sue’s River Rd.
Sue’s Bud 2.50
Sue’s Coors 3.00
1. Which one refers to which?
2. Express this constraint by the set-containment.
2.5.3. Key Constraints
Exp: Foreign Key Constraints
Sells( bar, beer, price ) Bars( bar, addr )
Joe’s Bud 2.50 Joe’s Maple St.
Joe’s Miller 2.75 Sue’s River Rd.
Sue’s Bud 2.50
Sue’s Coors 3.00
1. Sells references to Bars
2.5.3. Key Constraints
2.5.3. Key Constraints
Exp: Primary Key
Constraints
1. Which one is
the primary
key?
2. Express this
constraint by
the set-
containment
2.5.3. Key Constraints
Exp: Primary Key
Constraints
1. OrderID is
the primary
key.
2.5.4. Express Check Constraints
3. Exercises (HW)
Given following relational schema:
Product(model, maker, type)
PC(model, speed, ram, hd, price)
Laptop(model, speed, ram, hd, screen, price)
Printer(model, color, type, price)
Use Relational Algebra to express following constraints:
1. A PC with a processor speed less than 3.00 must not sell
for more than $800
2. A laptop with a screen size less than 15.4 inches must
have at least a 120GB hard disk or sell for less than $1000
3. If a laptop has a larger RAM than a PC, then the laptop
must also have a higher price than the PC
4. No manufacturer of PC’s may also make printers
5. a. Higher model, higher price
b. With the same model, higher speed and RAM and HDD,
higher price
ĐẠI HỌC FPT CẦN THƠ