0% found this document useful (0 votes)
37 views51 pages

02 Introduction To Database Management Systems - 2023-4

Database management system book 2

Uploaded by

maarianto04
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
37 views51 pages

02 Introduction To Database Management Systems - 2023-4

Database management system book 2

Uploaded by

maarianto04
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 51

Introduction to Database Management

Data Models
• Databases must represent:
• the data itself (typically structured in some way)
• associations between different data values
• optionally, constraints on data values
• What kind of data can be modeled?
• What kinds of associations can be represented?
• The data model specifies:
• what data can be stored (and sometimes how it is stored)
• associations between different data values
• what constraints can be enforced
• how to access and manipulate the data

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relations
• Relations are basically tables of data acct_id branch_name balance

• Each row represents a record in the relation A-301 New York 350
A-307 Seattle 275
• A relational database is a set of relations A-318 Los Angeles 550
• Each relation has a unique name in the database ... ... ...
• Each row in the table specifies a relationship The account relation
between the values in that row
• The account ID “A-307”, branch name “Seattle”, and
balance “275” are all related to each other

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relations and Attributes
• Each relation has some number of attributes acct_id branch_name balance

• Sometimes called “columns” A-301 New York 350


A-307 Seattle 275
• Each attribute has a domain A-318 Los Angeles 550
• Specifies the set of valid values for the attribute ... ... ...
• The account relation: account
• 3 attributes
• Domain of balance is the set of nonnegative integers
• Domain of branch_name is the set of all valid branch names in the bank

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Tuples and Attributes
• Each row is called a tuple acct_id branch_name balance
• A fixed-size, ordered set of name-value pairs A-301 New York 350
• A tuple variable can refer to any valid tuple in a A-307 Seattle 275
relation A-318 Los Angeles 550
• Each attribute in the tuple has a unique name ... ... ...

• Can also refer to attributes by index account


• Attribute 1 is the first attribute, etc.
• Example:
• Let tuple variable t refer to first tuple in account relation
• t[balance] = 350
• t[2] = “New York”
www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia
Tuples and Relations
• A relation is a set of tuples
• Each tuple appears exactly once
• Note: SQL tables are multisets! (Sometimes called bags.)
• If two tuples t1 and t2 have the same values for all attributes, then t1 and t2 are
the same tuple (i.e. t1 = t2)
• The order of tuples in a relation is not relevant

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Example of a Relation
attributes
(or columns)

tuples
(or rows)

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Attribute Types
• The set of allowed values for each attribute is called the domain of the
attribute
• Attribute values are (normally) required to be atomic; that is,
indivisible
• The special value null is a member of every domain
• Indicated that the value is “unknown”
• The null value causes complications in the definition of many
operations

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relation Schemas
• Every relation has a schema
• Specifies the type information for relations
• Multiple relations can have the same schema
• A relation schema includes:
• an ordered set of attributes
• the domain of each attribute
• Naming conventions:
• Relation names are written as all lowercase
• Relation schema’s name is capitalized
• For a relation r and relation schema R:
• Write r(R) to indicate that the schema of r is R

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relation Schema and Instance
• A1, A2, …, An are attributes

• R = (A1, A2, …, An ) is a relation schema


Example:
instructor = (ID, name, dept_name, salary)
• Formally, given sets D1, D2, …. Dn a relation r is a subset of
D1 x D2 x … x Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai  Di

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relation Schema and Instance (cont.)
• The current values (relation instance) of a relation are specified by a
table
• An element t of r is a tuple, represented by a row in a table

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Schema of account Relation
• The relation schema of account is: acct_id branch_name balance

Account_schema = (acct_id, branch_name, balance) A-301 New York 350


A-307 Seattle 275
• To indicate that account has schema A-318 Los Angeles 550
Account_schema:
... ... ...
account(Account_schema)
account

• Important note:
• Domains are not stated explicitly in this notation!

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relation Schemas
• Relation schemas are ordered sets of attributes
• Can use set operations on them
• Examples:
Relations r(R) and s(S)
• Relation r has schema R
• Relation s has schema S
R∩S
• The set of attributes that R and S have in common
R–S
• The set of attributes in R that are not also in S
• (and, the attributes are in the same order as R)
K⊆R
• K is some subset of the attributes in relation schema R
www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia
Attribute Domains
• The relational model constrains attribute domains to be atomic
• Values are indivisible units
• Mainly a simplification
• Virtually all relational database systems provide non-atomic data types
• Attribute domains may also include the null value
• null = the value is unknown or unspecified
• null can often complicate things. Generally considered good practice to avoid
wherever reasonable to do so.

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relations and Relation Variables
More formally: acct_id branch_name balance

• account is a relation variable A-301 New York 350


A-307 Seattle 275
• A name associated with a specific schema, and a set of
tuples that satisfies that schema A-318 Los Angeles 550
• (sometimes abbreviated “relvar”) ... ... ...

• A specific set of tuples with the same schema is called account


a relation value (sometimes abbreviated “relval”)
• (Formally, this can also be called a relation)
• Can be associated with a relation variable
• Or, can be generated by applying relational operations to
one or more relation variables

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relations and Relation Variables (2)
• Problem: acct_id branch_name balance

• The term “relation” is often used in slightly different A-301 New York 350
ways A-307 Seattle 275
A-318 Los Angeles 550
• “Relation” usually means the collection of tuples ... ... ...
• i.e. “relation” usually means “relation value”
account
• It is often used less formally to refer to a relation
variable and its associated relation value
• e.g. “the account relation” is really a relation variable
that holds a specific relation value

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relations are Unordered
• Order of tuples is irrelevant (tuples
may be stored in an arbitrary order)
• Example: instructor relation with
unordered tuples

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Distinguishing Tuples
• Relations are sets of tuples…
• No two tuples can have the same values for all attributes…
• But, some tuples might have the same values for some attributes
• Example: acct_id branch_name balance
• Some accounts have the same balance A-301 New York 350

• Some accounts are at the same branch A-307 Seattle 275


A-318 Los Angeles 550
A-319 New York 80
A-322 Los Angeles 275
account

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Keys
• Keys are used to distinguish individual tuples
• A superkey is a set of attributes that uniquely
identifies tuples in a relation
• Example: acct_id branch_name balance
{acct_id} is a superkey
A-301 New York 350
• Is {acct_id, balance} a superkey? A-307 Seattle 275
• Yes! Every tuple will have a unique set of values for A-318 Los Angeles 550
this combination of attributes.
A-319 New York 80
• Is {branch_name} a superkey? A-322 Los Angeles 275
• No. Each branch can have multiple accounts account

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Superkeys and Candidate Keys
• A superkey is a set of attributes that uniquely identifies tuples in a
relation
• Adding attributes to a superkey produces another superkey
• If {acct_id} is a superkey, so is {acct_id, balance}
• If a set of attributes K ⊆ R is a superkey, so is any superset of K
• Not all superkeys are equally useful…
• A minimal superkey is called a candidate key
• A superkey for which no proper subset is a superkey
• For account, only {acct_id} is a candidate key

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Primary Keys
• A relation might have several candidate keys
• In these cases, one candidate key is chosen as the primary means of
uniquely identifying tuples
• Called a primary key
cust_id branch_name balance
• Example: customer relation 23-652 Joe Smith 330-25-8822
• Candidate keys could be: 15-202 Ellen Jones 221-30-6551
{cust_id} 23-521 Dave Johnson 005-81-2568
{cust_ssn}
... ... ...
• Choose primary key:
customer
{cust_id}

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Primary Keys (2)
• Keys are a property of the relation schema, not individual tuples
• Applies to all tuples in the relation
• Primary key attributes are listed first in relation schema, and are
underlined
• Examples:
Account_schema = (acct_id, branch_name, balance)
Customer_schema = (cust_id, cust_name, cust_ssn)
• Only indicate primary keys in this notation
• Other candidate keys are not specified

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Primary Keys (3)
• Multiple records cannot have the same cust_id branch_name balance

values for a primary key! 23-652 Joe Smith 330-25-8822


15-202 Ellen Jones 221-30-6551
• …or any candidate key, for that matter…
23-521 Dave Johnson 005-81-2568
• Example: 15-202 Albert Stevens 450-22-5869
customer(cust_id, cust_name, cust_ssn) ... ... . . . relation
The customer
• Two customers cannot have the same ID. customer
• This is an example of an invalid relation
• The set of tuples doesn’t satisfy the required
constraints

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Keys Constrain Relations
• Primary keys constrain the set of tuples that can appear in a relation
• Same is true for all superkeys
• For a relation r with schema R
• If K ⊆ R is a superkey then
⟨ ∀t1, t2 ∈ r(R) : t1[K] = t2[K] : t1[R] = t2[R] ⟩
• i.e., if two tuple-variables have the same values for the superkey attributes,
then they refer to the same tuple
• t1[R] = t2[R] is equivalent to saying t1 = t2

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Choosing Candidate Keys
• Since candidate keys constrain the tuples that can be stored in a
relation…
• Attributes that would make good (or bad) candidate keys depend on what is
being modeled
• Example: customer name as candidate key?
• Very likely that multiple people will have same name
• Thus, not a good idea to use it as a candidate key
• These constraints motivated by external requirements
• Need to understand what we are modeling in the database

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Super Key vs. Candidate Key vs. Primary Key
Super Key
• Definition: A super key is an attribute or a set of attributes that can uniquely
identify a record in a database table. A super key may contain additional
attributes that are not necessary for unique identification.
• Uniqueness: Super keys guarantee that no two rows have the same set of
values in the super key columns. However, they are not minimal, meaning
a super key can have extra attributes that aren’t needed to maintain
uniqueness.
• Multiplicity: A table can have multiple super keys, varying in the number of
attributes they include, as long as the combination ensures uniqueness.

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Super Key vs. Candidate Key vs. Primary Key
Candidate Key
• Definition: A candidate key is a minimal super key, meaning it is a super
key with no possible reduction in the number of columns that can still
uniquely identify a record. Essentially, it is the minimal subset of a super
key.
• Uniqueness and Non-nullability: Each candidate key must contain unique
values and cannot contain null values, ensuring that every record within
the table can be uniquely identified by this key.
• Multiplicity: A table can have multiple candidate keys if there are multiple
minimal sets of fields that can uniquely identify a record in the table.

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Super Key vs. Candidate Key vs. Primary Key
Primary Key
• Definition: The primary key is a specific candidate key selected by the
database designer to uniquely identify records in a table. It is the most
important key and is used to reference other tables (foreign keys).
• Uniqueness and Non-nullability: The primary key must consist of unique
values and cannot contain null values. It enforces entity integrity by
uniquely identifying each record in a table.
• Singular Presence: A table can have only one primary key. This can be a
single column (simple primary key) or composed of multiple columns
(composite primary key) to ensure uniqueness across records.

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Foreign Keys
• One relation schema can include the attributes of another schema’s
primary key
• Example: depositor relation
• Depositor_schema = (cust_id, acct_id)
• Associates customers with bank accounts
• cust_id and acct_id are both foreign keys
• cust_id references the primary key of customer
• acct_id references the primary key of account
• depositor is the referencing relation
• It refers to the customer and account relations
• customer and account are the referenced relations
www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia
depositor Relation
cust_id cust_name cust_ssn acct_id branch_name balance
23-652 Joe Smith 330-25-8822 A-301 New York 350
15-202 Ellen Jones 221-30-6551 A-307 Seattle 275
23-521 Dave Johnson 005-81-2568 A-318 Los Angeles 550
cust_id acct_id
... ... ... ... ... ...
15-202 A-301
customer account 23-521 A-307
23-652 A-318
• depositor relation references customer and account ... ...
• Represents relationships between customers and their accounts depositor
• Example: Joe Smith’s accounts
• “Joe Smith” has an account at the “Los Angeles” branch, with a balance of 550.
www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia
Foreign Key Constraints
• Tuples in depositor relation specify values for cust_id
• customer relation must contain a tuple corresponding to each cust_id value in
depositor
• Same is true for acct_id values and account relation
• Valid tuples in a relation are also constrained by foreign key
references
• Called a foreign-key constraint
• Consistency between two dependent relations is called referential
integrity
• Every foreign key value must have a corresponding primary key value
www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia
Foreign Key Constraints (2)
• Given a relation r(R)
• A set of attributes K ⊆ R is the primary key for R
• Another relation s(S) references r
• K ⊆ S too
• ⟨ ∀ts ∈ s : ∃tr ∈ r : ts[K] = tr[K] ⟩
• Notes:
• K is not required to be a candidate key for S, only R
• K may also be part of a larger candidate key for S

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Primary Key of depositor Relation?
• Depositor_schema = (cust_id, acct_id) cust_id acct_id
• If {cust_id} is the primary key: 15-202 A-301
• A customer can only have one account 23-521 A-307
• Each customer’s ID can appear only once in depositor
• An account could be owned by multiple customers 23-652 A-318

• If {acct_id} is the primary key: ... ...


• Each account can be owned by only one customer depositor
• Each account ID can appear only once in depositor
• Customers could own multiple accounts
• If {cust_id, acct_id} is the primary key:
• Customers can own multiple accounts
• Accounts can be owned by multiple customers
• Last option is how most banks really work

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Conceptual Data Model (CDM)
• Model the logical structure of the entire data application, independent of
software or data structure model considerations
• A valid CDM can be converted into a PDM
• In its application, CDM can be equated with ERD
• whose function is indeed the same, namely modeling the logical structure of the
database
• CDM is used to describe in detail the logical structure of the database
• CDM consists of objects that are not directly implemented into the actual
database
www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia
Steps to Create a CDM
• Understand the core of the problem given
• Determine which entities are involved
• Determine the data attributes for each entity along with their data types
• Identify the relationships/associations between each entity including their
cardinalities
• Model the Entities and Relationships
• Verify the accuracy of the model
• Correct any errors and warnings

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Physical Data Model (PDM)
• A physical representation of the database to be created, taking into
account the DBMS to be used
• PDM can be generated from a valid CDM
• In its application, PDM can be equated with the Relation Schema
whose function is to model the physical structure of a database
• It is a detailed representation of a database in physical form
• PDM shows the correct data storage structure in the actual database
used

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Schema Diagram for University Database

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Relational Query Languages
• Procedural vs non-procedural, or declarative
• “Pure” languages:
• Relational algebra
• Tuple relational calculus
• Domain relational calculus
• The above 3 pure languages are equivalent in computing power
• We will concentrate in this chapter on relational algebra
• Not turing-machine equivalent
• consists of 6 basic operations

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Select Operation – selection of rows (tuples)
• Relation r

• (A=B) ^ (D > 5) (r)

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Project Operation – Selection of Columns (Attributes)
• Relation r

• A,C (r)

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Union of Two Relations
• Relations r, s

•rs

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Set Difference of Two Relations
• Relations r, s

•r –s

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Joining Two Relations – Cartesian-product
• Relations r, s

•rxs

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Cartesian-product – Naming Issue
• Relations r, s

•rxs

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Renaming a Table
• Allows us to refer to a relation, (say E) by more than one name
 x (E)
returns the expression E under the name X
• Relations r

• r x  s (r) r.A r.B s.A s.B


α 1 α 1
α 1 β 2
β 2 α 1
β 2 β 2
www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia
Composition of Operations
• Can build expressions using multiple operations
Example: A=C (r x s)
rxs

A=C (r x s)

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Joining two relations – Natural Join
• Let r and s be relations on schemas R and S respectively.
Then, the “natural join” of relations R and S is a relation on schema R  S
obtained as follows:
• Consider each pair of tuples tr from r and ts from s.
• If tr and ts have the same value on each of the attributes in R  S, add a tuple t
to the result, where
• t has the same value as tr on r
• t has the same value as ts on s

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Natural Join Example
• Relations r, s

• Natural Join
r s

 A, r.B, C, r.D, E ( r.B = s.B ˄ r.D = s.D (r x s)))


www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia
Notes about Relational Languages
• Each Query input is a table (or set of tables)
• Each query output is a table.
• All data in the output table appears in one of the input tables
• Relational Algebra is not Turning complete
• Can we compute:
• SUM
• AVG
• MAX
• MIN

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Symbol (Name) Example of Use
σ
(Selection) σ salary > = 85000 (instructor)
Return rows of the input relation that satisfy the predicate.
Π
(Projection) Π ID, salary (instructor)
Output specified attributes from all rows of the input relation. Remove

Summary of Relational
duplicate tuples from the output.
x
instructor x department
Algebra Operators
(Cartesian Product)

Output pairs of rows from the two input relations that have the same value on
all attributes that have the same name.

(Union) Π name (instructor) ∪ Π name (student)

Output the union of tuples from the two input relations.


-
(Set Difference) Π name (instructor) -- Π name (student)

Output the set difference of tuples from the two input relations.

(Natural Join) instructor ⋈ department

Output pairs of rows from the two input relations that have the same value on
all attributes that have the same name.

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia


Thank you!

www.its.ac.id INSTITUT TEKNOLOGI SEPULUH NOPEMBER, Surabaya - Indonesia

You might also like