DBMS Relational Model Concepts
Relational Model Concepts
The relational model represents the database as a collection of relations.
Each relation resembles a table of values or a flat file of records
A relation may be a table of values. Then each row in the table represents a
collection of related data values. A row represents a fact that typically corresponds
to a real-world entity or relationship. The table name and column names are used
to interpret the meaning of the values in each row.
Eg:- The table called STUDENT represents facts about a particular student entity.
The column names—Name, Student_number, Class, and Major—specify how to
interpret the data values in each row, based on the column each value is in. All
values in a column are of the same data type.
A row is called a tuple, a column header is called an attribute, and the table is
called a relation. The datatype describing the types of values that can appear in
each column is represented by a domain of possible values.
Domains, Attributes, Tuples, and Relations
A domain D is a set of atomic values. Atomic means that each value in the
domain is indivisible as far as the formal relational model is concerned. A common
method of specifying a domain is to specify a data type from which the data values
forming the domain are drawn. It is also useful to specify a name for the domain,
to help in interpreting its values. Some examples of domains follow:
■ Names: The set of character strings that represent names of persons.
■ Grade_point_averages. Possible values of computed grade point averages; each
must be a real (floating-point) number between 0 and 4.
■ Employee_ages. Possible ages of employees in a company; each must be an
integer value between 15 and 80.
A data type or format is also specified for each domain. For example, the data
type for Employee_ages is an integer number between 15 and 80. A domain is thus
given a name, data type, and format.Additional information for interpreting the
values of a domain can also be given; for example, a numeric domain such as
Person_weights should have the units of measurement, such as pounds or
kilograms.
A relation schema R, denoted by R(A1, A2, ...,An), is made up of a relation name
R and a list of attributes, A1, A2, ..., An. Each attribute Ai is the name of a role
played by some domain D in the relation schema R. D is called the domain of Ai
and is denoted by dom(Ai). A relation schema is used to describe a relation; R is
called the name of this relation. The degree (or arity) of a relation is the number of
attributes n of its relation schema.
A relation of degree seven, which stores information about university students,
would contain seven attributes describing each student. as follows:
CASP Page 1
DBMS Relational Model Concepts
STUDENT(Name, Ssn, Home_phone, Address, Office_phone, Age, Gpa) .
Using the data type of each attribute, the definition is sometimes written as:
STUDENT(Name: string, Ssn: string, Home_phone: string, Address: string,
Office_phone: string, Age: integer, Gpa: real)
For this relation schema, STUDENT is the name of the relation, which has seven
attributes.
A relation (or relation state) r of the relation schema R(A1, A2, ..., An), also
denoted by r(R), is a set of n-tuples r = {t1, t2, ..., tm}. Each n-tuple t is an ordered
list of n values t =<v1, v2, ..., vn>, where each value vi, 1 ≤ i ≤ n, is an element of
dom (Ai) or is a special NULL value.The ith value in tuple t, which corresponds to
the attribute Ai, is referred to as t[Ai] or t.Ai (or t[i] if we use the positional
notation). The terms relation intension for the schema R and relation extension for
a relation state r(R) are also commonly used.
We display the relation as a table, where each tuple is shown as a row and each
attribute corresponds to a column header indicating a role or interpretation of the
values in that column. NULL values represent attributes whose values are
unknown or do not exist for some individual tuple.
A relation (or relation state) r(R) is a mathematical relation of degree n on the
domains dom(A1), dom(A2), ..., dom(An), which is a subset of the Cartesian
product (denoted by ×) of the domains that define R:
r(R) ⊆ (dom(A1) × dom(A2) × ... × dom(An))
We denote the total number of values, or cardinality, in a domain D by |D| .The
total number of tuples in the Cartesian product is
|dom(A1)| × |dom(A2)| × ... × |dom(An)|
This product of cardinalities of all domains represents the total number of possible
instances or tuples that can ever exist in any relation state r(R). Of all these
possible combinations, a relation state at a given time—the current relation state—
reflects only the valid tuples that represent a particular state of the real world.
It is possible for several attributes to have the same domain. The attribute names
indicate different roles, or interpretations, for the domain. For example, in the
STUDENT relation, the same domain phone_numbers plays the role of
Home_phone, referring to the home phone of a student, and the role of
Office_phone, referring to the office phone of the student.
Characteristics of Relations
CASP Page 2
DBMS Relational Model Concepts
1. Ordering of Tuples in a Relation. A relation is defined as a set of tuples.
Mathematically, elements of a set have no order among them; hence, tuples
in a relation do not have any particular order.However, in a file, records are
physically stored on disk (or in memory), so there always is an order among
the records. This ordering indicates first, second, ith, and last records in the
file. Similarly, when we display a relation as a table, the rows are displayed
in a certain order. Tuple ordering is not part of a relation definition . Many
tuple orders can be specified on the same relation. For example, tuples in the
STUDENT relation could be ordered by values of Name, Ssn, Age, or some
other attribute. The definition of a relation does not specify any order: There
is no preference for one ordering over another.
2. Ordering of Values within a Tuple and an Alternative Definition of a
Relation.
An n-tuple is an ordered list of n values, so the ordering of values in a
tuple—and hence of attributes in a relation schema—is important. But at a
more abstract level, the order of attributes and their values is not that
important if the correspondence between attributes and values is
maintained.In a relation schema R = {A1, A2, ..., An} is a set of attributes
(instead of a list), and a relation state r(R) is a finite set of mappings r = {t1,
t2, ..., tm}, where each tuple ti is a mapping from R to D, and D is the union
of the attribute domains.A tuple can be considered as a set of (<attribute>,
<value>) pairs, where each pair gives the value of the mapping from an
attribute Ai to a value vi from dom(Ai). The ordering of attributes is not
important, because the attribute name appears with its value.
3. Values and NULLs in the Tuples. Each value in a tuple is an atomic value;
Composite and multivalued attributes are not allowed. This model is
sometimes called the flat relational model.An important concept is that of
NULL values, which are used to represent the values of attributes that may
be unknown or may not apply to a tuple. A special value, called NULL, is
used in these cases. We can have several meanings for NULL values, such
as value unknown, value exists but is not available, or attribute does not
apply to this tuple also known as value undefined
4. Interpretation (Meaning) of a Relation. The relation schema can be
interpreted as a declaration or a type of assertion. For example, the schema
of the STUDENT relation asserts that, in general, a student entity has a
Name, Ssn, Home_phone, Address, Office_phone, Age, and Gpa. Each tuple
in the relation can then be interpreted as a fact or a particular instance of the
assertion.
Notice that some relations may represent facts about entities, whereas other
relations may represent facts about relationships.
CASP Page 3
DBMS Relational Model Concepts
Relational Model Notation
We will use the following notation in our presentation:
■ A relation schema R of degree n is denoted by R(A1, A2, ..., An).
■ The uppercase letters Q, R, S denote relation names.
■ The lowercase letters q, r, s denote relation states.
■ The letters t, u, v denote tuples.
■ In general, the name of a relation schema such as STUDENT also indicates the
current set of tuples in that relation—the current relation state—whereas
STUDENT(Name, Ssn, ...) refers only to the relation schema.
■ An attribute A can be qualified with the relation name R to which it belongs by
using the dot notation R.A—for example, STUDENT.Name or STUDENT.Age.
This is because the same name may be used for two attributes in different relations.
■ An n-tuple t in a relation r(R) is denoted by t = <v1, v2, ..., vn>, where vi is the
value corresponding to attribute Ai.
Relational Model Constraints and Relational Database Schemas
There are generally many restrictions or constraints on the actual values in a
database state. These constraints are derived from the rules in the mini-world that
the database represents.
Constraints on databases can generally be divided into three main categories:
1. Constraints that are present in the data model. They are called inherent model-
based constraints or implicit constraints.
2. Constraints that can be directly expressed in schemas of the data model,
typically by specifying them in the DDL called schema-based constraints or
explicit constraints.
3. Constraints that cannot be directly expressed in the schemas of the data model,
and hence must be expressed and enforced by the application programs called
application-based or semantic constraints or business rules.
For example, the constraint that a relation cannot have duplicate tuples is an
inherent constraint. The second category of constraints that can be expressed in
the schema of the relational model via the DDL. Constraints in the third category
are more general, relate to the meaning as well as behavior of attributes, and are
difficult to express and enforce within the data model, so they are usually checked
within the application programs that perform database updates.
Data dependencies, which include functional dependencies and multivalued
dependencies are also important constraints.
CASP Page 4
DBMS Relational Model Concepts
The schema-based constraints include domain constraints, key constraints,
constraints on NULLs, entity integrity constraints, and referential integrity
constraints.
Domain Constraints
Domain constraints specify that within each tuple, the value of each attribute A
must be an atomic value from the domain dom(A). The data types associated with
domains typically include standard numeric data types for integers (such as short
integer, integer, and long integer) and real numbers (float and double precision
float). Characters, Booleans, fixed-length strings, and variable-length strings are
also available, as are date, time, timestamp, and money, or other special data types.
Key Constraints and Constraints on NULL Values
A relation is defined as a set of tuples. By definition, all elements of a set are
distinct; hence, all tuples in a relation must also be distinct.This means that no two
tuples can have the same combination of values for all their attributes. Usually,
there are other subsets of attributes of a relation schema R with the property that no
two tuples in any relation state r of R should have the same combination of values
for these attributes. Suppose that we denote one such subset of attributes by SK;
then for any two distinct tuples t1 and t2 in a relation state r of R, we have the
constraint that:
t1[SK]≠ t2[SK]
Any such set of attributes SK is called a superkey of the relation schema R.
A superkey SK specifies a uniqueness constraint that no two distinct tuples in any
state r of R can have the same value for SK. Every relation has at least one default
superkey—the set of all its attributes. A superkey can have redundant attributes,
however, so a more useful concept is that of a key, which has no redundancy. A
key K of a relation schema R is a superkey of R with the additional property that
removing any attribute A from K leaves a set of attributes K_ that is not a superkey
of R any more. Hence, a key satisfies two properties:
1. Two distinct tuples in any state of the relation cannot have identical values for
(all) the attributes in the key. This first property also applies to a superkey.
2. It is a minimal superkey— that is, a superkey from which we cannot remove any
attributes and still have the uniqueness constraint in condition 1 hold. This property
is not required by a superkey.
The first property applies to both keys and superkeys, the second property is
required only for keys.Hence, a key is also a superkey but not vice versa. Consider
the STUDENT relation . The attribute set {Ssn} is a key of STUDENT because no
two student tuples can have the same value for Ssn. Any set of attributes that
includes Ssn—for example, {Ssn, Name, Age}—is a superkey. However, the
superkey {Ssn, Name, Age} is not a key of STUDENT because removing Name or
CASP Page 5
DBMS Relational Model Concepts
Age or both from the set still leaves us with a superkey. In general, any superkey
formed from a single attribute is also a key. A key with multiple attributes must
require all its attributes together to have the uniqueness property.
The value of a key attribute can be used to identify uniquely each tuple in the
relation.
A relation schema may have more than one key. In this case, each of the keys is
called a candidate key. For example, the CAR relation has two candidate keys:
License_number and Engine_serial_number.
It is common to designate one of the candidate keys as the primary key of the
relation. This is the candidate key whose values are used to identify tuples in the
relation.When a relation schema has several candidate keys, the choice of one to
become the primary key is somewhat arbitrary; however, it is usually better to
choose a primary key with a single attribute or a small number of attributes. The
other candidate keys are designated as unique keys, and are not underlined.
Another constraint on attributes specifies whether NULL values are or are not
permitted. For example, if every STUDENT tuple must have a valid, non-NULL
value for the Name attribute, then Name of STUDENT is constrained to be NOT
NULL.
Relational Databases and Relational Database Schemas
A relational database usually contains many relations, with tuples in relations that
are related in various ways.
A relational database schema S is a set of relation schemas S = {R1, R2, ..., Rm}
and a set of integrity constraints IC.
A relational database state DB of S is a set of relation states DB = {r1, r2, ..., rm}
such that each ri is a state of Ri and such that the ri relation states satisfy the
integrity constraints specified in IC. Figure 3.5 shows a relational database schema
that we call COMPANY = {EMPLOYEE, DEPARTMENT, DEPT_LOCATIONS,
PROJECT, WORKS_ON, DEPENDENT}. The underlined attributes represent
primary keys
When we refer to a relational database, we include both its schema and its current
state. A database state that does not obey all the integrity constraints is called an
invalid state, and a state that satisfies all the constraints in the defined set of
integrity constraints IC is called a valid state.
Integrity, Referential Integrity, and Foreign Keys
The entity integrity constraint states that no primary key value can be NULL. This
is because the primary key value is used to identify individual tuples in a relation.
Having NULL values for the primary key implies that we cannot identify some
tuples. For example, if two or more tuples had NULL for their primary keys, we
CASP Page 6
DBMS Relational Model Concepts
may not be able to distinguish them if we try to reference them from other relations.
Key constraints and entity integrity constraints are specified on individual relations.
The referential integrity constraint is specified between two relations and is used
to maintain the consistency among tuples in the two relations. The referential
integrity constraint states that a tuple in one relation that refers to another relation
must refer to an existing tuple in that relation.
For example, the attribute Dno of EMPLOYEE gives the department number for
which each employee works; hence, its value in every EMPLOYEE tuple must
match the Dnumber value of some tuple in the DEPARTMENT relation.
To define referential integrity we define the concept of a foreign key. The
conditions for a foreign key, given below, specify a referential integrity constraint
between the two relation schemas R1 and R2. A set of attributes FK in relation
schema R1 is a foreign key of R1 that references relation R2 if it satisfies the
following rules:
1. The attributes in FK have the same domain(s) as the primary key attributes PK
of R2; the attributes FK are said to reference or refer to the relation R2.
2. A value of FK in a tuple t1 of the current state r1(R1) either occurs as a value of
PK for some tuple t2 in the current state r2(R2) or is NULL.
In the former case, we have t1[FK] = t2[PK], and we say that the tuple t1
references or refers to the tuple t2.
In this definition, R1 is called the referencing relation and R2 is the referenced
relation. If these two conditions hold, a referential integrity constraint from R1 to
R2 is said to hold. Referential integrity constraints typically arise from the
relationships among the entities represented by the relation schemas
CASP Page 7