A2 - Database-The Relational Database Model
A2 - Database-The Relational Database Model
Option A- Database
The relational database model
Shruti Dixit
A.2 The relational database model: Objectives
• Define the terms: database management system (DBMS) and relational database management system (RDBMS).
• Outline the functions and tools of a DBMS.
• Describe how a DBMS can be used to promote data security.
• Define the term schema.
• Identify the characteristics of the three levels of the schema: conceptual, logical, physical.
• Outline the nature of the data dictionary.
• Explain the importance of a data definition language in implementing a data model.
• Explain the importance of data modelling in the design of a database.
• Define the following database terms: table, record, field, primary key, secondary key, foreign key, candidate key,
composite primary key, join.
• Identify the different types of relationships within databases: one-to-one, one-to-many, many-to-many.
• Outline the issues caused by redundant data.
• Outline the importance of referential integrity in a normalized database.
• Describe the differences between 1st Normal Form (1NF), 2nd Normal Form (2NF) and 3rd Normal Form (3NF).
• Describe the characteristics of a normalized database.
• Evaluate the appropriateness of the different data types.
• Construct an entity-relationship diagram (ERD) for a given scenario.
• Construct a relational database to 3NF using objects such as tables, queries, forms, reports and macros.
• Explain how a query can provide a view of a database.
• Describe the difference between a simple and complex query.
• Outline the different methods that can be used to construct a query.
SD
DBMS/RDBMS
• A DBMS is system software for creating and managing databases.
The DBMS provides users and programmers with a systematic way
to create, retrieve, update and manage data.
• A RDBMS is a DBMS that is based on the relational model as
introduced by E. F. Codd. The data in an RDBMS is stored in
database objects which are called as tables.
• Every table is broken up into smaller entities called fields.
• The fields in the CUSTOMERS table consist of ID, NAME, AGE,
ADDRESS and SALARY.
• A record is also called as a row of data is each individual entry that
exists in a table.
• A column is a vertical entity in a table that contains all information
associated with a specific field in a table.
SD
SD
SD
Primary vs Secondary Key
SD
SD
SD
SD
Outline the functions and tools of a DBMS
1. It Organizes Data- Data are organized or structured according to the specifications of the data
definition language.
2. It integrates Data-Data are interrelated or linked together at the element (named field of
data) level and can, therefore, be assembled in many combinations during execution of a
particular application program. The data base management system is the vehicle used to
collect, combine, and return a portion of the available data to the user.
3. It Separates Data-A DBMS serves as a filter between application programs and their
associated data. It separates application logic from the input/output logic needed to calculate
addresses, follow chains or links, block/unblock data, locate records, and select data elements.
4. It Controls Data-A DBMS appears to an application programmer to be an extension of the
operating system software. As it receives data storage requests from host programs, it controls
how and where data are physically stored. On data retrievals, it locates and returns requested
elements of data to the programs.
5. It Retrieves Data-A record of data can be obtained via a DBMS- (1) Serially (in its physically
stored sequence), (2) Sequentially, according to the value of a user-specified key, (3) Randomly
by key, (4) Randomly by address, and (5) By structural link. All or any portion of the data
record can be returned to the user.
6. It Protects Data-A DBMS protects and secures both the content of a data base and the
relationships of data elements. Data are protected against access by unauthorized users,
physical damage, operating system failure, simultaneous updating, and certain interruptions
initiated by a host program.
SD
Describe how a DBMS can be used to promote
data security
• ACCESS CONTROL Authorization Authentication
• data locking is locking the variable which is required by you and
using it shared and exclusive lock 2phase locking
• Encryption: A DBMS can support encryption mechanisms to
protect data in transit and at rest. Encryption can be used to
ensure that data is transmitted securely over networks and stored
securely on disk or in memory.
• Audit Trail: A DBMS can maintain an audit trail of all activities
that occur in the system. The audit trail can record all changes to
data, all login attempts, and other security-related events.
• Backup and Recovery: A DBMS can support backup and
recovery mechanisms to protect against data loss or corruption.
SD
Database Schema
• A database schema is the skeleton structure
that represents the logical view of the entire
database.
• It defines how the data is organized and how
the relations among them are associated. It
formulates all the constraints that are to be
applied on the data.
• A database schema defines its entities and the
relationship among them.
SD
Schema
• A database schema can be divided broadly into two
categories –
– Physical Database Schema - This schema pertains to the actual
storage of data and its form of storage like files, indices, etc. It
defines how the data will be stored in a secondary storage.
– Logical(Conceptual) Database Schema - This schema defines all
the logical constraints that need to be applied on the data
stored. It defines tables, views, and integrity constraints.
– Conceptual Database Schema- identifies the highest-level
relationships between the different entities.
• important entities and the relationships among them.
• No attribute is specified.
• No primary key is specified.
SD
Conceptual Schema
SD
Logical Schema
SD
Physical Schema
SD
Conceptual Logical schema Physical schema
A high level model that shows A more detailed model A low level data model
what data is to be stored and shows how system will that describes how data
the relationships among those be implemented will be stored and
data regardless of the data accessed in the physical
platform storage medium
To provide a big picture view To provide a logical To give a detailed view
of what data should be structure of the data of how the data will be
present in the system needed by the physically designed for
organization optimal performance and
cost
Entities, relationships and key Entities, attributes, Tables, columns, data
identifiers primary keys, foreign types, indexes,
keys and relationships constraints, triggers,
procedures, tasks and
streams.
Technology independent Technology Technology dependent
independent
SD
Data dictionary
• Data structure that stores metadata. Gives comprehensive
description of each field in the database like field name, data type,
data format, field size, description, and example.
• Data dictionary: data format
– any special formatting applied to the field; e.g. number of decimal places in
number fields, short, medium and long date and time formats.
• Data dictionary: data type
– Includes text, number, date, currency, yes/no (Boolean).
SD
SD
Explain the importance of a data definition
language(DDL) in implementing a data model.
• Data Modelling:
– Defines and analyses data requirements creates a blueprint for the
construction and maintenance of a database Starts from the conceptual
model to the logical model. Then the logical model is implemented into
a tech dependent physical model.
– Data Definition Language
• CREATE
• DROP
• ALTER
• How to design physical schema
– Convert entities into tables.
– Convert relationships into foreign keys.
– Convert attributes into columns.
– Modify the physical data model based on physical constraints/
requirements
SD
Explain the importance of data modelling in the
design of a database
• Designing a model creates formal communication
links between the analysts and the stakeholders
on the one hand and the analyst and
implementers on the other. The stakeholders can
analyse the conceptual level to ensure the
requirement are well captured.
• Errors can easily be identified. Implementers also
have an easy task of converting the blueprint into
a database Also, maintenance becomes easier.
SD
Why are databases beneficial?
• When designed properly, databases help to:
1. Find information quickly because searching for info or
relationships between info is much faster using a database.
Because of things like search queries, using a database would
be more time saving than using a book
2. More than one user can have access to the data at once. This is
great for organizations with several employees who need to
access the same data at once.
3. With a well designed database, data redundancy is avoided
4. Flexibility, as in using different views, or custom data
representations so that the user can view the data that benefits
them most. It can be referenced by many different applications.
5. Longevity, as in it can be viewed in a DBMS software as it is
released.
SD
Define the following database terms
• Tables - In relational data model, relations are saved in the format of Tables. This format stores the
relation among entities. A table has rows and columns, where rows represents records and columns
represent the attributes.
• Tuple - A single row of a table, which contains a single record for that relation is called a tuple.
• Relation instance - A finite set of tuples in the relational database system represents relation instance.
Relation instances do not have duplicate tuples.
• Relation key - Each row has one or more attributes, known as relation key, which can identify the row
in the relation (table) uniquely.
• Primary Key - A primary key is one of the candidate keys chosen by the database designer to uniquely
identify the entity set.
• Candidate Key - A minimal super key is called a candidate key. An entity set may have more than one
candidate key. - is a field, or set of fields, in a table that can uniquely identify a database record. Each
table may have one or more candidate keys, but the best one is called the primary key.
• Field: It is equivalent to a column in a table..
• A foreign key is a key attribute of a relation that can be referred in other relation.- is a field (or group of
fields) in one table that links to the primary key in another table.
• A composite key is a set of more than one key that, together, uniquely identifies each record.
• Secondary Key. An entity may have one or more choices for the primary key. Collectively these are
known as candidate keys. One is selected as the primary key. Those not selected are known as
secondary keys.
• Join- used to match records from two or more tables. The tables involved must be joined by at least
one common field SD
DATA STRUCTURE
Identify the different types of relationships within
databases: one-to-one, one-to-many, many-to-many
• Relationship - The logical association among entities is called relationship.
Relationships are mapped with entities in various ways. Mapping
cardinalities define the number of association between two entities.
• Relationship Set
– A set of relationships of similar type is called a relationship set. Like entities, a
relationship too can have attributes. These attributes are called descriptive
attributes.
• Degree of Relationship
• The number of participating entities in a relationship defines the degree of
the relationship.
– Binary = degree 2
– Ternary = degree 3
– n-ary = degree
• Cardinality defines the number of entities in one entity set, which can be
associated with the number of entities of other set via relationship set.
SD
• One-to-one - One entity from entity set A
Relationship
can be associated with at most one entity
of entity set B and vice versa.
• One-to-many - One entity from entity set A
can be associated with more than one
entities of entity set B however an entity
from entity set B, can be associated with at
most one entity.
• Many-to-one - More than one entities
from entity set A can be associated with at
most one entity of entity set B, however an
entity from entity set B can be associated
with more than one entity from entity set
A.
• Many-to-many - One entity from A can be
associated with more than one entity from
B and vice versa.
SD
Review of the Basic Design Rules of
Relational Databases
Whenever you find yourself numbering the fields in this way, it’s a sign
you’re on the wrong track
30
SD
Relating Yards and Birdfeeders
Bird Feeders Yards
Bird Feeder Number Yard Number
Material Owner
Location in Yard Address
Yard Number Phone Number
Many to Many
Relationship
Sales Database
CUSTOMERS PRODUCTS
Customer ID Product ID
First Product
Last Supplier
Address Description
City Units
State Cost
Zip Price
Customer1
Customer2
Customer3
One Product can be
......
purchased by an unlimited
number of Customers.
Sales Database
CUSTOMERS PRODUCTS
Customer ID Product ID
First Product
Last Supplier
Address Description
City Units
State Cost
Zip Price
Product1
Product2
Product3
...... One Customer can
purchase an unlimited
number of Products.
Sales Database
Main Table Main Table
CUSTOMERS PRODUCTS
Customer ID Product ID
First Product
Last Supplier
Address Description
City Units
State Cost
Junction Table
Zip Price
SALES
Sales ID
Customer ID
Product ID
Date
Quantity
Sales Database
CUSTOMERS PRODUCTS
1 1
Customer ID Product ID
First Product
Last Supplier
Address Description
City Units
State Cost
Zip Price
SALES
Sales ID
One Customer Customer ID One Product can
can have many
∞ Product ID be sold many
sales
Date ∞ times
Quantity
Sales Database
Products by Customer
First ____________ Last __________________
Address ______________________________________
City_____________ State ____ Zip Code ________
Command Description
Creates a new table, a view of a table, or other object in the
CREATE
database.
SD
DML - Data Manipulation Language
Command Description
SD
DCL - Data Control Language
Command Description
SD
Outline the issues caused by redundant
data
Duplication of data is data redundancy.
• Redundant Storage
– Some information is stored repeatedly.
• Update Anomalies
– If one copy of such repeated data is updated, an inconsistency is
created, unless all copies are similarly updated.
• Insertion anomalies
– It may not be possible to store certain information unless some
other, unrelated, information is stored.
• Deletion Anomalies
– It may not be possible to delete certain information without
losing some other, unrelated, information.
SD
What are the problems in data redundancy?
SD
What does Data Redundancy mean in database
design?
• Data redundancy is capturing the same data in
multiple tables. Ideally, you should only store
a particular data point in one table within a
database.
• It leads to data inconsistency
SD
What kind of problems are associated with
redundancy in databases?
• problems associated with redundancy in data
base, Redundancy occurs in same data
multiple time tends to several problems some
times redundancy controlling is necessary to
improve the performance of the query.
SD
How does DBMS reduce data
redundancy?
• DBMS stands for database management
system. DBMS reduce data redundancy as it
checks if the data is duplicate and if duplicate
then store it as a single record.
SD
Can data redundancy be completely eliminated when
database approach is used?
SD
What is the difference between data integrity
and data redundancy?
• Redundancy A volume that maintains redundant
data. Extended by capability: RAID5 Capability.
Extended by capability: Data Mirroring.
• Integrity The accuracy and consistency of stored
data, indicated by an absence of any alteration in
data between two updates of a data record. Data
integrity is imposed within a database at its design
stage through the use of standard rules and
procedures, and is maintained through the use of
error checking and validation routines
SD
Database Normalization
SD
Normalization in DBMS
• Normalization is a process of organizing the
data in database to avoid data redundancy.
– insertion anomaly,
– update anomaly &
– deletion anomaly.
SD
Anomalies in DBMS
• Example: Suppose a manufacturing company stores the employee details
in a table named employee that has four attributes: emp_id for storing
employee’s id, emp_name for storing employee’s name, emp_address for
storing employee’s address and emp_dept for storing the department
details in which the employee works. At some point of time the table looks
like this:
emp_id emp_name emp_address emp_dept
101 Rick Delhi D001
101 Rick Delhi D002
123 Maggie Agra D890
166 Glenn Chennai D900
166 Glenn Chennai D004
SD
Cont..
• Update anomaly: In the table we have two rows for employee Rick as he
belongs to two departments of the company. If we want to update the
address of Rick then we have to update the same in two rows or the data
will become inconsistent. If somehow, the correct address gets updated in
one department but not in other then as per the database, Rick would be
having two different addresses, which is not correct and would lead to
inconsistent data.
• Insert anomaly: Suppose a new employee joins the company, who is
under training and currently not assigned to any department then we
would not be able to insert the data into the table if emp_dept field
doesn’t allow nulls.
• Delete anomaly: Suppose, if at a point of time the company closes the
department D890 then deleting the rows that are having emp_dept as
D890 would also delete the information of employee Maggie since she is
assigned only to this department.
SD
To overcome these anomalies we need
to normalize the data.
• Here are the most commonly used normal
forms:
– First normal form(1NF)
– Second normal form(2NF)
– Third normal form(3NF)
SD
Purpose of Normalisation
The main purpose of database normalization is to avoid complexities, eliminate duplicates, and organize data in
a consistent way. In normalization, the data is divided into several tables linked together with relationships.
Database administrators are able to achieve these relationships by using primary keys, foreign keys, and
composite keys.
a primary key in one table, is related to the value from another table.
A primary key is a column that uniquely identifies the rows of data in that table. It’s a unique identifier such as
an employee ID, student ID, voter’s identification number (VIN), and so on.
A foreign key is a field that relates to the primary key in another table.
A composite key is just like a primary key, but instead of having a column, it has multiple columns.
SD
In short
SD
SD
SD
First Normal Form (1NF)
• Define the data items required, because they
become the columns in a table.
• Place the related data items in a table.
• Ensure that there are no repeating groups of
data.(no duplicated rows or columns)
• Ensure that there is a primary key.-there must be a
primary key for identification
• a single cell must not hold more than one value
(atomicity)
• each column must have only one value for each row
in the table
SD
First normal form (1NF)
• As per the rule of first normal form, an attribute (column) of a
table cannot hold multiple values. It should hold only atomic
values.
• Example: Suppose a company wants to store the names and
contact details of its employees.
emp_id emp_name emp_address emp_mobile
101 Herschel New Delhi 8912312390
8812121212
102 Jon Kanpur
9900012222
103 Ron Chennai 7778881212
9990000123
104 Lester Bangalore
8123450987
SD
Cont..
• This table is not in 1NF as the rule says “each attribute of a
table must have atomic (single) values”, the emp_mobile
values for employees Jon & Lester violates that rule
• To make the table complies with 1NF.
SD
Second Normal Form (2NF)
• The 1NF only eliminates repeating groups,
not redundancy. That’s why there is 2NF.
• The 2NF states that it should meet all the rules
for 1NF- it’s already in 1NF
• has no partial dependency. That is, all non-key
attributes are fully dependent on a primary key.
• No non-prime attribute is dependent on the
proper subset of any candidate key of table.
• An attribute that is not part of any candidate key
is known as non-prime attribute.
SD
Cont..
• Example: Suppose a school wants to store the data
of teachers and the subjects they teach. They create
a table that looks like this: Since a teacher can teach
more than one subjects, the table can have multiple
rows for a same teacher.
teacher_id subject teacher_age
111 Maths 38
111 Physics 38
222 Biology 38
333 Physics 40
333 Chemistry 40
Candidate Keys: {teacher_id, subject}
Non prime attribute: teacher_age SD
2NF
• The table is in 1 NF because each attribute has atomic values. However, it
is not in 2NF because non prime attribute teacher_age is dependent on
teacher_id alone which is a proper subset of candidate key. This violates
the rule for 2NF as the rule says “no non-prime attribute is dependent on
the proper subset of any candidate key of the table”.
• To make the table complies with 2NF we can break it in two tables like this:
teacher_details table: teacher_subject table:
SD
Third Normal form (3NF)
• A table design is said to be in 3NF if both the following conditions
hold:
– Table must be in 2NF.
– Transitive functional dependency of non-prime attribute on any super
key should be removed.
• An attribute that is not part of any candidate key is known as
non-prime attribute.
• In other words 3NF can be explained like this: A table is in 3NF if it
is in 2NF and for each functional dependency X-> Y at least one of
the following conditions hold:
– X is a super key of table.
– Y is a prime attribute of table.
• An attribute that is a part of one of the candidate keys is known as
prime attribute.
SD
Cont..
• Example: Suppose a company wants to store
the complete address of each employee, they
create a table named employee_details
emp_id emp_name emp_zip emp_state emp_city emp_district
1001 John 282005 UP Agra Dayal Bagh
1002 Ajeet 222008 TN Chennai M-City
1006 Lora 282007 TN Chennai Urrapakkam
1101 Lilly 292008 UK Pauri Bhagwan
1201 Steve 222999 MP Gwalior Ratan
SD
Dependency Diagram
• Dependency Diagram
– The primary key components are bold, underlined, and shaded in a
different color.
– The arrows above entities indicate all desirable dependencies, i.e.,
dependencies that are based on PK.
– The arrows below the dependency diagram indicate less desirable
dependencies -- partial dependencies and transitive dependencies.
SD
Second Normal Form (2 NF)
• Conversion to Second Normal Form
– Starting with the 1NF format, the database can be converted
into the 2NF format by
• Writing each key component on a separate line, and then
writing the original key on the last line and
• Writing the dependent attributes after each new key.
SD
Dependency Diagram
Second Normal Form (2 NF)
A table is in 2NF if:
• It is in 1NF and
• It includes no partial dependencies; that is,
no attribute is dependent on only a portion
of the primary key.
(It is still possible for a table in 2NF to
exhibit transitive dependency; that is, one or
more attributes may be functionally
dependent on nonkey attributes.)
SD
Third Normal Form (3 NF)
SD
Third Normal Form (3 NF)
• 3NF Definition
– A table is in 3NF if:
• It is in 2NF and
• It contains no transitive dependencies.
SD
Syntax in SQL
• SELECT Statement
• SELECT column1, column2....column FROM table_name;
• WHERE Clause
• SELECT column1, column2....column FROM table_name WHERE
CONDITION;
• AND/OR Clause
• SELECT column1, column2....column FROM table_name
WHERE CONDITION-1 {AND|OR} CONDITION-2;
• BETWEEN Clause
• SELECT column1, column2....column FROM table_name WHERE
column_name BETWEEN val-1 AND val-2;
• COUNT Clause
SELECT COUNT(column_name) FROM table_name WHERE
CONDITION;
SD
SQL Command
• CREATE TABLE Statement
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
.....
columnN datatype,
PRIMARY KEY( one or more columns )
);
• DROP TABLE Statement
DROP TABLE table_name;
SD
SQL Command
• ALTER TABLE Statement
ALTER TABLE table_name {ADD|DROP|MODIFY} column_name
{data_ype};
• ALTER TABLE Statement (Rename)
ALTER TABLE table_name RENAME TO new_table_name;
• INSERT INTO Statement
INSERT INTO table_name( column1, column2....columnN) VALUES (
value1, value2....valueN);
• UPDATE Statement
UPDATE table_name SET column1 = value1, column2 =
value2....columnN=valueN [ WHERE CONDITION ];
• DELETE Statement
DELETE FROM table_name WHERE {CONDITION};
SD
Data Types-Exact Numeric
DATA TYPE FROM TO
SD
Date and Time Data Types
SD
Character Strings Data Types
DATA TYPE Description
Maximum length of 8,000 characters.( Fixed length
char
non Unicode characters)
SD
SD
Convert into 1NF
SD
SD
Total(Involved Compulsory)and - Single line
partial(Not involved Compulsory) participation
SD
Student(Student_ID,Name,Age,email)
Course(Course_ID,title,credit)
StuCourse(Student_ID,Course_ID,grade)
SD
Customer(customerID,name,age)
Order(orderID,date,total,customerID)
SD