0% found this document useful (0 votes)
21 views276 pages

Database Management System.

The document outlines the curriculum for BCA Semester 2, focusing on Database Management Systems (DBMS). It introduces key concepts, characteristics, and the importance of DBMS, highlighting its role in managing data effectively, ensuring data integrity, and supporting multiple users. Additionally, it details the components of DBMS and their functionalities, emphasizing the advantages of using a database approach over traditional file-based systems.

Uploaded by

sethinupoor931
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
21 views276 pages

Database Management System.

The document outlines the curriculum for BCA Semester 2, focusing on Database Management Systems (DBMS). It introduces key concepts, characteristics, and the importance of DBMS, highlighting its role in managing data effectively, ensuring data integrity, and supporting multiple users. Additionally, it details the components of DBMS and their functionalities, emphasizing the advantages of using a database approach over traditional file-based systems.

Uploaded by

sethinupoor931
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 276

BCA SEMESTER 2

❇ Click the link below to join the BCA


Batch-8 Main Community:
(https://chat.whatsapp.com/
DgVLGa24pLNJpc2Awnmzul)

DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 1 - Introduction to DBMS 1
DCA1208: Database Management System

❇ Click the link below to join the BCA


Batch-8 Main Community:
(https://chat.whatsapp.com/
DgVLGa24pLNJpc2Awnmzul)

Unit - 1
Introduction to DBMS

DCA324
KNOWLEDGE MANAGEMENT
Unit: 1 - Introduction to DBMS 2
DCA1208: Database Management System
TABLE OF CONTENTS

SL Fig No / Table SAQ /


Topic Page No
No / Graph Activity

1 Introduction - -
4
1.1 Objectives - -

2 Concept of Database 1 - 5-6

3 Characteristics of Database Approach - - 7-8

4 What is DBMS - -
9-11
4.1 Importance of DBMS - -

5 Overview of DBMS - -
5.1 Components of DBMS 2 -
12-16
5.2 Key Concepts in DBMS - -
5.3 Database System Applications - -

6 Summary - - 17-18

7 Glossary - - 19-20

8 Self-Assessment Questions - 1 21-24

9 Terminal Questions - - 25

10 Answers - -
10.1 Self-Assessment Questions - - 26-27
10.2 Terminal Questions - -

11 References - - 28

Unit: 1 - Introduction to DBMS 3


DCA1208: Database Management System

1. INTRODUCTION
In this unit, we will introduce the basic concepts of DBMS. Database technology development has
evolved rapidly over the past three decades since relational database systems' rise and eventual
dominance. While many specialised database systems (spatial, object-oriented, multimedia, etc.) have
found substantial user communities in the science and engineering sections, relational systems
remain the preferred database technology for business enterprises.

A database is a collection of related information stored to be available to several users for various
purposes. The content of a database is obtained by combining data from all the different sources in
an organisation so that data are available to all users and replicated data can be minimised or
eliminated. A computer database gives us an electronic filing system, which has many ways of cross-
referencing and allows the user several ways to retrieve and reorganise data.

1.1. Objectives
By the end of Unit 1, the learners should be able to:
• Define Database management system
• List the characteristics of the Database Approach
• Discuss the importance of Database Management
Systems
• Classify the components of DBMS
• Identify the Applications of Database Systems

Unit: 1 - Introduction to DBMS 4


DCA1208: Database Management System

2. CONCEPT OF DATABASE
The database is a collection of related data. A data item is the smallest identified unit of data that has
value in the real world – for example, last name, first name, street address, ID number, or political
party – and is the fundamental component of a file in a file system. A group of related data items
considered a single unit by an application is called a record. Examples of types of records are
salesperson, customer, order, product,and department. A file is a collection of several records of a
single type.

Fig.1 Database

A database has the following properties:


• A database represents an aspect of the real world, sometimes called the universe of discourse
(UoD) or mini world. The database keeps track of changes to the mini world.
• A database is a logically organised collection of data with some meaning.
• A database is designed, built, and populated with relevant data for a specific purpose.

A database is a more complex object; it is a collection of interrelated stored data that facilitates the
requirements of several users within one or more organisations, that is, interrelated collections of
many different types oftables. The encouragement for using databases rather than files includesless
redundancy of data, greater availability to a diverse set of users, and integration of data for easier
access to and updating complextransactions.

The basic definitions of some database concepts are:


Data: Data is a collection of known facts that may be recorded and have an underlying significance.

Database: It is a collection of related data.

Unit: 1 - Introduction to DBMS 5


DCA1208: Database Management System

Database System: It consists of the DBMS software and the data itself.Sometimes, the applications
are also included.

Database Management System (DBMS): A software package/system facilitates creating and


maintaining a computerised database.

Database management systems software examples include Oracle, SQL Server, MS Access, DB2,
SYBASE, and INFORMIX.

A database can handle accounting, filing, and business inventory and use the information in its files
to prepare summaries, estimates, and other reports. There can be a database that stores books,
newspaper articles, magazines, and comics. There is already a well-defined market for specialised
knowledge for a small number of customers on practically every issue.

The management of data in a database system is handled by a database management system, which
is a general-purpose software programme. The database management system is the major software
component of a database system. Therefore, a database management system is a combination of
hardware and software that can be used to set up and monitor a database and can manage the
retrieval and updating of the database that has been stored in it. Most database management systems
have the following capabilities:
• Creating a file, addition to data, modification of data, deletion of data,creation, addition, and
deletion of files.
• Retrieving data selectively or collectively.
• The data can be sorted or indexed at the user's discretion and direction.
• The system can generate several reports. These may be standardised reports or generated
explicitly according to user requirements.
• Mathematical functions can be performed on data stored in the database and manipulated
using these functions to perform the desired calculations.

Unit: 1 - Introduction to DBMS 6


DCA1208: Database Management System

3. CHARACTERISTICS OF DATABASE APPROACH


Here are several characteristics that distinguish the database approach from the file-based system.
This discussion outlines the benefits and features of database systems:
• Self-Describing Nature of a Database System: A database system is self-describing because it
contains not only the data itself but also metadata that defines and describes the data and
relationships between tables in the database. This metadata is used by the DBMS software or
database users when needed. This separation of data and metadata differentiates a database
system from the traditional file-based system, where data definitions are embedded in
application programs.
• Insulation Between Program and Data: In file-based systems, data file structures are defined
within application programs. If a file's structure changes, all programs accessing that file may
need to be modified. In contrast, the database approach stores data structures in the system
catalogue, separate from the programs. Thus, changing the data structure requires only a single
update, known as program-data independence.
• Support for Multiple Views of Data: Databases support multiple views and subsets of the
database dedicated to specific users or groups. Different users can have customised views of the
system, each containing only the data relevant to them.
• Sharing of Data and Multiuser System: Modern database systems are designed for multiple
users, allowing concurrent access to the same database. Concurrency control strategies ensure
data integrity and correctness even when accessed simultaneously by multiple users. This is a
significant improvement over older systems that restricted usage to one user at a time.
• Control of Data Redundancy: In the database approach, each data item is ideally stored in only
one place. While some redundancy might be introduced to enhance performance, it is controlled
and minimised during database design.
• Data Sharing: Integrating all organisational data within a database system facilitates data
sharing among authorised users and enables users to generate more information from the data
than is possible without such integration.
• Enforcement of Integrity Constraints: Database management systems allow users to define
and enforce constraints to ensure the validity and integrity of data. Constraints can range from
data types and uniqueness to complex rules ensuring that only valid data is entered.

Unit: 1 - Introduction to DBMS 7


DCA1208: Database Management System

• Restriction of Unauthorised Access: Not all users have the same access privileges in a
database system. Some users have read-only access, while others have read and write privileges.
DBMSs provide a security subsystem to create and control different user accounts, restricting
unauthorised access.
• Data Independence: A significant advantage of DBMSs is data independence. System data
descriptions (metadata) are separated from application programs, and changes to data
structures are managed by the DBMS rather than embedded in the programs.
• Transaction Processing: DBMSs include concurrency control subsystems to ensure data
consistency and validity during transaction processing, even when multiple users update the
same information simultaneously.
• Provision for Multiple Views of Data: DBMSs allow multiple users to access the database
individually or simultaneously without knowing how and where the data is stored.
• Backup and Recovery Facilities: Backup and recovery methods protect data from loss. DBMSs
provide processes for backing up and recovering data independently of network backups.
Recovery subsystems ensure the database can be restored to its original state if a system fails
during a complex update process.

These characteristics and features highlight the advantages of the database approach over traditional
file-based systems

Unit: 1 - Introduction to DBMS 8


DCA1208: Database Management System

4. WHAT IS DBMS AND ITS IMPORTANCE


DBMS is a general-purpose software system that facilitates the processes of defining, constructing,
manipulating, and sharing databases among various users and applications.

Defining a database involves specifying the data types, structures, and constraints of the data to be
stored in the database. The DBMS also stores the database definition or descriptive information as a
database catalogue or dictionary called meta-data.

Constructing the database is the process of storing the data on a storage medium controlled by the
DBMS.

Manipulating a database includes functions such as querying it to retrieve specific data, updating it
to reflect changes in the mini world, and generating reports from the data.

Sharing a database allows multiple users and programs to access the database simultaneously.

Database management systems software examples include Oracle, SQL Server, MS Access, DB2,
SYBASE, and INFORMIX.

A database can handle accounting, filing, and business inventory and use the information in its files
to prepare summaries, estimates, and other reports. There can be a database that stores books,
newspaper articles, magazines, and comics. There is already a well-defined market for specialised
knowledge for a small number of customers on practically every issue.

The management of data in a database system is handled by a database management system, which
is a general-purpose software programme. The database management system is the major software
component of a database system. Therefore, a database management system is a combination of
hardware and software that can be used to set up and monitor a database and can manage the
retrieval and updating of the database that has been stored in it. Most database management systems
have the following capabilities:
• Creating a file, addition to data, modification of data, deletion of data,creation, addition, and
deletion of files.
• Retrieving data selectively or collectively.
• The data can be sorted or indexed at the user's discretion and direction.

Unit: 1 - Introduction to DBMS 9


DCA1208: Database Management System

• The system can generate several reports. These may be standardised or specifically generated
according to specific user requirements.
• Mathematical functions can be performed on data stored in the database and manipulated
using these functions to perform the desired calculations.

4.1. Importance of DBMS (Database Management Systems)


A Database Management System is crucial for effectively managing and utilising data in various
applications. The importance of DBMS can be understood through its numerous benefits and
functionalities:

1. Data Integrity and Security: DBMS ensures data accuracy and consistency through constraints
and rules that prevent invalid data entry. It also provides robust security mechanisms, including
authentication and encryption, to protect data from unauthorised access.
2. Efficient Data Management: DBMS organises data in a structured manner, making it easy to
retrieve, update, and manage. Efficient query processing and indexing techniques in DBMS
enable fast and accurate data retrieval.
3. Data Abstraction and Independence: DBMS abstracts the complexities of data storage from
the end users, allowing them to interact with data at a higher level. DBMS supports logical and
physical data independence, allowing changes in the schema at one level without affecting other
levels.
4. Concurrency Control: DBMS allows multiple users to access and manipulate data
simultaneously without interference, ensuring data consistency and isolation. DBMS manages
transactions to ensure that all operations within a transaction are completed successfully,
maintaining database integrity.
5. Data Redundancy Control: DBMS reduces data redundancy by integrating data into a single
database system, eliminating duplicate data entries. The normalisation process in DBMS helps
minimise redundancy and improve data integrity.
6. Backup and Recovery: DBMS provides automated backup mechanisms to safeguard data
against accidental loss or corruption. In case of a system failure, DBMS has recovery procedures
to restore the database to its last consistent state.
7. Data Sharing: DBMS supports data sharing among multiple users and applications, promoting
collaborative data usage. It integrates data from various sources, providing a unified view of the
data for decision-making and analysis.

Unit: 1 - Introduction to DBMS 10


DCA1208: Database Management System

8. Improved Data Access: DBMS uses query languages like SQL to enable users to interact with
the database easily, making data access more efficient. It supports various reporting and
analysis tools that help generate meaningful insights from the data.
9. Support for Data Integrity Constraints: DBMS enforces integrity constraints (e.g., primary
keys, foreign keys) to maintain the accuracy and reliability of data. It provides mechanisms to
define and enforce data validation rules, ensuring data quality.

Unit: 1 - Introduction to DBMS 11


DCA1208: Database Management System

5. OVERVIEW OF DBMS
DBMS is a general-purpose software system that facilitates the processes of defining, constructing,
manipulating, and sharing databases among various users and applications.

5.1. Components of DBMS


A database management system (DBMS) consists of several components.

Each component plays a very important role in the database management system environment. The
significant components of a database management system are:

Fig.2 Components of DBMS

HARDWARE: The hardware is the computer system used to keep and access the database. It consists
of physical and electronic devices such as computers (together with associated I/O devices like disk
drives), storage devices, I/O channels, electromechanical devices that interface between computers
and real-world systems, and so on. It is impossible to implement the DBMS without the hardware
devices; in a network, a powerful computer with high data processing speed and a storage device with
large storage capacity is required as the database server.

SOFTWARE: The main component of a DBMS is the software. It is the set of programs used to handle
the database and to control and manage the overall computerised database. DBMS software itself is
the most important software component in the overall system.

DATA: Data is the most important component of the DBMS. The main purpose of DBMS is to process
data. In DBMS, databases are defined and constructed, and then data is stored, updated, and retrieved
to and from the databases. The database contains both the actual data and the metadata.

Unit: 1 - Introduction to DBMS 12


DCA1208: Database Management System

PROCEDURES: Procedures refer to the instructions and rules that help design the database and use
the DBMS. The users who operate and manage the DBMS require documented procedures when they
use it hot or run the database management system.

USERS: Database users are the people who manage the databases and perform various operations on
them in the database system. Three kinds of people play different roles in database systems.
• Application Programmers: These users implement specific application programs to access the
stored data. They must be familiar with the DBMSs to accomplish their task.
• Database Administrators: These may be one person or a group of people in an organisation
responsible for authorising access to the database, monitoring its use, and managing all the
resources necessary to support the use of the entire database system.
• End-Users: End users are the people whose jobs require access to a database for querying,
updating and generating reports.

5.2. Key Concepts in DBMS


To efficiently manage data using a DBMS, let us understand certain key terms:
Database Schema: Database Schema is the design of a database. The skeleton of the database
represents the structure (table names and their fields/columns), the type of data each column can
hold, constraints on the data to be stored (if any), and the relationships among the tables.

The database schema is also called visual or logical architecture, as it tells us how the data are
organised in a database.

Data Constraint Sometimes, we restrict or limit the type of data that can be inserted in one or more
columns of a table by specifying one or more constraints on that column(s) while creating the table.
Constraints are used to ensure the accuracy and reliability of data in the database.

Meta-data or Data Dictionary The database schema, along with various constraints on the data, is
stored by DBMS in a database catalogue or dictionary, called meta-data. Meta-data is data about the
data.

Database Instance When we define the database structure or schema, the database's state is empty,
i.e., no data entry is present. After loading data, the state or snapshot of the database at any given time
is the database instance. We may then retrieve data through queries or manipulate data through

Unit: 1 - Introduction to DBMS 13


DCA1208: Database Management System

updation, modification, or deletion. Thus, the database's state can change; thus, a database schema
can have many instances at different times.

Query A query is a request to a database to obtain information in a desired way. A query can get data
from one table or a combination of tables. For example, “find names of all those students present on
Attendance Date 2000-01-02” is a query to the database. The user must write a query using a query
language to retrieve or manipulate data.

Data Manipulation Modification of the database consists of three operations viz. Insertion, Deletion
or Update. Suppose Rivaan joins as a new student in the class then the student details need to be
added in STUDENT as well as in GUARDIAN files of the Student Attendance database. This is called
Insertion operation on the database. If a student leaves the school, his/her data and guardian details
must be removed from the STUDENT, GUARDIAN and ATTENDANCE files, respectively. This is called
the Deletion operation on the database. Suppose Atharv’s Guardian has changed his mobile number;
his GPhone should be updated in the GUARDIAN file. This is called the Update operation on the
database.

Database Engine Database engine is the underlying component or set of programs a DBMS uses to
create a database and handle various queries for data retrieval and manipulation.

5.3. Database System Applications


Databases are widely used. Here are some of the representative applications:
• Universities: Databases serve as foundational tools across various sectors, facilitating the
efficient management and retrieval of vast amounts of structured data. In universities, databases
are instrumental in handling student information, managing course registrations, scheduling
exams, and recording grades. These systems streamline administrative tasks and enhance
student services by providing centralised access to academic records.
• Credit card transactions: These transactions are for purchases on credit cards and the
generation of monthly statements and payments.
• Banking: For accounts and loans, customer information, and banking transactions
online/offline.
• Finance: This stores information regarding financial instrument holdings, sales, and purchases,
such as stocks and bonds.

Unit: 1 - Introduction to DBMS 14


DCA1208: Database Management System

• Sales: Databases store essential information about customers, products, and purchases in the
sales and retail sectors. These systems support sales analytics, inventory management, and
personalised customer experiences by tracking buying patterns and preferences, optimising
marketing strategies, and operational efficiency.
• Telecommunication: Telecommunication companies rely on databases to manage vast
volumes of data related to customer calls, prepaid card balances, monthly billing, and network
infrastructure. These systems ensure seamless communication services, enabling accurate
billing and customer support across global networks.
• Manufacturing: In manufacturing, databases oversee inventory management across
warehouses, track supply chain operations, monitor production processes, and manage orders
for various products. This integration optimises manufacturing workflows, enhances inventory
accuracy, and supports efficient logistics and distribution strategies.
• Human resources: Human resources departments utilise databases to maintain employee
records, manage payroll taxes, administer benefits, and process payroll. These systems
streamline HR operations, ensure compliance with employment regulations, and facilitate
timely and accurate payroll processing, thereby supporting employee satisfaction and
organisational efficiency.
• Airlines: Airlines pioneered geographically distributed databases, enabling terminals
worldwide to access centralised systems for reservations, cancellations, and schedule
management. This approach revolutionised global travel operations by ensuring real-time
updates and seamless passenger service across diverse locations.

So, we can say that databases are essential to almost all enterprises today. During the last four
decades of the twentieth century, the useof databases grew in all enterprises. Few people engaged
directly with database systems in the early days, but they did so indirectly through printed reports
such as credit card statements or through agents such as bank tellers and airline reservation agents
– without recognising it. Then automated teller machines came along and let users interact directly
with databases. Phone interfaces to computers also facilitated users to deal with databases directly.
For example – a caller could dial a number and press phone keys to enter information or select
alternative options to find flight arrival/departure times.

The Internet revolution sharply increased direct user access to databases during the late 1990s.
Organisations converted many phone interfaces to databases into Web interfaces and facilitated

Unit: 1 - Introduction to DBMS 15


DCA1208: Database Management System

different online services and information. For example, when you access a website, information about
you may be retrieved from a database to selectwhich advertisements should be shown.

When you explore a book or music collection in an online retailer, you access data stored in a database.

When you place an online order, it is saved in a database.

The data is pulled from the bank's database system when you visit a bank's website to get your
account balance and transaction history. Moreover, data related to Web access may be stored in a
database. Thus, although user interfaces hide details of access to a database, and most people are
unaware they are dealing with a database, accessing databases forms an essential part of almost
everyone’s life these days. The importance of database systems can be judged in another way – today,
database system vendors like Oracle are among the largest software companies in the world, and
database systems form an essential part of the product line of more diversified companies like IBM
and Microsoft.

Unit: 1 - Introduction to DBMS 16


DCA1208: Database Management System

6. SUMMARY
A Database Management System (DBMS) is a software system that enables users to define, create,
maintain, and control access to databases. It provides an efficient and systematic way to manage data,
allowing for data storage, retrieval, and manipulation in a structured manner. DBMSs are essential
for managing large volumes of data and ensuring that data is accurate, consistent, and secure. They
form the backbone of various applications across different industries, enhancing data management
and operational efficiency.

The database approach offers several key characteristics that distinguish it from traditional file-based
systems. These include data abstraction, which hides the complexity of data structures; data
independence, allowing changes at one level without affecting others; and reduced data redundancy
through normalisation. The database also enhances data integrity, security, and concurrent access,
enabling multiple users to interact with the data simultaneously without conflicts. These
characteristics collectively improve data management, accessibility, and reliability.

A DBMS is a collection of programs that facilitates database creation, maintenance, and use. It serves
as an intermediary between users and the database, ensuring data is stored efficiently and quickly
retrieved. DBMSs support various functions, including defining data structures, querying data,
updating records, and managing transactions. Popular examples of DBMSs include MySQL, Oracle,
SQL Server, and PostgreSQL, each offering different features tailored to specific requirements.

The importance of DBMSs lies in their ability to provide a systematic and efficient way to handle data.
They ensure data integrity, security, and consistency, which are critical for decision-making and
operational processes. DBMSs also facilitate data sharing among multiple users and applications,
support concurrent data access, and provide robust backup and recovery mechanisms. By minimising
data redundancy and optimising resource utilisation, DBMSs reduce the overall data management
cost and enhance applications' performance and scalability.

DBMSs consist of several key components, including the database engine, schema, data definition
language (DDL), data manipulation language (DML), data control language (DCL), and database
utilities. Key concepts in DBMS include data models, schemas, instances, keys, and normalisation.
DBMS applications span various domains, such as banking, telecommunications, education, and e-
commerce, where efficient data management is crucial. Understanding these components and

Unit: 1 - Introduction to DBMS 17


DCA1208: Database Management System

concepts is essential for leveraging the full potential of DBMSs in building robust and scalable
database systems.

Unit: 1 - Introduction to DBMS 18


DCA1208: Database Management System

7. GLOSSARY

A DBMS is software that interacts with end users, applications, and the

DBMS (Database database to capture and analyse data. It provides an interface for users to

Management - create, retrieve, update, and manage data efficiently. DBMSs ensure data
System) integrity, security, and consistency, allowing multiple users to interact
with the database concurrently.

A database is an organised collection of data stored and accessed


electronically from a computer system. Databases can be complex,
Database - comprising many tables, views, and other objects that store data in a

structured format, supporting easy retrieval and manipulation.

Concurrency in databases refers to the ability of the DBMS to allow


multiple users to access and manipulate data simultaneously without

Concurrency - causing inconsistencies or conflicts. It ensures that transactions are


executed in a way that preserves data integrity and isolation, often using
mechanisms like locking and transaction management.

Metadata is data that describes other data. In the context of databases,


metadata includes information about the structure of the database, such

Metadata - as the schema, tables, columns, data types, constraints, and relationships.
Metadata is stored in a data dictionary or catalog and is crucial for the
DBMS to manage and use the data effectively.

A backup is a copy of database data taken to ensure that it can be restored


in case of data loss, corruption, or system failure. Backup processes can be

Backup - automated and are essential for data recovery and business continuity.
Depending on the backup strategy, they can be full, incremental, or
differential.

Redundancy in databases refers to unnecessary data duplication within


Redundancy - the database. While some redundancy can benefit performance and
reliability, excessive redundancy can lead to data anomalies and increased

Unit: 1 - Introduction to DBMS 19


DCA1208: Database Management System

storage costs. Techniques like normalisation are used to reduce


redundancy and ensure data integrity.

Data sharing in a DBMS allows multiple users and applications to access


and use the data concurrently. This capability supports collaborative

Data Sharing - work environments and integrated operations across different


departments and systems, ensuring users have the necessary data for
their tasks.

SQL is a standardised programming language used to manage and


manipulate relational databases. It includes commands for querying,
SQL (Structured
- updating, inserting, and deleting data and defining and managing the
Query Language) database schema. SQL is the primary language used in DBMSs to interact
with the data.

A schema is the overall logical structure of a database. It defines how data


is organised, including tables, columns, data types, relationships, indexes,
Schema - and constraints. The schema serves as a blueprint for the database,

dictating how data is stored, accessed, and manipulated.

The database engine is the core component of a DBMS and is responsible


for storing, retrieving, and managing data. It handles query processing,

Database Engine - transaction management, and data storage, ensuring operations are
performed efficiently and accurately. The engine includes mechanisms for
indexing, caching, and ensuring ACID properties.

Unit: 1 - Introduction to DBMS 20


DCA1208: Database Management System

8. SELF-ASSESSMENT QUESTIONS

SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 What is a database?
a) A collection of interrelated stored data facilitating the requirements of several users within
one or more organisations.
b) A single table containing all the data for an organisation.
c) A software system designed for creating spreadsheets.
d) A programming language used to manage data.
2 What is the smallest identified unit of data that has value in the real world?
a) Record
b) Data item
c) File
d) Database
3 What is a record?
a) A collection of several records of a single type.
b) The smallest identified unit of data.
c) A group of related data items considered a single unit by an application.
d) A logically organised collection of data with some meaning.
4 Which of the following is an example of a record type?
a) Last name
b) Street address
c) Salesperson
d) ID number
5 What does a database system contain besides the data itself?
a) Only the data
b) Only metadata
c) Data and metadata that defines and describes the data and relationships between tables
d) Only application programs

Unit: 1 - Introduction to DBMS 21


DCA1208: Database Management System

6 Which characteristic of a database approach allows different users to have customised views
of the data?
a) Self-describing nature
b) Insulation between program and data
c) Support for multiple views of data
d) Control of data redundancy
7 What is the main software component of a database system?
a) Database
b) Data
c) Database Management System (DBMS)
d) Application programs
8 Which of the following is NOT a capability of most DBMSs?
a) Creating a file
b) Deleting data
c) Sorting or indexing data
d) Building computer hardware
9 What is the role of hardware in a DBMS environment?
a) Provides procedures for managing the database
b) Stores and retrieves data from the database
c) Contains physical and electronic devices used to store and access the database
d) Defines the structure of the database schema
10 Which component of DBMS is responsible for handling the overall computerised database?
a) Procedures
b) Software
c) Users
d) Data
11 What is metadata in a DBMS context?
a) Actual data stored in the database
b) Hardware used to store the database
c) Data about the data, including database schema and constraints
d) Users who manage and operate the DBMS
12 Who are the end-users in a database system?

Unit: 1 - Introduction to DBMS 22


DCA1208: Database Management System

a) Application programmers
b) Database administrators
c) Users responsible for authorising access to the database
d) People whose jobs require access to the database for querying and updating
13 What is a database schema?
a) A set of procedures for managing the database
b) A collection of data stored in the database
c) Data about the data stored in the database
d) The design of a database, including structure and relationships
14 What does a database instance refer to?
a) Data stored in the database
b) The design of the database schema
c) A snapshot of the database state at a specific time
d) Metadata stored in the database
15 What is a query in the context of a DBMS?
a) A request to obtain information from the database
b) A type of database schema
c) Data constraints applied to columns
d) A procedure for manipulating data
16 Which operation is NOT part of data manipulation in a DBMS?
a) Insertion
b) Deletion
c) Update
d) Sorting
17 What is the function of a database engine in a DBMS?
a) It manages user access to the database
b) It stores metadata about the database
c) It handles queries for data retrieval and manipulation
d) It defines the structure of the database schema
18 Which of the following is an example of a database application?
a) Programming languages
b) Operating systems

Unit: 1 - Introduction to DBMS 23


DCA1208: Database Management System

c) Internet browsers
d) Banking transactions
19 How did the Internet revolution impact database usage?
a) Reduced user access to databases
b) Increased indirect interaction with databases
c) Converted phone interfaces to Web interfaces for database access
d) Decreased the role of database vendor
20 What role do automated teller machines (ATMs) play in database usage?
a) They restrict direct user access to databases
b) They facilitate direct interaction with databases
c) They provide physical storage for database servers
d) They store metadata about the database

Unit: 1 - Introduction to DBMS 24


DCA1208: Database Management System

9. TERMINAL QUESTIONS
1. Explain the concept of a database with an example.
2. Why is a DBMS important in modern data management?
3. Provide an overview of a DBMS and its main components.
4. What are the representative applications of Databases? List them.
5. Define a Database.
6. List the characteristics of the Database Approach.
7. Differentiate between Database Schema vs. Database Instance.
8. Explain Metadata.
9. What do you mean by Data Manipulation
10. Define Data constraint.

Unit: 1 - Introduction to DBMS 25


DCA1208: Database Management System

10. ANSWERS

10.1. Self-Assessment Answers


1. a A collection of interrelated stored data facilitating the requirements of several users within
one or more organisations.
2. b Data item
3. c A group of related data items considered a single unit by an application.
4. c Salesperson
5. c Data and metadata that defines and describes the data and relationships between tables
6. c Support for multiple views of data
7. c Database Management System (DBMS)
8. d Building computer hardware
9. c Contains physical and electronic devices used to store and access the database
10. b Software
11. c Data about the data, including database schema and constraints
12. d People whose jobs require access to the database for querying and updating
13. d The design of a database, including structure and relationships
14. c A snapshot of the database state at a specific time
15. a A request to obtain information from the database
16. d Sorting
17. c It handles queries for data retrieval and manipulation
18. d Banking transactions
19. c Converted phone interfaces to Web interfaces for database access
20. b They facilitate direct interaction with databases

10.2. Terminal Answers


Answer 1: A database represents some aspect of the real world, sometimes called the mini-world or
universe of discourse (UoD). (ii) A database is a logically coherent data collection with some inherent
meaning. (Refer to section 2 for more details)

Answer 2: DBMS is a general-purpose software system that facilitates the processes of defining,
constructing, manipulating, and sharing databases among various users and applications (Refer to
section 4 for details)

Unit: 1 - Introduction to DBMS 26


DCA1208: Database Management System

Answer 3: A database management system (DBMS) consists of several components. (Refer to


section 5.1)

Answer 4: List of representative applications of Databases: Banking, Airlines, Universities, Credit


card transactions, Telecommunication, etc. (Refer to section 5.3 for details.)

Answer 5: A database is a structured collection of related data stored electronically in a computer


system. (Refer to section 5.3 for details.)

Answer 6: The database approach's characteristics include data independence, efficient data access,
data integrity, and support for multiple users. (Refer to section 5.3 for details.)

Answer 7: A database schema defines the structure of a database, including tables, fields, and
relationships. In contrast, a database instance is a snapshot of the data stored in the database at a
particular moment. (Refer to section 5.3 for details.)

Answer 8: Metadata refers to data about the data stored in a database, including definitions,
constraints, and relationships. (Refer to section 5.3 for details.)

Answer 9: Data manipulation involves modifying the data stored in a database through operations
like insertion, deletion, and updating records. (Refer to section 5.3 for details.)

Answer 10: A data constraint is a rule defined on a database column that restricts the type or range
of data stored in that column. (Refer to section 5.3 for details.)

Unit: 1 - Introduction to DBMS 27


DCA1208: Database Management System

11. REFERENCES

• Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.

• Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,


McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631

• Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,


McGraw-Hill Education, 2019, ISBN-10: 0073523321

Unit: 1 - Introduction to DBMS 28


BACHELOR OF COMPUTER
DCA1208: Database Management System

APPLICATION
SEMESTER 2

DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 2 - Database Systems 1
DCA1208: Database Management System

Unit - 2
Database Systems

DCA324
KNOWLEDGE MANAGEMENT
Unit: 2 - Database Systems 2
DCA1208: Database Management System
TABLE OF CONTENTS

SL Fig No / Table SAQ /


Topic Page No
No / Graph Activity

1 Introduction - -
4-5
1.1 Objectives - -

2 Types of Database Systems 1 -


2.1 Hierarchical Databases 2 -
2.2 Network Databases 3 -
2.3 Relational Database Management 4 -
System 6-19
2.4 Centralised Database Systems 5 -
2.5 Distributed Databases 6 -
2.6 Object-Oriented Databases 7 -
2.7 NoSQL Database Systems 8 -

3 Advantages and Disadvantages of DBMS - - 20-21

4 Summary - - 22

5 Glossary - - 23

6 Self-Assessment Questions - 1 24-26

7 Terminal Questions - - 27

8 Answers - -
8.1 Self-Assessment Questions - - 28-30
8.2 Terminal Questions - -

9 References - - 31

Unit: 2 - Database Systems 3


DCA1208: Database Management System

1. INTRODUCTION
In the previous unit, we learned that a database is a structured data collection organised for efficient
retrieval and management. It is a centralised repository where information is stored, organised, and
accessed according to defined relationships. The database approach emphasises structured data
storage, ensuring data integrity, security, and efficient querying.

Database Management System (DBMS) is software that facilitates database creation, manipulation,
and administration. It ensures data consistency, provides data security, and supports concurrent
access by multiple users. DBMS is crucial in modern information systems because it manages data
efficiently, improves data sharing, and supports decision-making processes.

Key components of DBMS include data definition, manipulation, and control. It comprises tables,
queries, forms, and reports to interact with data. Important concepts in DBMS include ACID
properties for transaction reliability and normalisation for data integrity.

DBMS finds application in various domains, from business operations (ERP, CRM) to scientific
research, enabling efficient data management, analysis, and retrieval for informed decision-making.

In this unit, we will understand the various types of database systems, which is crucial for gaining a
comprehensive knowledge of how data is organised, stored, and managed across different
applications and industries.

Learners can grasp the fundamental differences in data models, query languages, scalability, and use
cases by studying relational database management systems (RDBMS), NoSQL databases, object-
oriented databases, and others.

We will also be able to identify each type's strengths and weaknesses, recognise scenarios where one
type may be more suitable than others, and evaluate the impact of choosing a particular database
system on application performance and data integrity.

We will also explore the advantages and disadvantages of database management systems (DBMS),
equipping learners with a critical understanding of centralised data management's benefits and
potential challenges. Conversely, we will explore the drawbacks, including complexity in database
design, the potential for performance bottlenecks, and scalability limitations in traditional DBMS.

Unit: 2 - Database Systems 4


DCA1208: Database Management System

Moreover, we will assess the role of transaction management, concurrency control, and backup and
recovery mechanisms in ensuring data consistency and availability within a DBMS framework.

1.1. Objectives
• Identify the Types of Database Systems
• List the Advantages and Disadvantages of DBMS
• Outline the applications for the different types of
Database systems

Unit: 2 - Database Systems 5


DCA1208: Database Management System

2. TYPES OF DATABASE SYSTEMS


Database systems are fundamental in storing, organising, and managing data efficiently. There are
several types of database systems, each designed to meet different needs and requirements. RDBMS
are known for their ACID (Atomicity, Consistency, Isolation, Durability) properties, which ensure
reliable transactions.

Types of
Database
Systems

Centralised
Hierarchical Network Relational Distributed Object-Oriented NoSQL Database
Database
Databases Databases Databases Databases Databases Systems
Systems

Fig.1 Types of Database Systems

2.1. Hierarchical Databases


Hierarchical databases are a type of database management system where data is organised in a tree-
like structure, following a hierarchy of parent and child records. In this model, each record has a single
parent, but can have multiple children, creating a one-to-many relationship between the data points.
This structure is reminiscent of a family tree or an organisational chart, where the root represents
the starting point, and branches represent the connections to other records.

Unit: 2 - Database Systems 6


DCA1208: Database Management System

Fig.2 Hierarchical database systems

In the given diagram, Electronics serves as the root node and has two child nodes, namely Televisions
and Portable Electronics. These two individuals have more children for whom they serve as parents.
For instance, television is represented by several types such as tube, LCD, and plasma, and these three
types can be considered as the parent forms of television. It exhibits a one-to-many relationship.

Let us look at the Key Characteristics:


• Tree-Like Structure: The data is organised hierarchically with a single root and multiple levels
of children. This structure makes it easy to understand and navigate for specific types of queries.
• Parent-Child Relationships: Each child record is linked to one parent, but a parent can have
multiple children. This relationship model is straightforward but can be limiting if the data
requires more complex many-to-many relationships.
• Navigational Access: Data retrieval in hierarchical databases often relies on predefined paths.
You need to navigate through the hierarchy, often using pointers or references, to access specific
data points.
• Fast Data Access: Due to the structured nature of the hierarchy, accessing records that follow
the parent-child path is efficient. This can be particularly advantageous for applications that
require frequent access to hierarchical data, like organisational structures or file systems.
• Rigid Structure: Hierarchical databases are less flexible than other types, such as relational
databases. If the structure needs to change, significant modifications may be required, making it
less adaptable to evolving data requirements.
• Data Redundancy: Since each child has only one parent, data that needs to be related to
multiple entities may need to be duplicated, leading to redundancy and potential inconsistency.

Let us have a look at a few Use Cases:


• Early Mainframe Systems: Hierarchical databases were widely used in early mainframe
systems, where the simplicity and performance of the model were prioritised.
• Legacy Applications: Many legacy applications, especially those that manage data in a
structured, hierarchical manner (like bill of materials, organisational charts, and directory
services), still use hierarchical databases.

Unit: 2 - Database Systems 7


DCA1208: Database Management System

A few Examples include:


• IBM Information Management System (IMS): Developed by IBM in the 1960s, IMS is one of
the most well-known hierarchical database management systems. It was initially created for the
Apollo space program to manage large volumes of data and has since been widely adopted in
industries like banking and telecommunications.
• Windows Registry: While not a traditional database, the Windows Registry is a hierarchical
database that stores configuration settings and options for Microsoft Windows operating
systems.

2.2. Network databases, a database management system, organise data using a network
model that allows each record to have multiple parent and child records, forming complex
relationships. Unlike hierarchical databases with a strict tree-like structure, network databases
permit nodes to have multiple parent nodes, facilitating more flexible and interconnected data
relationships.

Network databases were popular in the early days of database management

systems for their ability to effectively model complex data relationships.

However, relational databases have largely superseded them due to their complexity and difficulty
managing data integrity. Despite this, network databases still find niche applications where their
unique structure is advantageous, such as in certain engineering and telecommunications systems.

The Characteristics of network databases include:


• Data Structure: Organises data in a graph-like structure where nodes (records) can have
multiple parent and child nodes, allowing many-to-many relationships.
• Record Types: Supports different types of records with complex relationships defined through
pointers or links.
• Hierarchical Flexibility: More flexible than hierarchical databases, as nodes can have multiple
parents (owners) and children, enabling more complex and interconnected data models.
• Schema Flexibility: Offers schema flexibility by allowing dynamic changes and evolution of data
relationships without the rigid structure of hierarchical databases.
• Navigational Access: Utilises navigational access methods where data access involves
traversing links or pointers between related records.

Unit: 2 - Database Systems 8


DCA1208: Database Management System

• Concurrency Control: Manages concurrent access to data through locking mechanisms to


maintain data integrity.
• Indexing: Supports indexing for efficient traversal and retrieval of records within the network
structure.

Fig.3 Network databases

A few examples include:


• Integrated Data Store (IDS): This network database management system (DBMS) is known for
its flexibility in handling complex data relationships and supporting various data types. It's used
in applications requiring extensive data modelling capabilities.
• Raima Database Manager: is a database management system that supports network and
relational data models. It provides tools for managing complex data structures and optimising
performance in embedded systems and IoT applications.

Some applications include:


1. Telecommunications Networks: Network databases are well-suited for managing complex
relationships between various elements in telecommunications networks. They can store
information about network nodes, connections, and configurations, facilitating efficient network
management and troubleshooting.

2. Engineering and CAD/CAM Systems: In engineering applications where complex data


relationships are prevalent (e.g., parts assemblies, bill of materials), network databases can
efficiently model these relationships. CAD/CAM systems use network databases to manage
design data, parts libraries, and engineering specifications.

Unit: 2 - Database Systems 9


DCA1208: Database Management System

3. Scientific Research: Research domains such as biology, chemistry, and physics often deal with
highly interconnected data, such as gene interactions, chemical compounds, and experimental
results. Network databases can store and retrieve this data, supporting complex queries and
data analysis.

4. Knowledge Management Systems: Network databases may benefit systems that organise and
manage knowledge bases, such as libraries, archives, and digital repositories. They can handle
relationships between different types of knowledge entities, facilitating semantic queries and
information retrieval.

5. Hierarchical Data Structures: Applications with hierarchical or tree-like data structures,


where entities can have multiple parents or children, can utilise network databases effectively.
This includes organisational charts, project management systems, and directory structures.

2.3. A Relational Database Management System (RDBMS) is a type of

database management system that stores and provides access to data points related to one another.
RDBMSs use a structure that allows users to identify and access data with another piece of data in the
database. This type of system is based on the relational model, which organises data into tables (also
known as relations) consisting of rows and columns. Each table represents a collection of related data
entries, and each row in a table represents a unique record with a set of attributes defined by the
columns.

Fig.3 Relational Database Management Systems

Unit: 2 - Database Systems 10


DCA1208: Database Management System

Let us see the Characteristics:


• Structured Data Storage: Data is organised in tables (relations) containing rows and columns.
Each row represents a record, and each column represents a field within the record.
• Schema-Based: RDBMS requires a predefined schema, a formal structure defining the tables,
fields, data types, and relationships between tables.
• ACID Properties: Ensures Atomicity, Consistency, Isolation, and Durability for transactions,
making it reliable for critical applications.
• SQL (Structured Query Language): SQL is used for querying and managing the database. It
allows for complex queries, data manipulation, and data definition.
• Normalisation: Supports data normalisation to eliminate redundancy and improve data
integrity.
• Foreign Keys and Relationships: Enforces data integrity and relationships between tables
through foreign keys.
• Indexing: Supports indexing to improve query performance by allowing faster retrieval of
records.
• Concurrency Control: Manages concurrent access to data by multiple users, ensuring data
consistency and isolation.
• Backup and Recovery: Provides data backup and recovery mechanisms to protect against data
loss.

A Few Examples include:


• MySQL: An open-source RDBMS known for its speed, reliability, and ease of use. It is widely used
in web applications and is the default database system for many web hosting services.
• PostgreSQL: An advanced open-source RDBMS supporting advanced features such as custom
data types, full-text search, and geographic information systems (GIS). Known for its robustness
and extensibility.
• Oracle Database: A powerful and highly scalable commercial RDBMS widely used in enterprise
environments. It offers advanced features like partitioning, clustering, and advanced security.
• Microsoft SQL Server: A commercial RDBMS developed by Microsoft, known for its integration
with other Microsoft products and services. It is widely used in business applications and
enterprise environments.

Unit: 2 - Database Systems 11


DCA1208: Database Management System

Some of the key applications include:


• Transactional Systems: RDBMSs are crucial for systems that require reliable transaction
processing, such as banking systems, e-commerce platforms, and online reservation systems.
They ensure ACID (Atomicity, Consistency, Isolation, Durability) properties, essential for
maintaining data integrity during transactions.
• Enterprise Applications: Many large-scale enterprise applications rely on RDBMSs to store and
manage critical business data. This includes ERP (Enterprise Resource Planning) systems, CRM
(Customer Relationship Management) systems, and HR management systems, where data
consistency and reliability are paramount.
• Data Warehousing: RDBMSs are used in data warehousing environments to support analytical
processing and decision-making. They facilitate data extraction, transformation, and loading
(ETL) from multiple sources into a central data warehouse, where it can be queried and analysed
efficiently.
• Content Management: Content management systems (CMS) and document management
systems often use RDBMSs to store and retrieve structured content, such as articles, documents,
images, and multimedia files. The relational model's flexibility and querying capabilities are
beneficial for organising and accessing content based on various criteria.
• Financial Services: RDBMSs are extensively used in financial services for managing financial
transactions, customer accounts, and regulatory reporting. They support complex queries and
reporting requirements, ensuring data accuracy and compliance with financial regulations.

2.4. Centralised database systems provide a straightforward and reliable approach to


managing data in environments where data integration, consistency, and security are paramount.
They refer to a database architecture where all data is stored and managed in a single location or
server. However, they may face challenges related to scalability and dependency on a single point of
failure (the centralised server).

Unit: 2 - Database Systems 12


DCA1208: Database Management System

Fig.5 Centralised database systems

The Characteristics of Centralised database systems are as follows:


• Single Point of Control: All data management activities, including storage, retrieval, and
maintenance, are centralised and managed from a single location or server.
• Data Consistency: Ensures data consistency across all applications and users accessing the
centralised database, as there is only one authoritative data source.
• Simplified Management: It is easier to manage and administer than distributed or
decentralised architectures, as there is no need to coordinate data synchronisation across
multiple locations or servers.
• Data Security: Centralised security measures can be implemented more effectively to protect
data integrity and prevent unauthorised access.
• Scalability Challenges: A single server handling all data operations and requests may pose
scalability limitations, which can become a bottleneck as data volume and user load increase.
• Network Dependency: Data access depends on network availability and latency, as all requests
must go through the centralised server.

Examples
• Small to Medium Enterprises (SMEs): Many small businesses use centralised database
systems to simplify operations and lower operational costs.
• Legacy Systems: Older systems and applications designed before distributed architectures
often use centralised database systems.

Unit: 2 - Database Systems 13


DCA1208: Database Management System

2.5. Distributed databases are systems in which data is stored across multiple physical

locations, potentially spanning different geographical areas. This setup improves data availability,
reliability, and performance by decentralising the data storage process. Each node in a distributed
database can independently manage and update its subset of the data, ensuring consistency and
synchronisation through various protocols and algorithms.

Fig.6 Distributed databases

Let us see the Characteristics:


• Geographical Distribution: Data is distributed across multiple physical locations, often
spanning different regions or continents.
• Replication and Partitioning: This technique uses techniques like data replication (copying
data to multiple nodes) and data partitioning (dividing data into subsets) to ensure high
availability and scalability.
• Fault Tolerance: It is designed to withstand failures of individual nodes or entire data centres
without compromising data integrity or availability.
• Consistency Models: Offers various consistency models (e.g., eventual consistency, strong
consistency) depending on the application's requirements and the distributed nature of the
database.
• Concurrency Control: Manages concurrent access to distributed data to maintain consistency
and prevent conflicts.
• Data Locality: Optimises data access by storing data closer to where it is needed, reducing
latency and improving performance.
• Global Transactions: This feature supports distributed transactions that span multiple nodes
or regions, ensuring atomicity and isolation across the distributed system.

Unit: 2 - Database Systems 14


DCA1208: Database Management System

• Scalability: Scales horizontally by adding more nodes to the distributed system, allowing it to
handle increasing volumes of data and requests.

A few Examples of Distributed databases:


• Google Spanner: A globally distributed, horizontally scalable RDBMS developed by Google. It
offers strong consistency and ACID transactions across multiple regions, making it suitable for
global applications requiring high availability and data integrity.
• Amazon Aurora: A MySQL and PostgreSQL-compatible relational database service offered by
Amazon Web Services (AWS). It provides high availability and performance by replicating data
across multiple Availability Zones within a region and supports automatic scaling to handle
varying workloads.

The Key applications include:


1. Global Enterprises: Large multinational corporations often deploy distributed databases to
manage their operations across regions or countries. This allows them to maintain local data
centres for compliance with regional data privacy laws while ensuring global access to
consistent and up-to-date information.
2. E-commerce and Retail: Online retailers and e-commerce platforms use distributed databases
to handle massive volumes of transactional data, customer interactions, and inventory
management. By distributing data closer to customers, they reduce latency and improve
responsiveness, enhancing user experience.
3. Telecommunications: Telecom companies rely on distributed databases to manage subscriber
data, call records, billing information, and network configurations across distributed locations.
This ensures fast data access, supports real-time processing of network events, and enables fault
tolerance for uninterrupted service.
4. Banking and Finance: Financial institutions leverage distributed databases for secure and
efficient management of customer accounts, transactions, loans, and compliance data.
Distributed architectures enable disaster recovery, ensuring continuous availability of financial
services.
5. Healthcare: Distributed databases in healthcare facilitate the storage and retrieval of patient
records, medical histories, diagnostic results, and treatment plans across multiple hospitals,
clinics, and research facilities. This enables healthcare providers to collaborate, share
information securely, and deliver timely patient care.

Unit: 2 - Database Systems 15


DCA1208: Database Management System

6. IoT (Internet of Things): IoT applications generate vast amounts of data from interconnected
devices and sensors. Distributed databases support IoT platforms by collecting, storing, and
analysing real-time data streams from distributed endpoints, enabling insights for predictive
maintenance, smart city applications, and industrial automation.

2.6. Object-oriented databases (OODBs) represent a database management system


(DBMS) designed to store and manage data as objects rather than traditional rows and columns. In
OODBs, data is encapsulated within objects containing attributes (data fields) and methods
(procedures associated with the object). This approach allows for a more natural representation of
complex data structures, supporting inheritance, encapsulation, and polymorphism—key principles
of object-oriented programming. OODBs are well-suited for applications with inherently complex
data structures, such as engineering simulations, CAD/CAM systems, and multimedia databases.

The Characteristics of OODB include:


• Object Storage: Stores data as objects, like how objects are defined in object-oriented
programming (OOP). Objects encapsulate data (attributes or fields) and behaviour (methods or
functions) into a single entity.
• Complex Data Structures: Supports complex data structures and relationships, including
inheritance, polymorphism, and encapsulation.
• Schema Flexibility: Provides flexibility in data schema, allowing objects to evolve
independently without affecting the overall database structure.
• Querying: Supports object-oriented query languages or extensions to SQL that allow for
navigating object relationships and querying object attributes directly.
• Persistence: Maintains object identity and relationships persistently, allowing objects to be
stored, retrieved, and manipulated over time.
• Programming Language Integration: Often integrates closely with programming languages
that support OOP principles, enabling seamless interaction between application code and
database objects.
• Concurrency Control: Handles concurrent access to objects, ensuring data integrity and
consistency in multi-user environments.

Unit: 2 - Database Systems 16


DCA1208: Database Management System

Manag Employe
er e

Fig.7 object-oriented databases

A few Examples include:


• db4o: An open-source object-oriented database for Java and .NET platforms. It stores objects
natively without requiring object-relational mapping (ORM) techniques, providing fast object
access and persistence.
• ObjectDB: A commercial object-oriented database management system (OODBMS) for Java
applications. It supports JPA (Java Persistence API) standards and offers high performance and
scalability for complex object models.

Some key applications include:


1. Engineering and CAD/CAM Systems: OODBs are well-suited for managing complex
engineering data, such as CAD/CAM (Computer-Aided Design/Computer-Aided Manufacturing)
systems. These systems deal with intricate geometric models, assemblies, and product
structures that benefit from object-oriented modelling, inheritance, and polymorphism.

2. Multimedia Databases: Multimedia applications, including digital libraries, multimedia


content management systems, and entertainment databases, leverage OODBs to store and
retrieve diverse multimedia data types (e.g., images, videos, audio) along with associated
metadata and relationships.

3. Telecommunications and Network Management: OODBs are used in telecommunications to


manage network configurations, performance data, and service provisioning. Network

Unit: 2 - Database Systems 17


DCA1208: Database Management System

management systems benefit from OODBs' ability to model complex network topologies and
relationships among network elements.

4. Scientific and Research Applications: Scientific databases, such as those used in


bioinformatics, astronomy, and environmental sciences, often require storing and querying
complex data structures. OODBs support the representation of hierarchical and interconnected
data entities in these domains.

5. E-commerce and Personalisation Systems: E-commerce platforms and recommendation


engines utilise OODBs to manage customer profiles, product catalogues, transaction histories,
and personalised recommendations. OODBs can efficiently model relationships between users,
products, and preferences.

2.7. NoSQL (Not only SQL) database systems are a diverse category of databases
that depart from the traditional relational database management systems (RDBMS). They are
designed to address the limitations of RDBMS and cater to modern application requirements,
particularly those involving large-scale data storage and real-time web applications.

NoSQL Database Table

Disk 1 Disk 2 Disk 3

Fig.8 NoSQL database systems

Here are the characteristics:


• Schema Flexibility: NoSQL databases are schema-less or schema-flexible, allowing for dynamic
and unstructured data models. They can handle semi-structured and polymorphic data more
effectively than RDBMS.
• Horizontal Scalability: Designed to scale horizontally across multiple servers or nodes,
allowing them to handle large volumes of data and high transaction rates efficiently.
• High Availability: Many NoSQL databases are designed with built-in replication and
partitioning mechanisms, ensuring data availability and fault tolerance even during hardware
failures or network partitions.

Unit: 2 - Database Systems 18


DCA1208: Database Management System

• Types of NoSQL Databases:


a) Document Stores: Store data in flexible, JSON-like documents. Examples: MongoDB, CouchDB.
b) Key-Value Stores: Store data as key-value pairs. Examples: Redis, DynamoDB.
c) Column Family Stores: Store data in columns rather than rows. Examples: Cassandra, HBase.
d) Graph Databases: Optimise for data with complex relationships using nodes, edges, and
properties. Examples: Neo4j, Amazon Neptune.
• CAP Theorem: NoSQL databases are often designed with different trade-offs in consistency,
availability, and partition tolerance (CAP theorem), allowing developers to choose databases
that best fit their application's needs.

A Few examples include:


• MongoDB: A widely used document-oriented NoSQL database. It stores data in flexible JSON-
like documents, making it suitable for various applications, including content management, real-
time analytics, and mobile apps.
• Redis: A popular key-value store known for its high performance and flexibility. It is often used
for caching, real-time analytics, messaging queues, and session management.
• Cassandra: A column family store designed for scalability and high availability. It manages large
amounts of structured data across multiple commodity servers.
• Neo4j: A leading graph database for managing highly connected data, such as social networks,
recommendation engines, and network and IT operations.

Use Cases include:


• Big Data and Real-Time Analytics: NoSQL databases excel in handling large volumes of data
and supporting real-time data processing and analytics, making them suitable for applications
such as IoT data management, sensor data analysis, and log aggregation.
• Content Management and Personalisation: Document-oriented and key-value NoSQL
databases efficiently store and retrieve unstructured or semi-structured content, supporting
content management systems, blogs, and e-commerce platforms.
• Highly Scalable Applications: NoSQL databases, with their ability to scale horizontally across
distributed clusters, are ideal for applications that require rapid growth and handling of
increasing data volumes, such as social networks, online gaming, and web applications.

Unit: 2 - Database Systems 19


DCA1208: Database Management System

3. ADVANTAGES AND DISADVANTAGES OF DBMS


Advantages and Disadvantages of DBMS

One of the main advantages of using a database system is that the organisation can exert centralised
management and control over the data through the DBA.

The centralised control is focused on the database administrator.

Any application that necessitates a change in the structure of a data record must make prior
arrangements with the DBA, who will make the appropriate changes.

The following are some of the most significant advantages of DBMS:

1. Sharing Data: A database permits data sharing under the control of any number of users or
application programs.
2. Data Redundancy Reduction: Centralised data control by the DBA avoids undesirable data
duplication and effectively reduces the total data storage required.
a. It also reduces the need for further processing to find the relevant data in a huge amount of
data. Another benefit of minimising duplication is eliminating common discrepancies in
redundant data files. Any DBMS redundancies are managed, and the system assures
numerous copies of the same data are consistent.
3. Data Integrity: Centralised control can also ensure the DBMS incorporates sufficient checks to
facilitate data integrity. When discussing data integrity, we're referring to the data's accuracy.

The database's information is both accurate and consistent. As a result, data values entered for
storage could be double-checked to ensure they are within a specific range and in the correct format.

For example, consider the value for the age of an employee may be in the range of 16 and 75.
Another integrity check that should be incorporated into the database is to ensure that if there is a
reference to a certain object, that objectmust exist. In the case of an automatic teller machine, for
example, a user isnot allowed to transfer funds from a non-existent savings account to anexisting one.

4. Data Security: Data is vital for any organisation and may be confidential. Such confidential data
must be protected from unauthorised persons. As the owner of the data in the DBMS, the DBA
can ensure that suitable access protocols are followed, such as proper authentication schemas
for DBMS access and additional checks before granting access to sensitive data.

Unit: 2 - Database Systems 20


DCA1208: Database Management System

Multiple levels of security could be established for different types of data and procedures. Security
enforcement could be data value-dependent (e.g., a manager has access to the salary details of
employees in his or her department only) and data-type-dependent (e.g., the manager cannot access
the medical history ofany employees, including those in his/her department).

5. Conflict Resolution: Since the database is under the DBA's control, she/ he should resolve the
conflicting requirements of various users and applications. In other words, the DBA selects the
appropriate file format and access mechanism for response-critical applications while enabling
less important apps to continue using the database with a delayed response time.

The Disadvantages are as follows:


1. Complexity and Cost: Implementing and maintaining a DBMS can be complex and expensive. It
requires specialised knowledge and expertise to effectively design, deploy, and manage a
database system. Additionally, licensing fees for commercial DBMS software can be costly.
2. Security Concerns: Centralising data in a database can make it a prime target for security
breaches. A single DBMS or misconfiguration vulnerability can lead to unauthorised access, data
leaks, or data manipulation. Protecting against security threats requires robust security
measures and constant monitoring.
3. Data Integrity Risks: Database systems must ensure the integrity of their data, meaning that
data must be accurate, consistent, and reliable. However, data corruption, loss, or
inconsistencies can occur due to software bugs, hardware failures, or human error.
4. Complexity of Backup and Recovery: Managing backups and ensuring data recovery in case
of failures can be complex, especially for large databases. Implementing reliable backup and
recovery strategies is essential to minimise the risk of data loss and ensure business continuity.
5. Resource Consumption: DBMSs consume significant system resources, including CPU, memory,
and storage. As databases grow larger and more complex, resource requirements can escalate,
potentially impacting other applications or services running on the same infrastructure.

Unit: 2 - Database Systems 21


DCA1208: Database Management System

4. SUMMARY
• Database systems vary widely based on structure, functionality, and intended use. Relational
Database Management Systems (RDBMS) store data in tables with rows and columns, offering
strong consistency and SQL-based querying. NoSQL databases, including document stores (like
MongoDB) and key-value stores (like Redis), provide unstructured and distributed data
flexibility.
• In-memory databases like Redis optimise performance by storing data in RAM. Distributed
databases such as Google Spanner ensure high availability and scalability across multiple
locations. Object-oriented databases store data as objects, facilitating complex relationships.
Each type caters to specific needs, balancing scalability, flexibility, and performance.
• DBMS offers several advantages, including centralised data management, ensuring data
integrity through ACID properties, and supporting concurrent access and data security. It
facilitates efficient data retrieval and manipulation, improving productivity and decision-
making.
• However, DBMS requires skilled management, may incur high initial costs for setup and
maintenance, and can be vulnerable to security breaches. It may also face scalability limitations,
particularly in distributed environments. Choosing the right DBMS involves balancing these
factors to meet specific application requirements while considering long-term scalability and
operational efficiency.

Unit: 2 - Database Systems 22


DCA1208: Database Management System

5. GLOSSARY

A Database Management System (DBMS) is software that facilitates the

DBMS - creation, maintenance, and use of databases, providing users and


applications access to stored data.

ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties

ACID - that ensure database transactions are processed reliably, even in the
presence of failures.

Customer Relationship Management (CRM) refers to practices, strategies,


and technologies used to manage and analyse customer interactions and
CRM - data throughout the customer lifecycle to improve relationships and drive

sales growth.

Relational Database Management System (RDBMS) organises data into

RDBMS - structured tables (relations) with rows and columns, enforcing


relationships and ensuring data integrity through constraints.

Structured Query Language (SQL) is a domain-specific language used for

SQL - managing and manipulating relational databases. It allows users to query,


insert, update, and delete data.

Normalisation is organising data in a database to reduce redundancy and

Normalisation - improve data integrity, typically by breaking down large tables into
smaller, related tables.

MongoDB is a NoSQL document-oriented database that stores data in

MongoDB - flexible JSON-like documents. It offers scalability and high performance


for handling unstructured and semi-structured data.

A Database Administrator (DBA) is an IT professional responsible for

DBA - managing and maintaining databases, ensuring their security,


performance, availability, and integrity.

Unit: 2 - Database Systems 23


DCA1208: Database Management System

6. SELF-ASSESSMENT QUESTIONS

SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 What type of data storage does a Relational Database Management System (RDBMS) use?
a) Key-value pairs
b) Documents
c) Tables (relations)
d) Objects
2 Which property of RDBMS ensures reliability by guaranteeing transactions are processed
reliably?
a) ACID
b) BASE
c) CAP
d) JSON
3 Which database system is known for its geographical distribution across multiple physical
locations?
a) Object-Oriented Database
b) Relational Database Management System (RDBMS)
c) Distributed Database
d) Network Database
4 Which characteristic ensures fault tolerance in Distributed Databases?
a) ACID properties
b) Replication and Partitioning
c) Foreign Keys
d) Normalisation
5 Object-oriented databases store data as:
a) Key-value pairs
b) Documents
c) Tables (relations)
d) Objects

Unit: 2 - Database Systems 24


DCA1208: Database Management System

6 Which feature of Object-Oriented Databases supports complex data relationships like object-
oriented programming?
a) ACID Properties
b) Querying
c) Persistence
d) Inheritance and Polymorphism
7 Network Databases allow nodes (records) to have:
a) Multiple children only
b) Multiple parents only
c) Both multiple parents and children
d) No relationships
8 Centralised Database Systems are managed from:
a) Multiple servers
b) A single location or server
c) A cloud environment
d) Geographically distributed locations
9 NoSQL databases are primarily designed to address the limitations of:
a) Object-Oriented Databases
b) Centralised Databases
c) Relational Database Management Systems (RDBMS)
d) Network Databases
10 Which characteristics of NoSQL databases allow for efficient handling of large volumes of data
and high transaction rates?
a) ACID Properties
b) Horizontal Scalability
c) SQL Support
d) Schema Flexibility
11 Which type of NoSQL database stores data in flexible, JSON-like documents?
a) Key-Value Stores
b) Document Stores
c) Column Family Stores
d) Graph Databases

Unit: 2 - Database Systems 25


DCA1208: Database Management System

12 Which advantage of DBMS reduces redundancy and improves data integrity?


a) Data Sharing
b) Data Security
c) Data Redundancy Reduction
d) Conflict Resolution
13 What are the disadvantages of DBMS in terms of complexity and cost?
a) Data Integrity Risks
b) Security Concerns
c) Resource Consumption
d) Centralised Control
14 Which challenge is specific to Centralised Database Systems?
a) Complexity of Backup and Recovery
b) Security Concerns
c) Scalability Challenges
d) Data Consistency
15 Which characteristic of NoSQL databases ensures data availability during hardware failures?
a) ACID Properties
b) High Availability
c) Schema Flexibility
d) Data Normalisation

Unit: 2 - Database Systems 26


DCA1208: Database Management System

7. TERMINAL QUESTIONS
1. Define DBMS
2. Classify the types of Database systems
3. List the characteristics of RDMS
4. Identify a few examples of Distributed databases.
5. Write a short note on Object-oriented databases.
6. Differentiate Network databases vs. Centralised database systems.
7. Explain NOSQL database systems
8. List a few advantages of DBMS
9. Outline a few disadvantages of DBMS.
10. List the applications of Distributed databases.

Unit: 2 - Database Systems 27


DCA1208: Database Management System

8. ANSWERS

8.1. Self-Assessment answers


1. c) Tables (relations)
2. a) ACID
3. c) Distributed Database
4. b) Replication and Partitioning
5. d) Objects
6. d) Inheritance and Polymorphism
7. c) Both multiple parents and children
8. b) A single location or server
9. c) Relational Database Management Systems (RDBMS)
10. b) Horizontal Scalability
11. b) Document Stores
12. c) Data Redundancy Reduction
13. a) Data Integrity Risks
14. c) Scalability Challenges
15. b) High Availability

8.2. Terminal Answers


Answer 1: DBMS (Database Management System) manages databases, allowing users to store,
retrieve, and manipulate data efficiently while ensuring data integrity and security. Refer to section
1 for more details

Answer2: Database systems are classified into several types:


o Relational Database Management Systems (RDBMS)
o Distributed Databases
o Object-Oriented Databases
o Network Databases
o Centralised Database Systems
o NoSQL Database Systems Refer to section 2 for more details

Unit: 2 - Database Systems 28


DCA1208: Database Management System

Answer 3: Characteristics of RDBMS include:


a) Data stored in tables (relations)
b) Structured data storage with rows and columns
c) Schema-based with predefined data structures
d) ACID properties for transaction reliability
e) SQL for querying and managing data
f) Support for data normalisation and indexing. Refer to section 2.1 for more details

Answer 4: Examples of Distributed databases include Google Spanner, Amazon Aurora, Cassandra,
and MongoDB Atlas. Refer to section 2.2 for more details

Answer 5: Object-oriented databases store data as objects, encapsulating attributes and methods,
supporting complex data relationships and object-oriented programming principles like inheritance
and polymorphism. Refer to section 2.3 for more details

Answer 6: Network databases allow nodes (records) to have multiple parent and child nodes,
facilitating complex relationships. In contrast, centralised database systems store all data in a single
location, offering centralised control but potentially facing scalability challenges. Refer to section 2.4
for more details

Answer 7: NoSQL database systems depart from traditional relational databases by offering schema
flexibility, horizontal scalability, and high availability. They include document stores, key-value
stores, column family stores, and graph databases to handle diverse data types and large-scale
applications. Refer to section 2.6 for more details

Answer 8: Advantages of DBMS include data sharing, reduced data redundancy, data integrity
enforcement, centralised data security, efficient data management, and conflict resolution in multi-
user environments. Refer to section 3 for more details

Answer 9: Disadvantages of DBMS include complexity and cost of implementation, potential security
vulnerabilities, data integrity issues risks, backup and recovery processes complexity, and resource
consumption. Refer to section 3 for more details

Answer 10: Applications of Distributed Databases include E-commerce, handling large-scale


transactions and customer data across geographically dispersed locations to ensure scalability and
availability. Global Enterprises: Supporting multinational organisations with data distribution across

Unit: 2 - Database Systems 29


DCA1208: Database Management System

regions for local access, disaster recovery, and compliance with data privacy laws. Refer to section
2.2 for more details

Unit: 2 - Database Systems 30


DCA1208: Database Management System

9. REFERENCES
• Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.

• Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,


McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631

• Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,


McGraw-Hill Education, 2019, ISBN-10: 0073523321

Unit: 2 - Database Systems 31


BACHELOR OF COMPUTER
DCA1208: Database Management System

APPLICATION
SEMESTER 2

DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 3 – Data Models 1
DCA1208: Database Management System

Unit - 3
Data Models

DCA324
KNOWLEDGE MANAGEMENT
Unit: 3 – Data Models 2
DCA1208: Database Management System
TABLE OF CONTENTS

SL Fig No / Table SAQ /


Topic Page No
No / Graph Activity

1 Introduction - -
4-5
1.1 Objectives - -

2 Concept of Data Models - - 6

3 Types of Data Models 1 -


3.1 Relational Data Model 2 -
3.2 Hierarchical Model 3 -
3.3 Network Data Model 4 - 7-17
3.4 Object/Relational Model 5 -
3.5 Object-Oriented Model 6 -
3.6 Entity-Relationship Model 7 -

4 Overview of the Relational Model 8 - 18-23

5 Summary - - 24-25

6 Glossary - - 26-27

7 Self-Assessment Questions - 1 28-29

8 Terminal Questions - - 30

9 Answers - -
9.1 Self-Assessment Questions - - 31-33
9.2 Terminal Questions - -

10 References - - 34

Unit: 3 – Data Models 3


DCA1208: Database Management System

1. INTRODUCTION
The previous unit deals with the database systems crucial for managing and organising data. There
are several types of database systems, each serving specific needs. Relational Database Management
Systems (RDBMS) store data in tables with rows and columns, facilitating easy querying and data
integrity.

We also understood how Distributed Databases spread data across multiple locations, enhancing
availability and reliability. Object-oriented databases store data as objects, supporting complex data
types and relationships. Network Databases utilise a graph structure to model relationships, allowing
for more flexible data associations. Centralised Database Systems store all data in a single location,
simplifying management but posing risks of a single point of failure. NoSQL Database Systems cater
to unstructured data and large-scale applications, offering high scalability and flexibility.

We also saw the advantages of a Database Management System, including improved data consistency,
security, and efficient data management. However, DBMS also have disadvantages, such as complexity,
cost, and the need for regular maintenance. Understanding the strengths and limitations of each
database system type helps select the right one for specific organisational needs.

This unit focuses on a data model, which is an integrated collection of concepts for describing data,
identifying relationships between data, and determining constraints on the data. A data model allows
us to treat a database as an abstract machine. In other words, we can concentrate on the principles
of design divorced from an immediate concern with implementation. Data models constitute formal
languages for defining data structures, declaring integrity, and manipulating data. A data model is a
mechanism for specifying the schema of some database.

We will also see how the data models establish the principles underlying DBMS. Every database, and
indeed every DBMS, must adhere to the principles of some data model. However, the term data model
is somewhat ambiguous. In the database literature, the term is used in several different senses, two
of which are the most important: that of architecture for data and that of an integrated set of data
requirements.

Unit: 3 – Data Models 4


DCA1208: Database Management System

1.1. Objectives
After completing Unit 3, you should be able to:
• Define a Data Model.
• List the different types of Data Model.
• Differentiate between Hierarchical Model vs.
Network Model.
• Outline the overview of the Relational model.

Unit: 3 – Data Models 5


DCA1208: Database Management System

2. CONCEPT OF DATA MODEL


The term data model refers to a set of general principles for handling data. Here, people talk of the
relational data model, the hierarchical data model, or the object-oriented data model.

This set of principles that define a data model may be divided into three major parts:
• Data definition – a set of principles concerned with data structure.
• Data manipulation – a set of principles concerned with how data isoperated upon.
• Data integrity – a set of principles concerned with determining which states are valid for a
database.

Data definition involves defining an organisation for data, a set of templates for the organisation of
data. Data manipulation concerns how the data is accessed and changed in the database. Data
integrity is very much linked with data manipulation in the sense that integrity concerns valid
changes and invalid changes todata.

Unit: 3 – Data Models 6


DCA1208: Database Management System

3. TYPES OF DATA MODEL


In Database Management Systems (DBMS), a data model defines the logical structure of a database,
including the relationships and constraints that determine how data can be stored, organised, and
manipulated.

We may make a distinction between three generations of architectural data models:


• Primitive data models: Primitive data models represent objects using basic record structures
organised within file structures. This approach primarily supports simple operations, such as
reading and writing records, and does not inherently provide advanced features for data
manipulation or complex relationships. Data is typically stored sequentially, and interactions
with the data are limited to basic operations, making this model suitable for straightforward
data storage and retrieval scenarios.
• Classic data models: Classic data models include hierarchical, network, and relational models,
each representing a different approach to organising and managing data. The hierarchical data
model builds on the primitive model by introducing a tree-like structure where records are
arranged in parent-child relationships. The network model extends this further by allowing
more complex, many-to-many relationships between records. The relational data model marks
a significant departure from these earlier models by organising data into tables (relations) and
using set theory principles, offering a more flexible and powerful framework for data
management and querying.
• Semantic data models: Semantic data models address limitations in classic data models, such
as the relational model, which often maintains a rigid, record-oriented structure. They aim to
represent data in a way that captures its meaning and relationships more effectively. By focusing
on the semantics of data—its meaning and context—, these models provide a richer framework
for understanding and interacting with data, supporting more complex queries and enabling
better alignment with real-world concepts and relationships.

Unit: 3 – Data Models 7


DCA1208: Database Management System

We can classify Data model types as follows:

Data Model Types

Relational Data Network Data Object/Relational Object-Oriented


Hierarchical Model
Model Model Model Model

Fig1. Types of Data Models

3.1. Relational Data Model


Today, the relational model is the primary data model for commercial data-processing applications.
It has attained this position because of its simplicity, which eases the programmer's job compared to
earlier data models such as the network or hierarchical models.

A database based on the relational model developed by E.F. Codd allows the definition of data
structures, storage and retrieval operations, and integrity constraints. The data and their relations
are organised in tables in such a database. A table is a collection of records containing the same fields.

G
Primary
key

Tuples
(Rows)

Columns
(Attribute)

Fig 2. Relational data model

Unit: 3 – Data Models 8


DCA1208: Database Management System

Each row in the relation represents a student, indicating that the row in the table describes a real-
world item. The columns of the table represent the attributes associated with the entity. For this
scenario, the information required includes the student's roll number, CGPA, and name.

This database consists of various components based on the relational model. These include:
• Relation: A two-dimensional table is used to store a collection of data elements.
• Tuple: Row of the relation, depicting a real-world entity.
• Attribute/Field: Column of the relation depicting properties that define the relation.
• Attribute Domain: A set of pre-defined atomic values that an attribute can take, i.e., it describes
the legal values that an attribute can take.
• Degree: It is the total number of attributes present in the relation.

Properties of Relational Tables:


• Values Are Atomic: In a relational table, each value in a column is atomic, meaning it is
indivisible. This property ensures that each cell in the table contains a single value, not a set or
list of values. Atomic values simplify data manipulation and querying.
• Each Row is Unique: Each row in a relational table must be unique. This is often enforced using
a primary key, a unique identifier for each row. Ensuring row uniqueness prevents duplicate
records and maintains data integrity.
• Column Values Are of the Same Kind: All values in a single column must be of the same data
type or kind. For example, if a column is designated for storing dates, all entries must be dates.
This consistency allows for meaningful data operations and comparisons within the column.
• The Sequence of Columns is Insignificant: The order in which columns appear in a relational
table does not affect the data or its interpretation. Operations on the table, such as queries, are
independent of column order, allowing flexibility in data organisation and presentation.
• The Sequence of Rows is Insignificant: Similarly, the order of rows in a relational table does
not matter. Rows can be retrieved, inserted, or deleted without regard to their sequence. This
property allows for efficient data management and retrieval based on criteria rather than
position.
• Each Column has a unique Name: Each column in a relational table must have a unique name.
This uniqueness ensures that each attribute can be distinctly identified and referenced in
queries and data manipulation operations. Column names serve as clear and specific labels for
the data they contain.

Unit: 3 – Data Models 9


DCA1208: Database Management System

3.2. Hierarchical Model


The hierarchical data model organises data in a tree structure. There is a hierarchy of parent and child
data segments. This structure implies that a record can have repeating information, generally in the
child data segments.Data is in a series of records, which has a set of field values attached to it. Itcollects
a specific record's instances together as a record type. These record types are the equivalent of tables
in the relational model, with the individual records being the equivalent of rows. The hierarchical
model uses Parent-Child Relationships to create links between these record types. These are a 1:N
mapping between record types. For example, an organisation might store information about an
employee, such as name, employee number, department, and salary. The organisation might also
store information about an employee's children, such as name and date of birth. The employee and
children data form a hierarchy, where the employee data represents the parent segment, and the
children data represents the child segment. If an employee has three children, then there would be
three child segments associated with one employee segment. In a hierarchical database, the parent-
child relationship is one too many. This restricts a child segment to having only one parent segment.
Hierarchical DBMSs were popular from the late 1960s, with the introduction of IBM's Information
Management System (IMS) DBMS through the 1970s.

Fig3. Hierarchical data model

In the diagram shown, electronics are the root node, with two subordinate nodes, televisions and
portable electronics. These two individuals have more children for whom they serve as parents. For
instance, television is represented by several types, such as tube, LCD, and plasma, and these three
types serve as the parents of television. It exhibits a one-to-many relationship.

Unit: 3 – Data Models 10


DCA1208: Database Management System

3.3. Network Data Model


The popularity of the network data model coincided with the popularity of thehierarchical data model.
Some data were more naturally modelled with morethan one parent per child. So, the network model
permitted the modelling of many-to-many relationships in data. In 1971, the Conference on Data
Systems Languages (CODASYL) formally defined the network model. The primary data modelling
construct in the network model is the set construct. A set consists of an owner record type, a set name,
and a member record type. A member record type can have that role in more than one set; hence, the
multi-parent concept is supported. An owner record type can also be a member or owner in another
set. The data model is a simple network; link and intersection record types may exist and be set
between them. Thus, the complete network of relationships is represented by several pair-wise sets;
in each set, some (one) record type is the owner (at the tail of the network arrow), and one or more
record types are members (at the head of the relationship arrow). Usually, a set defines a 1:M
relationship, although 1:1 is permitted. The CODASYL network model is based on mathematical set
theory.

Fig4. Network data model

Let's consider creating the network model for the student database. The Subject entity shows a
relationship with both the student and Degree entities. There is a connection between the Subject
entity and both the Student and Degree entities. The Subject entity has two parents, while the
remaining two entities have one child entity.

Unit: 3 – Data Models 11


DCA1208: Database Management System

3.4. Object- Relational Model


Object-relational database management systems (ORDBMSs) add new object storage capabilities to
the relational systems at the core of modern information systems. These new facilities integrate the
management of traditional fielded data, complex objects such as time series and geospatial data, and
diverse binary media such as audio, video, images, and applets.By encapsulating methods with data
structures, the ORDBMS server can execute complex analytical and data manipulation operations to
search for and transform multimedia and other complex objects.

As an evolutionary technology, the object/relational (OR) approach has inherited its relational
ancestor's robust transaction and performance-management features and the flexibility of its object-
oriented cousin.

Database designers can work with familiar tabular structures and data definition languages (DDLs)
while assimilating new object-management possibilities. Query and procedural languages and call
interfaces in ORDBMSs are familiar: SQL3, vendor procedural languages, ODBC, JDBC, and proprietary
call interfaces are all extensions of RDBMS languages and interfaces. The leading vendors are well-
known: IBM, Informix, and Oracle.

Let us consider an example:

Fig 5. Object- Relational Model

Unit: 3 – Data Models 12


DCA1208: Database Management System

The Figure represents an example of the Object-Relational Model (ORM), which blends the relational
database approach with object-oriented principles. It illustrates three interconnected tables:
"Customers," "Addresses," and "BankAccounts." The Customers table contains primary information
about customers (e.g., ID, surname, and first name). The Addresses table associates’ customers with
multiple locations using a foreign key (CID) that links to the ID in the Customers table, showcasing a
one-to-many relationship. Similarly, the BankAccounts table links multiple accounts to a single
customer using the same foreign key (CID). This structure effectively models complex real-world
entities, demonstrating how ORM encapsulates related data into hierarchical relationships, enabling
efficient data management and retrieval.

3.5. Object-Oriented Model


Object DBMSs add database functionality to object programming languages. They bring much more
than persistent storage of programming language objects. Object DBMSs extend the semantics of the
C++, Smalltalk, and Java object programming languages to provide full-featured database
programming capability while retaining native language compatibility. A significant benefit of this
approach is the application and database development unification into a seamless data model and
language environment. As a result, applications require less code, use morenatural data modelling,
and code bases are easier to maintain. Object developers can write complete database applications
with a modest amountof additional effort.

The object-oriented database (OODB) paradigm combines object-oriented programming language


(OOPL) systems with persistent systems. OODB's power comes from the seamless treatment of both
persistent data, as found in databases, and transient data, as found in executing programs.

In contrast to a relational DBMS, where a complex data structure must be flattened out to fit into
tables or joined together from those tables to form thein-memory structure, object DBMSs have no
performance overhead to store or retrieve a web or hierarchy of interrelated objects. This one-to-one
mapping of object programming language objects to database objects has two benefits over other
storage approaches: it provides higher performance management of objects, and it enables better
management of the complex interrelationships between objects. This makes object DBMSs better
suited to support applications such as financial portfolio risk analysis systems, telecommunications
service applications, WWW (World Wide Web) document structures, design and manufacturing
systems, and hospital patient record systems, which have complex relationships between data.

Unit: 3 – Data Models 13


DCA1208: Database Management System

Fig 6. Object-Oriented Model

As you can see, student and Department are two distinct objects. Every individual has unique qualities
and techniques. A relationship between the objects is established via their shared attribute Dept_no.

Components of the Object-Oriented Data Model, namely objects, classes, object attributes, class
hierarchy, etc., are explained as follows-
• Object—It is a physical or real-world entity. The object can also be called the ER model entity.
The object defines a single instance of an entity, which is known as an 'instance of a class. '
• Object attribute- The objects contain specific attributes. These are referred to as the object's
attributes.
• Object method- The object's behaviour is shown using object methods.
• Class- Certain characteristics are present in the objects. These characteristics are referred to as
the object's attributes.
• Inheritance—It is the ability of an object to inherit the attributes and methods of the classes
above it within the class hierarchy. A new class can be derived from an existing class. The new
class has the attributes and methods described in the existing class and its own attributes and
methods. This helps in code reusability.
• Encapsulation refers to the bundling of data (attributes) and the methods (operations or
functions) that manipulate the data into a single unit, known as an object. In this model, data is
not accessed directly; instead, interactions with the data occur through well-defined interfaces,
typically in the form of methods.

Unit: 3 – Data Models 14


DCA1208: Database Management System

This encapsulation principle ensures that the object's internal representation is hidden from the
outside world, providing a layer of abstraction that safeguards the integrity of the data. For instance,
if an object represents a bank account, users can interact with the account balance only through
specific methods like deposit or withdrawal, preventing unauthorised or inappropriate access to the
underlying data.

3.6. Entity-Relationship Model


The Entity-Relationship (ER) Model is a fundamental tool in database design. It allows designers to
create a clear, organised, and precise conceptual blueprint of a database, ensuring that the final
implementation meets the users' requirements and supports the intended applications. The ER
model's visual and intuitive nature makes it an essential part of the database development process,
aiding in both design and communication.

Attribute Attribute
Relations
hip

Attribute Entity A R Entity B Attribute

Attribute Attribute

Fig 7. E-R Model

An ER diagram in DBMS defines entities, their corresponding properties, and their interconnections.
This helps represent the database's logical structure as shown above.

Key Features:
• Entities represent objects or concepts: Entities are fundamental units in a database that
represent real-world objects or concepts. Each entity is a distinct object or concept that can be
uniquely identified.

For example, In a student database, Students, courses, and professors could each be represented as
entities.

In a retail database, Customers, products, and orders might each be entities.

Unit: 3 – Data Models 15


DCA1208: Database Management System

Entities typically correspond to tables in a relational database, where each row represents a specific
instance of the entity, and the columns represent the entity’s attributes.

Relationships describe associations between entities. Attributes describe properties of entities.

Relationships Describe Associations Between Entities


Relationships illustrate how entities are connected to each other. They represent the logical
associations and interactions between different entities in a database.

For example, A relationship might exist between students and courses in a student database,
indicating which students are enrolled in which courses.

In a retail database, A relationship could link customers to orders, showing which customers made
which purchases.

Relationships can be classified into different types:


• One-to-One (1:1): Each instance of one entity is associated with a single instance of another
entity. For example, each employee has one unique ID card.
• One-to-Many (1): An instance of one entity can be associated with multiple instances of another
entity. For example, one professor can teach many courses.
• Many-to-Many (M): Multiple instances of one entity can be associated with multiple instances of
another entity. For example, students can enrol in many courses, each with many students.

Attributes Describe Properties of Entities:


Attributes are the data points or properties that describe the characteristics of entities. Each attribute
corresponds to a column in a table, and it holds specific data values for each instance (row) of the
entity.

For example, In a student entity, Attributes might include StudentID, Name, DateOfBirth, and Major.

Attributes in a product entity could include ProductID, ProductName, Price, and StockQuantity.
Attributes provide detailed information about each entity, making it possible to store and retrieve
specific data values. They are essential for defining a table’s schema and ensuring each entity instance
is fully described.

Bringing It All Together: In a well-designed database, entities, relationships, and attributes work
together to form a coherent data model. This model represents the structure of the data, its meaning,

Unit: 3 – Data Models 16


DCA1208: Database Management System

and how different pieces of data relate to each other. By clearly defining these components, databases
can efficiently store, manage, and retrieve large amounts of structured information.

Unit: 3 – Data Models 17


DCA1208: Database Management System

4. OVERVIEW OF RELATIONAL MODEL


Today, the relational model is the primary data model for commercial data-processing applications.
It has attained this position because of its simplicity, which eases the programmer's job compared to
earlier data models such as the network or hierarchical models.

A database based on the relational model developed by E.F. Codd allows the definition of data
structures, storage and retrieval operations, and integrity constraints. The data and their relations
are organised in tables in such a database. A table is a collection of records; each record in a table
contains the same fields.

Properties of Relational Tables:


• Values Are Atomic: In a relational table, the values in each column are atomic, meaning they
are indivisible. Each value in a column is a single, discrete piece of data, which helps maintain
data integrity and simplifies data manipulation. This atomicity ensures that data is consistent
and easily retrieved and managed.
• Each Row is Unique: Every row in a relational table is unique. This uniqueness is often enforced
by using primary keys, unique identifiers for each row. Ensuring that each row is unique
prevents data duplication and helps maintain the integrity of the database by ensuring that each
record can be distinctly identified.
• Column Values Are of the Same Kind: All values in a column are of the same data type and
represent the same kind of data. For example, a column for storing dates will only contain date
values. This consistency in data types across columns ensures that the data is meaningful and
can be correctly processed by database operations.
• The Sequence of Columns is Insignificant: The order in which columns are arranged in a
relational table is insignificant. The data remains the same regardless of the column order. This
property allows flexibility in how data is viewed and queried, as the logical structure of the table
is independent of the physical arrangement of columns.
• The Sequence of Rows is Insignificant: The order of rows in a relational table does not matter.
Rows can be retrieved in any order based on queries, and their sequence in storage does not
affect the data's meaning. This property enables the database management system to store and
retrieve data efficiently without worrying about maintaining a specific order.

Unit: 3 – Data Models 18


DCA1208: Database Management System

• Each Column has a Unique Name: Each column in a relational table must have a unique name
to distinguish it from other columns within the same table. Unique column names help
accurately identify and reference data elements within queries and operations, ensuring clear
and unambiguous access to the data.

The relational data model represents the database as a group of related tables. The relational data
model was introduced in 1970. It iscurrently the most popular model. The mathematical simplicity
and ease of visualisation of the relational data model have contributed to its success. The relational
data model is based on the mathematics of set theory, whosebasic components are the following.

Relation: A two-dimensional table. A relation is a collection of tuples containing values for a fixed
number of attributes. Relations are sometimes called at-files because they resemble an unstructured
sequence of records. Each tuple in a relation must be unique; there can be no duplicates.

A Relation may be defined in multiple ways. The Relation Schema R, denoted by R (A1, A2, An),
comprises a relation name R and is a list of attributes A1, A2, An.

For Example - CUSTOMER (Cust-id, Cust-name, Address, Phone#). Here, CUSTOMER is a relation
defined over the four attributes Cust-id, Cust-name, Address, and Phone#, each of which has a
domain or a set of valid values.

A tuple is an ordered set of values derived from an appropriate domain. Each row in the CUSTOMER
table may be referred to as a tuple and consists of four values.

<1759, "Rama Krishna", "101 Main 3rd Cross Manipal", "0820-2653487">is a tuple belonging to
the CUSTOMER relation.

A relation may be regarded as a set of tuples (rows). Columns in a table are also called attributes of
the relation.

A domain has a logical definition; for example, “India_Phone_numbers” is a set of 10-digit mobile
numbers. A domain may have a data type or a format defined for it. The India_Phone_numbers may
have a format: (ddd)- ddddddd where each d is a decimal digit. For example, dates can be in various
formats such as a month, name, date, year, or yyyy-mm-dd, or dd, mm,yyyy, etc.

An attribute designates the role played by the domain. E.g., the domainDate may be used to define
attributes “Invoice-date” and “Payment-date”. The relation is formed over the cartesian product of

Unit: 3 – Data Models 19


DCA1208: Database Management System

the sets; each set has values from a domain, and that domain is used in a specific role, which is
conveyed by the attribute name. For example, attribute Cust-name is defined over the domain of
strings of 25 characters. The role these strings play in the CUSTOMER relation is that of the name of
the customers.

Given R(A1, A2,….., An)

r(R) Ì dom (A1) X dom (A2) X X dom(An)

R is a relation schema, and r(R) is a specific "value" or population of R. Here, R is also called the
intention of a relation, and r is also called the extension of a relation.

Let S1 and S2 be domains, S1 = {0,1} and S2 = {a,b,c}. The Relation R canbe written as

R Ì S1 X S2

r(R) = {<0,a> , <0,b> , <1,c> }

is one possible “state,” “population”, or “extension” r(R), defined over domains S1 and S2. It has three
tuples.

Characteristics of Relations
• Ordering of tuples in a relation r(R): The tuples are not considered to be ordered, even though
they appear in the tabular form.
• Ordering of attributes in a relation schema R (and of values within each tuple): We will
consider the attributes in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn> to be ordered.
• Values in a tuple: All values are considered atomic (indivisible). A special null value represents
unknown or inapplicable values to certain tuples.

Relational Integrity Constraints


Constraints are conditions that must hold on t o all valid relation instances.There are three main
types of constraints:
– Key constraints
– Entity integrity constraints
– Referential integrity constraints

Unit: 3 – Data Models 20


DCA1208: Database Management System

Key Constraints
Superkey of R: A set of attributes SK of R such that no two tuples in any valid relation instance r(R)
will have the same value for SK. That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK].

Key of R: A "minimal" superkey; that is, a superkey K such that removal of any attribute from K results
in a set of attributes that is not a superkey.

For Example, Consider a CAR relation schema:


CAR(State, Reg#, SerialNo, Make, Model, Year)

It has two keys Key1 = {State, Reg#}, Key2 = {SerialNo}, which are also superkeys. {SerialNo,
Make} is a superkey but not a key.

If a relation has several candidate keys, one is chosen arbitrarily as theprimary key. The primary
key attributes are underlined.

Entity Integrity
Relational Database Schema: A set S of relation schemas that belong tothe same database. S is the
name of the database.

S = {R1, R2, ..., Rn}

Entity Integrity: The primary key attributes PK of each relation schemaR in S cannot have null
values in any tuple of r(R). This is because primary key values are used to identify the individual tuples.

t[PK] ¹ null for any tuple t in r(R)

Referential Integrity
A constraint involving two relations (the previous constraints involve a single relation). Used to
specify a relationship among tuples in two relations, the referencing relation and the referenced
relation.

Tuples in the referencing relation R1 have attributes FK (foreign key attributes) that reference the
primary key attributes PK of the referenced relation R2. A tuple t1 in R1 is said to reference a tuple t2
in R2 if t1[FK] = t2[PK].

A referential integrity constraint can be displayed as a directed arc from R1 in a relational database
schema.FK to R2.

Unit: 3 – Data Models 21


DCA1208: Database Management System

Referential Integrity Constraints


The value in the foreign key column (or columns) FK of the referencing relation R1 can be either a
value of an existing primary key value of the corresponding primary key PK in the referenced relation
R2 or a null.

Attribute: A table column. Other commonly used terms for attributes are property and field. The set
of permissible values for each attribute is called the domain for that attribute.

Tuple: A table row. A tuple is an instance of an entity or relationship or whatever is represented by


the relation.

Key: A single attribute or combination of attributes whose values uniquely identify the tuples of the
relation. Each row has a different value for the key attribute(s). The relational model requires that
every relation have a key and that:
• No two tuples may have the same key value and
• Every tuple must have a value for the key attribute (the key fields have non-null values).

There are two restrictions on the relational model that are sometimes circumvented in practice:
• Duplicate tuples are not permitted. If two tuples are entered with the same value for every
attribute, they are considered the same tuple.
• No ordering of tuples within a relation is assumed. In practice, however, one method of ordering
tuples is often used.

One of the main advantages of the relational model is that it is conceptually simple and, more
importantly, based on the mathematical theory of relation. It also frees the users from details of the
storage structure and access methods. The relational model, like all other models, consists of three
basic components:
• a set of domains
• a set of relations operation on relations
• integrity rules

In general, we say that a relation defined over n domains has a degree n or is n-ary. The elements of
this set are n-tuples. We shall distinguish between the definition of a relation and the relation itself.
We shall say that the definition of a relation gives a name to the relation and specifies the components
over which it is defined. These components are referred to as relation attributes or attributes for

Unit: 3 – Data Models 22


DCA1208: Database Management System

short. An attribute has a domain associated with it from which it takes on values. The relation itself,
on the other hand, is the set of tuples which constitute it at a given instance of time.

For example, a statement that says that a relation supplier is built over attributes S#, P#, and SCITY
having domains integer and character string, respectively, is the definition of the relation supplier.
The relation itself is shown below. It must be noted at the time the definition of a relation is just given,
a relation with no tuples in it, i.e. a null relation, is just given, a relationwith no tuples in it, i.e. a null
relation, is created.

Supplier

S# P# SCITY

10 1 BANGALORE

10 2 BANGALORE

10 3 BANGALORE

11 1 BOMBAY

11 2 BOMBAY

Fig.8 Supplier Table

Unit: 3 – Data Models 23


DCA1208: Database Management System

5. SUMMARY
Let us summarise what we have studied in this unit:
Data models are fundamental frameworks used to structure and organise data within databases. They
provide a systematic approach to defining how data is stored, accessed, and manipulated. These
models serve as blueprints for constructing databases, guiding the database schema's design and
influencing the database system's integrity, performance, and scalability. Data models ensure that
data is logically organised and that relationships among data elements are clearly defined, facilitating
efficient data management and retrieval.

There are several types of data models, each with unique characteristics and applications. The
primary types include the Relational Data Model, Hierarchical Model, Network Data Model,
Object/Relational Model, Object-Oriented Model, and Entity-Relationship Model. Each model offers a
different perspective on how data can be structured and interconnected, catering to various use cases
and requirements. Understanding these models is essential for selecting the appropriate database
design that aligns with specific business needs and data processing demands.

The Relational Data Model, proposed by E.F. Codd, is the most widely used in database management
systems. It organises data into tables (relations) consisting of rows (tuples) and columns (attributes).
Each table represents a specific entity; relationships between tables are established through foreign
keys. This model emphasises data integrity and supports powerful querying capabilities using
Structured Query Language (SQL). Its simplicity, flexibility, and support for normalisation make it a
popular choice for various applications.

The Hierarchical Model arranges data in a tree-like structure with parent-child relationships, where
each child can have only one parent. This model efficiently represents one-to-many relationships but
can be rigid and complex for many-to-many relationships. On the other hand, the Network Data Model
allows more complex relationships by organising data as a graph with nodes and edges. It supports
many-to-many relationships and is more flexible than the hierarchical model but can be more
challenging to design and maintain.

The Object/Relational Model extends the relational model by incorporating object-oriented concepts,
allowing complex data types and object identity. The Object-Oriented Model treats data as objects,
like object-oriented programming, enabling inheritance, encapsulation, and polymorphism. This

Unit: 3 – Data Models 24


DCA1208: Database Management System

model is suitable for applications requiring complex data representations and behaviours. The Entity-
Relationship (ER) Model visually represents data entities and their relationships, using ER diagrams
to illustrate the database structure. It provides a high-level conceptual view, aiding database design
and stakeholder communication.

Last is the relational model, which is a cornerstone of modern database systems, emphasising data
organisation in tables with well-defined relationships. It relies on principles such as normalisation to
reduce data redundancy and ensure data integrity. SQL is the primary language used to interact with
relational databases, enabling efficient data manipulation, querying, and transaction management.
The model's strength lies in its simplicity, robustness, and scalability, making it suitable for a wide
range of applications, from small-scale systems to large enterprise databases.

Unit: 3 – Data Models 25


DCA1208: Database Management System

6. GLOSSARY

A data model defines the structure of a database and determines how data
is stored, organised, and manipulated. It includes the data types,
Data Model - relationships, and constraints that apply to the data. Examples include

hierarchical, network, relational, and object-oriented models.

The relational model organises data into tables (or relations) consisting
of rows and columns. Each table represents a different entity;

Relational Model - relationships between tables are established through foreign keys. This
model emphasises data integrity and allows for powerful query
capabilities using SQL.

CODASYL (Conference on Data Systems Languages) developed the


network data model, which organises data using a graph structure with

CODASYL - nodes (records) and edges (relationships). It supports complex many-to-


many relationships. CODASYL's approach influenced early database
systems and standards for data management.

ORDBMS (Object- An ORDBMS combines the features of relational and object-oriented


Relational databases. It supports complex data types, inheritance, and encapsulation,
Database - allowing for the storage and manipulation of objects as well as traditional
Management relational data. Examples include PostgreSQL and Oracle.
System)
An object-oriented database stores data as objects, as used in object-

OODB (Object- oriented programming. It supports features like inheritance,

Oriented - polymorphism, and encapsulation. This model is well-suited for


Database) applications that require complex data representations, such as CAD/CAM
systems and multimedia databases.

ER Model (Entity- The ER model is a conceptual framework for representing data


Relationship - relationships within a database. It uses entities (objects), attributes
Model) (properties), and relationships to map the structure of data. This model is

Unit: 3 – Data Models 26


DCA1208: Database Management System

often used in database design to create a visual schema before


implementation.

Integrity constraints are rules that ensure the accuracy and consistency of

Integrity data within a database. They include primary keys, foreign keys, unique
- constraints, and check constraints. These constraints help maintain data
Constraints
quality and enforce business rules.

An attribute is a property or characteristic of an entity in a database. It

Attribute - corresponds to a column in a relational database table. Attributes define


the data elements stored within a record, such as name, age, or address.

A tuple is a single row or record in a relational database table. It

Tuple - represents a single instance of an entity containing values for each


attribute. Tuples are fundamental units of data in the relational model.

Unit: 3 – Data Models 27


DCA1208: Database Management System

7. SELF-ASSESSMENT QUESTIONS

SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 ________ is a set of principles concerned with determining whichstates are valid for a
database.
2 In the Primitive data models approach, objects are represented by ___________structures
grouped in file structures.
3 In the __________data model, the database is represented as a group of related tables.
4 Each tuple in a relation must be__________________; that is, there can be no duplicates.
5 A tuple is an ________________set of values.
6 In relational integrity constraints, there are main types of constraints.
7 If a relation has several candidate keys, one is chosen arbitrarily to be the ________________key.
8 The set of permissible values for each attribute is called the ______________for that attribute.
9 A single attribute or combination of attributes whose values uniquely identify the ____________of
the relation.
10 The hierarchical data model organises data in a ________________ structure.
11 In a hierarchical database, the parent-child relationship is _______________.
12 The popularity of the network data model coincided with the popularity of the data model.
13 Object DBMSs add database functionality to ___________________programming languages.
14 A major benefit of the Object-Oriented data model approach is the _____________of the
application.
15 Which of the following is a property of relational tables?
a) Values are non-atomic
b) Each row is unique
c) The sequence of rows is significant
d) Columns can have duplicate names
16 Who developed the relational model for databases?
a) Charles Bachman
b) Edgar F. Codd
c) Peter Chen

Unit: 3 – Data Models 28


DCA1208: Database Management System

d) Larry Ellison
17 In a relational database, what term describes a table?
a) Schema
b) Attribute
c) Relation
d) Domain
18 What is the primary key used in a relational table?
a) To store null values
b) To enforce referential integrity
c) To uniquely identify each tuple
d) To define the data type of each column
19 Which of the following is a characteristic of a superkey in a relational database?
a) It allows duplicate values for key attributes.
b) It ensures no two tuples have the same value for the key attributes.
c) It permits null values for primary key attributes.
d) It does not guarantee the unique identification of tuples.

Unit: 3 – Data Models 29


DCA1208: Database Management System

8. TERMINAL QUESTIONS
1. Define a Data Model.
2. Classify the principles of the Data model.
3. List the types of data Model.
4. Explain the generations of architectural data models.
5. What is a Relational Model?
6. Compare & contrast the Hierarchical Model vs. the Network Model.
7. Outline the properties of the Object/Relational database management system.
8. Explain the Entity-relationship model.
9. Describe a relational model.
10. Explain relational schema
11. List the characteristics of relations.
12. What are Relational Integrity Constraints?
13. Differentiate between Entity Integrity vs. Referential Integrity.
14. Define a Tuple.
15. Distinguish between an attribute and a Key.

Unit: 3 – Data Models 30


DCA1208: Database Management System

9. ANSWERS

9.1. Self-Assessment Answers


1. Data integrity
2. Record
3. Relational
4. Unique
5. Ordered
6. Three
7. Primary
8. Domain
9. Tuples
10. Tree
11. one to many
12. hierarchical
13. object
14. unification
15. b) Each row is unique
16. b) Edgar F. Codd
17. c) Relation
18. c) To uniquely identify each tuple
19. b) It ensures no two tuples have the same value for the key attributes.

9.2. Terminal Answers


Answer 1: A data model is a conceptual framework that defines the structure, organisation, and data
manipulation in a database. It includes the data types, relationships, and constraints. Data models
help design databases and ensure data consistency. Refer to section 2 for more details.

Answer 2: The principles of data models include data abstraction, data independence, and data
integrity. Data abstraction simplifies database design, data independence ensures changes do not
affect applications, and data integrity maintains accuracy and consistency. Refer to section 2 for
more details.

Unit: 3 – Data Models 31


DCA1208: Database Management System

Answer 3: Data models include hierarchical, network, relational, object-oriented, and entity-
relationship (ER) models. Each type has unique characteristics and is suited to different database
design requirements. Refer to section 3 for more details.

Answer 4: The generations of architectural data models evolved from file-based systems to
hierarchical and network models, followed by the relational model and, more recently, object-
oriented and NoSQL databases. Each generation addresses the limitations of the previous ones. Refer
to section 3 for more details.

Answer 5: The relational model organises data into tables, or relations, with rows and columns. It
uses keys to establish relationships between tables and allows for powerful data querying and SQL
manipulation. It emphasises data integrity and simplicity. Refer to section 3.1 for more details.

Answer 6: The hierarchical model organises data in a tree-like structure with parent-child
relationships, while the network model uses a graph structure with nodes and interconnected
relationships. Unlike the hierarchical model, the network model supports many-to-many
relationships. Refer to section 3.2 for more details.

Answer 7: An ORDBMS supports complex data types, inheritance, and encapsulation, blending
features of relational and object-oriented databases. It allows for storing and manipulating objects
and provides robust querying capabilities through SQL. Refer to section 3.4 for more details.

Answer 8: The ER model represents data using entities, attributes, and relationships. Entities are
objects, attributes are properties of entities, and relationships connect entities. It is used to design
database schemas conceptually before physical implementation. Refer to section 3.6 for more
details.

Answer 9: The relational model structures data into tables with rows and columns, using keys to
establish relationships. It ensures data integrity through constraints and allows for flexible querying
using SQL. It is widely used for its simplicity and robustness. Refer to section 4 for more details.

Answer 10: A relational schema defines the structure of a relational database, including tables,
columns, and the relationships between them. It specifies each column's data types and constraints
and is a blueprint for organising data. Refer to section 3.6 for more details.

Unit: 3 – Data Models 32


DCA1208: Database Management System

Answer 11: Relations have unique table names, distinct rows (tuples), distinct column names
(attributes), atomic attribute values, and unordered rows and columns. These characteristics ensure
data consistency and integrity in a relational database. Refer to section 4 for more details.

Answer 112: Relational integrity constraints ensure data accuracy and consistency in a relational
database. They include primary keys, foreign keys, unique constraints, and check constraints, helping
to enforce business rules and maintain data quality. Refer to section 4 for more details.

Answer 13: Entity integrity ensures that each table has a unique primary key and that no part of the
primary key is null. Referential integrity ensures that foreign keys correctly reference primary keys
in related tables, maintaining consistent relationships between tables. Refer to section 3.6 for more
details.

Answer 14: A tuple is a row or record in a relational database table representing an instance of an
entity. It contains specific values for each attribute defined in the table schema, forming a complete
set of related data. Refer to section 3.6 for more details.

Answer 15: An attribute is a column in a database table representing a data field, such as name or
age. A key is a specific attribute or set of attributes used to uniquely identify rows in a table, such as
primary and foreign keys. Refer to section 3.6 for more details.

Unit: 3 – Data Models 33


DCA1208: Database Management System

10. REFERENCES

1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.

2. Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,


McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631

3. Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,


McGraw-Hill Education, 2019, ISBN-10: 0073523321

Unit: 3 – Data Models 34


BACHELOR OF COMPUTER
DCA1208: Database Management System

APPLICATION
SEMESTER 2

DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 4 – Introduction to the ER Model 1
DCA1208: Database Management System

Unit - 4
Introduction to the ER Model

DCA324
KNOWLEDGE MANAGEMENT
Unit: 4 – Introduction to the ER Model 2
DCA1208: Database Management System
TABLE OF CONTENTS

SL Fig No / Table SAQ /


Topic Page No
No / Graph Activity

1 Introduction - -
4-5
1.1 Objectives - -
Concept of Entity-Relationship Model 1, 2, 3, 4, 5, 6,
2 7, 8, 9, 10, 11, -
12, 13, 14
2.1 Symbols Used in ER Model - -
2.2 Entities and Attributes 6-23
- -
2.3 Relationships - -
2.4 Keys - -

3 Summary - - 24-25

4 Glossary - - 26

5 Self-Assessment Questions - 1 27-31

6 Terminal Questions - - 32

7 Answers - -
7.1 Self-Assessment Questions - - 33-37
7.2 Terminal Questions - -

8 References - - 38

Unit: 4 – Introduction to the ER Model 3


DCA1208: Database Management System

1. INTRODUCTION
The previous unit focused on Data models, which are frameworks for structuring and organising data.
They define how data is stored, organised, and manipulated. There are several types of data models,
each with unique characteristics. First, we saw the Relational Data Model, which Organises data into
tables (relations) with rows and columns. Each row represents a record, and each column represents
an attribute. It is widely used due to its simplicity and flexibility. Second is the Hierarchical Model:
This model structured the data in a tree-like format, where each child node has a single parent. It is
efficient for hierarchical data but lacks flexibility for more complex relationships. Third is the
Network Data Model, which is similar to the hierarchical model but allows multiple parent nodes to
form a graph structure. It is more flexible than the hierarchical model but can be complex to manage.

Fourth, we studied the Object/Relational Model, which extends the relational model by integrating
object-oriented concepts, allowing the storage of complex data types and structures. Fifth is the
object-oriented model, which represents data as objects, like object-oriented programming. This
model is beneficial for applications requiring complex data representations. Sixth was the Entity-
Relationship Model, which uses entities, attributes, and relationships to represent data. It is primarily
used for database design and conceptual data modelling. An overview of the relational model
emphasises using tables to represent data, focusing on data integrity and using keys to establish
relationships between tables.

In this unit, we will learn that a relational database is a collection of relations with distinct relation
names. The relational database schema is the collection of schemas for the relations in the database.
For example, a university database contains schemas for relations called Students, Faculty, Courses,
Rooms, Enrolled Numbers, Teachers, etc.

We will learn that an instance of a relational database is a collection of relation instances, one per
relation schema in the database schema; of course, each relation instance must satisfy the domain
constraints in its schema.

The entity-relationship model is a tool for analysing an application's semantic features independent
of events. This approach includes a graphical notation depicting entity classes as rectangles,
relationships as diamonds, and attributes as circles or ovals. For a complex situation, a partial entity-

Unit: 4 – Introduction to the ER Model 4


DCA1208: Database Management System

relationship diagram may be used to present a summary of the entities and relationships that do not
include the details of the attributes.

Last, we will outline how the entity-relationship diagram provides a convenient method for
visualisingthe interrelationships among entities in each application and how this tool has proven
helpful in transitioning from an information application description to a formal database schema.

1.1. Objectives
After completing this Unit, you should be able to:
• Outline the Entity-Relationship Model
• Define an Entity.
• Explain an Attribute.
• Describe a Relationship.
• Classify the different types of Keys

Unit: 4 – Introduction to the ER Model 5


DCA1208: Database Management System

2. CONCEPT OF ENTITY-RELATIONSHIP MODEL


The entity-relationship diagram provides a convenient method for visualisingthe interrelationships
among entities in each application. This tool has proven helpful in transitioning from an information
application description to a formal database schema. The entity-relationship model describes an
enterprise's conceptual schema without paying attention to the efficiency of the physical database
design. The entity-relationship diagrams are then turned into a logical schema in which the database
is implemented.

The Entity-Relationship (ER) Model is a conceptual framework for data modelling and database
design. It provides a high-level view of the data structure by representing entities, objects or concepts
that can have data stored about them and the relationships between them. Each entity is
characterised by attributes that describe its properties, while relationships illustrate how entities
interact.

The basic terms that are used for describing important entity-relationship concepts are as follows:

2.1. Symbols Used in the ER Model


The Entity-Relationship (ER) model uses diverse symbols to visually depict the configuration of a
database, including its entities, properties, and relationships. The following are the primary symbols
utilised in an Entity-Relationship (ER) model:

Unit: 4 – Introduction to the ER Model 6


DCA1208: Database Management System

Fig.1 Symbols representation in the ER model

• Rectangles: Rectangles represent Entities in the ER Model.


• Ellipses: Ellipses represent Attributes in the ER Model.
• Diamond: Diamonds represent Relationships among Entities.
• Lines: Lines represent attributes to entities and entity sets with other relationship types.
• Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
• Double Rectangle: Double Rectangle represents a Weak Entity.

2.2. Entities and Attributes


Entity: In the Entity-Relationship (ER) Model, an entity represents a distinct object or concept that
exists and can be uniquely identified within the database domain. Entities can be physical objects,
such as people and cars, or abstract concepts, such as departments or courses.

For example, A student, an employee, a bank a/c, etc., are all entities.

Unit: 4 – Introduction to the ER Model 7


DCA1208: Database Management System

Entity Name EMPLOYEE

Primary Key attribute ID

ENAME

Non- primary Key attribute SALARY

DEPT

Fig.2 Entity representation

In a Database Management System (DBMS), an entity has several key characteristics that define its
role and attributes within the database. These characteristics include:

1. Uniqueness: Each entity must be distinguishable from others within the database. This
uniqueness is typically achieved using a primary key, uniquely identifying each entity instance.
2. Attributes: Entities are described by attributes, which are properties or characteristics that
provide more details about the entity. For example, a "Person" entity might have attributes such
as Name, Age, and Address.
3. Entity Type: An entity is classified into an entity type or entity class, which groups similar
entities with shared attributes. For example, "Employee" or "Customer" are entities that group
entities with common attributes relevant to their respective domains.
4. Relationships: Entities can participate in relationships with other entities, which define how
entities interact with one another. Relationships establish the associations and dependencies
between different entities within the database.
5. Instance: An entity type includes multiple instances. Each instance represents a specific
occurrence of the entity type, such as a particular employee or customer, distinguished by
unique values for the entity's attributes.
6. Integrity Constraints: Entities are subject to integrity constraints to ensure data accuracy and
consistency. Constraints like primary keys, foreign keys, and unique constraints enforce rules
on how data is stored and related.
7. Identification: Each entity must be identifiable through a unique identifier or key. This key
ensures that each entity instance can be referenced and retrieved accurately from the database.

Unit: 4 – Introduction to the ER Model 8


DCA1208: Database Management System

Entity Types in DBMS:


In a Database Management System (DBMS), entity types define the categories or classes of objects
stored in a database. Each entity type represents a collection of similar entities with shared attributes.
Here are some common entity types in DBMS:

Entity Types

Strong Entity Weak Entity Derived Entity


Types Types Types

Fig 3. Types of Entities

Strong Entity Types: Also known as regular entities, these have a unique identifier or primary key
that distinguishes each instance of the entity type. They do not rely on other entities for their identity.

For example, a customer is a strong entity with a unique customer ID.

Weak Entity Types: These are entities that do not have a unique identifier of their own and rely on
a strong entity (also known as an owner entity) for identification. Weak entities are identified by a
combination of their attributes and the primary key of the strong entity they depend on.

For example, OrderItem May rely on the Order entity to uniquely identify individual items within an
order.

Derived Entity Types: These are entities whose attributes can be derived from other entities. For
example, Age Could be derived from a Person entity's Date of Birth attribute.

A simple example of a Strong and weak entity: As seen below, we have a customer as the strong entity
and address as the weak entity, with respective attributes such as Name, Customer_id, Age, Gender,
Town, Locality, and State.

Unit: 4 – Introduction to the ER Model 9


DCA1208: Database Management System

Customer id Locality

Name Age Town State

Customer has Address

Strongly Entity
Type
Gend Name Week
er Entity Type

Fig 4. Strong and Weak Entity

Entity Set:
An entity set is a collection of entities of the same type. It is essentially a grouping of all instances of a
particular entity type that share common attributes. Each entity in the set has its own unique values
for these attributes, but all entities within the set have the same attribute structure.

Key terminologies used in entity sets:


1. Attributes: Attributes are the properties or characteristics of an entity that describe the
associated data. They define the details or features stored about each instance of the entity. For
example, in a "Person" entity, attributes might include Name, Date of Birth, and Address.
2. Entity Type: An entity type, also known as an entity class, is a category or class of entities that
share the same set of attributes. It defines the structure and the common properties of the
entities that belong to it. For instance, "Employee" or "Product" are entity types where each type
groups similar entities with shared attributes.
3. Entity Instance: An entity instance refers to a specific occurrence of an entity type. Each
instance represents an individual object or record within the entity type. For example, "John
Doe" might be an instance of the "Employee" entity type, with a unique set of values for
attributes like Employee ID and Name.
4. Primary Key: A primary key is a unique identifier assigned to each entity instance within an
entity type. It ensures that each entity instance can be uniquely distinguished from others in the
entity set. For instance, an Employee ID could serve as a primary key for the "Employee" entity
type, ensuring that each employee is uniquely identifiable.

Unit: 4 – Introduction to the ER Model 10


DCA1208: Database Management System

Attributes: In a Database Management System (DBMS), an attribute is a property or characteristic of


an entity. An entity is any object or thing in the database about which data is stored. Attributes provide
details about the entity and form the columns of a table in a relational database. Each attribute has a
name and a data type, defining the data it can hold.

Let us see the key Points About Attributes:


1. Properties of Entities: Attributes describe the properties or characteristics of entities. For
example, in a school database, entities might include Students, Teachers, and Courses. Attributes
for the Student entity might include StudentID, Name, Age, and Address.
2. Column in a Table: In a relational database, an attribute corresponds to a column in a table. Each
row in the table represents a single instance of the entity, and the columns represent its attributes.
3. Data Type: Each attribute has a specific data type that defines what kind of data it can store, such
as integers, floating-point numbers, characters, strings, or dates.
4. Domain: The domain of an attribute is the set of possible values it can take. For instance, an
attribute Gender might have a domain consisting of the values {Male, Female, Other}.

Attributes describe the properties of entities and relationships.


Types of attributes:
a) Simple attributes: A simple attribute is the smallestsemantic unit of data, which is atomic (no
internal structure). An attribute that cannot be further subdivided into components is a simple
attribute.
Examples: a student's roll number, an employee's ID number, gender, and many more.

name

roll_no class

Student

Fig.5 Simple attributes

b) Composite attribute: A composite attribute can be divided into smaller sub-parts, each
representing a more basic attribute with independent meanings.

Unit: 4 – Introduction to the ER Model 11


DCA1208: Database Management System

Example: The address can be further split into house number, street number, city, state,
country, and pin code; the name can also be split into first name, middle name, and last name.

city State Street

name

roll_no address

Student

Fig.6 Composite attributes

c) Single attributes: Single attributes have a single value for a particular entity.

The attribute which takes up only a single value for each entity instance is a single-valued attribute.

Example: The age of a student, Aadhar card number.

name

roll_no DOB

Student

Fig.7 Single attributes

d) Multi-Valued Attribute: Multivalued attributes have multiple values of an attribute for a


particular entity, e.g., degrees or courses that a student can have or take.
The attribute which takes up more than a single value for each entity instance is a multi-
valued attribute. A double oval shape represents it.

Example: Phone number of a student: Landline and mobile.

Unit: 4 – Introduction to the ER Model 12


DCA1208: Database Management System

roll_no name DOB Phone_no

Student

Fig.8 Multi-Valued Attribute

e) Derived Attribute: A derived attribute's value can be derived from other attributes or entities.
Derived attributes are attributes that can be derived from other attributes. They are
represented by a dotted oval shape.
Example: Total and average marks of a student and employee age derived from date of birth.

roll_no name DOB Age

Student

Fig.9 Derived Attribute

f) Domain: The conceptual definition of attributes is a named set of scalar values, all of the same
type, and a pool of possible values.

Example: The gender attribute might have a domain of {Male, Female, Other}.

2.3. Relationships: A relationship is a connection between entities in a database.

Relationships help define how entities interact with each other, enabling the database to model real-
world scenarios effectively. For instance, consider the relationship between PERSONS and
AUTOMOBILES. This relationship could be described as "OWNS," indicating that automobiles are
owned by people.

Unit: 4 – Introduction to the ER Model 13


DCA1208: Database Management System

Examples of Relationships

Example: PERSONS and AUTOMOBILES


• Entities: PERSONS, AUTOMOBILES
• Relationship: OWNS

In this case, each person (an instance of the PERSONS entity) can own multiple automobiles
(instances of the AUTOMOBILES entity).

One person owns each automobile. This can be visualised in the database schema as:

Person ID Name

1 Raj

2 Vincent

Table.1 Relationship

Here, OwnerID in the AUTOMOBILES table is a foreign key referencing PersonID in the PERSONS table,
establishing the "OWNS" relationship.

In DBMS, relationships can be classified based on the number of entities involved. Binary and ternary
relationships are two such types:

Relationships

Binary Ternary

Unit: 4 – Introduction to the ER Model 14


DCA1208: Database Management System

Fig.10 Types of relationships

Binary Relationships: Binary relationships involve two entities. They are the most common type of
relationship in relational databases and describe how two entities are related to each other.

Types of Binary relationships are as follows:


• One-to-One
• One-to-Many
• Many-to-Many

Ternary Relationships: A ternary relationship involves three entities. It is used when a binary
relationship cannot capture the complexity of the relationship between entities. A ternary
relationship connects three entities simultaneously and is typically represented using an associative
entity or relationship in an Entity-Relationship (ER) diagram.

Example: Consider an order system where the entities are Customer, Product, and Store. A ternary
relationship is needed if we want to track which customer bought which product from which store.
In this case, the relationship involves three entities, which cannot be fully expressed using binary
relationships.

Is-a Hierarchies
Is-a hierarchies are a special type of relationship that allows for attribute inheritance. This is
fundamental in object-oriented modelling and can also be applied in database design. In an Is-a
hierarchy, a subclass inherits attributes and relationships from its superclass.

Example: Automobiles and Trucks


• Superclass: Automobile
• Subclass: Truck

In this scenario, An Automobile entity might have attributes such as Model and serial number. A Truck
is a specific type of Automobile, which means it inherits the Model and serial number attributes from
the Automobile entity.

This can be represented as:

Unit: 4 – Introduction to the ER Model 15


DCA1208: Database Management System

Table.2 Is-a Hierarchies

In the above example:


The truck inherits the Model and serial number from the Automobile.

TruckID references AutomobileID to establish the is-a relationship.

Key Points of Is-a Hierarchies:


1. Inheritance: Attributes and relationships defined in the superclass are automatically available
to the subclass.
2. Specialisation: The subclass can have additional attributes or relationships specific to itself.
3. Efficiency: This structure reduces redundancy by allowing shared attributes to be defined once
in the superclass.

2.4. Keys:
In a Database Management System (DBMS), keys are fundamental to ensuring data integrity and
enabling efficient data retrieval. Keys uniquely identify records in a table, establish relationships
between tables, and enforce various constraints. Here are the main types of keys used in DBMS. A key
is an identifier that uniquely differentiates one entity instance from all others in the entity

Importance of Keys in DBMS


• Data Integrity: Keys ensure each record is unique and identifiable, preventing duplicate records.
• Relationships: Foreign keys establish relationships between tables, enabling complex queries
and data integrity across tables.
• Efficient Data Retrieval: Indexes on keys speed up data retrieval operations, making queries
more efficient.
• Constraint Enforcement: Keys enforce various constraints (e.g., uniqueness and referential
integrity) to maintain the data's correctness and reliability.

Unit: 4 – Introduction to the ER Model 16


DCA1208: Database Management System

Types of Keys

Primary Key Candidate Key Super Key Foreign Keys Composite Key

Fig.11 Types of Keys

1. Primary Key
A primary key is a unique identifier for a record within a table. Every table is limited to having a single
primary key, which is not allowed to have null values. It guarantees that every record is distinct and
can be identified independently.

• Example: StudentID can be a primary key in a Student table.

Student ID Name Age

1 Raj 21

2 Vincent 22

Table 3. Primary Key


2. Candidate Key
A candidate key is a set of attributes that can uniquely identify a record. A table can have multiple
candidate keys, but only one can be chosen as the primary key.

• Example: In an Employee table, both EmployeeID and Email can be candidate keys.

Employee ID Email Name

1011 [email protected] Raj

1021 [email protected] Vincent

Table 4. Candidate Key

Unit: 4 – Introduction to the ER Model 17


DCA1208: Database Management System

3. Super Key
A super key is a set of one or more attributes that can uniquely identify a record in a table. It includes
candidate keys and any other combination of attributes that uniquely identifies records.

• Example: In a Customer table, both {CustomerID} and {CustomerID, Email} can be super keys.

CustomerID Email Name

1 [email protected] Raj

2 [email protected] Vincent

Table 5. Super Key


4. Foreign Key
A foreign key is an attribute or set of attributes in one table that references the primary key of another
table. It establishes a relationship between the two tables and enforces referential integrity.

• Example: In an Order table, CustomerID can be a foreign key referencing the CustomerID in
the Customer table.

Table 6. Foreign Key

5. Composite Key
A composite key is a primary key composed of two or more attributes. It is used when a single
attribute is not sufficient to uniquely identify a record.

• Example: In an Enrollment table, a combination of StudentID and CourseID can be a composite


key.

Unit: 4 – Introduction to the ER Model 18


DCA1208: Database Management System

Table 7. Composite Key

Let’s Discuss a few Examples:

STUDENT

student_i first_name last_nam email dob cours


d e e_id

1 Ankita Sharma [email protected] 2000- 101


m 01-01

2 John Smith [email protected] 2001- 102


02-02

3 Smita Brown [email protected] 2002- 103


03-03

Table 8. Student Table


• Primary Key: student_id
• Candidate Keys: student_id, email
• Composite Key (if needed): student_id and course_id

Customer Table

customer_id first_name last_name Email phone address

1 John Dev [email protected] 123-456- Mumbai


7890

2 Jane Smith [email protected] 234-567- New Delhi


8901

3 Smita Brown [email protected] 345-678- Mumbai


9012

Unit: 4 – Introduction to the ER Model 19


DCA1208: Database Management System

Table 9. Customer Table

• Primary Key: customer_id


• Candidate Keys: customer_id, email, phone
• Super Key: Any combination of columns that can uniquely identify a record, such as customer_id
or customer_id combined with email.

If you have another table, like orders, the customer_id from the customer table could serve as a
foreign key in the orders table to establish a relationship between customers and their orders.

Orders Table

order_id customer_id product_name quantity order_date

101 1 TV 1 2024-01-02

201 2 Smartphone 2 2024-02-02

301 1 Laptop 1 2024-03-02

Table 10. orders Table

Foreign Key: customer_id, which Links to the customer_id in the customer table, establishing a
relationship between customers and their orders.

An Entity-Relationship (ER) schema is a conceptual framework for modelling the data structure of a
database. It uses ER diagrams to visually represent entities, their attributes, and the relationships
between them. An ER schema may identify certain constraints to which the data content must
conform.

Cardinality Specifies the number of instances of one entity that can or must be associated with each
instance of another entity.

Two of the most important types of constraints are:


1. The mapping cardinality of a relationship indicates the number of instances in entity E1 that can
or must be associated with instances in entity E2:

Unit: 4 – Introduction to the ER Model 20


DCA1208: Database Management System

a) One-One Relationship. For each entity instance in one entity, there is at most one
associated entity instance in the other entity.

The Characteristics are of follows:


a. Each row in Table A is linked to one and only one row in Table B.
b. Each row in Table B is linked to one and only one row in Table A.

For example, each person has one unique passport, and each passport is assigned to one person.

Tables:
• Person: PersonID, Name
• Passport: PassportID, PersonID, PassportNumber

Cardinality Notation in ER Diagrams:

1 1
Person Has Passport

Fig.12 One-One Relationship

b) Many-One Relationships. One entity instance in entity E2 is associated with zero or more
entity instances in entity E1, but each entity instance in E1 is associated with at most one entity
instance inE2.
• The Characteristics are as follows:
a. Each row in Table B can be linked to multiple rows in Table A.
b. Each row in Table A is linked to one and only one row in Table B.
For example, many employees work in one department, but each employee works in only
one department.

Tables:
• Employee: EmployeeID, Name, DepartmentID
• Department: DepartmentID, DepartmentName

Unit: 4 – Introduction to the ER Model 21


DCA1208: Database Management System

Cardinality Notation in ER Diagrams:

m 1
Employees Hav Department
e

Fig.13 Many-One Relationships

c) Many-Many Relationships There are no restrictions on how many entity instances in either
entity are associated with a single entity instance in the other.

The Characteristics are as follows:


• Each row in Table A can be linked to multiple rows in Table B.
• Each row in Table B can be linked to multiple rows in Table A.

An example of a many-to-many relationship would be students taking classes. Each student takes
many classes, and each class has many students.

Cardinality Notation in ER Diagrams:

m n
Class Has Students

Fig.14 Many-Many Relationships

Mapping cardinality is derived from cardinality constraints. The cardinality constraint between two
entities E1 and E2, denoted by (m,n), specifies that an instance in E1 appears in E2 at least m andat
most n times. Mapping cardinality takes the maximum number of cardinality constraints for each
entity in a relationship.

2. Existence dependence:
Existence dependence in a Database Management System (DBMS) refers to a situation where one
entity's existence is dependent on another entity's existence. In other words, an entity cannot exist
without being associated with another entity. This concept is often represented through weak
entities in an Entity-Relationship (ER) model.

Unit: 4 – Introduction to the ER Model 22


DCA1208: Database Management System

Key Characteristics of Existence Dependence

1. Weak Entities:
o A weak entity is one that does not have a sufficient unique identifier on its own. Instead, it
relies on a "strong" or "owner" entity to provide a unique context.
o The primary key of a weak entity is typically a combination of its own partial key and the
primary key of the strong entity it depends on.
2. Owner Entity:
o The strong entity that provides the primary key for the weak entity is known as the owner
entity. The weak entity cannot exist without its owner entity.
3. Identifying Relationship:
o The relationship between the weak entity and the strong entity is called an "identifying
relationship." This relationship helps uniquely identify the weak entity.
o In ER diagrams, ◦ The identifying relationship is often represented by a double line
connecting the weak entity to the owner entity.
4. Partial Key:
o A weak entity typically has a partial key (also called a discriminator) that uniquely identifies
its instances in conjunction with the primary key of the owner entity.

Why Existence Dependence Matters


o Data Integrity: Ensures that weak entities are properly linked to strong entities, maintaining
data integrity and consistency.
o Normalisation: Helps normalise the database design by properly defining entities and their
dependencies, reducing redundancy.
o Design Clarity: Clarifies how entities are related and depend on each other, making the
database design easier to understand and manage.
o Referential Integrity: Enforces referential integrity by ensuring that weak entities cannot exist
without their corresponding strong entities.

Unit: 4 – Introduction to the ER Model 23


DCA1208: Database Management System

3. SUMMARY
Let us summarise what we have studied in this unit:
• The Entity-Relationship (ER) model is a fundamental framework in database design that
helps conceptualise and structure data by representing entities, their attributes, and their
relationships. It provides a visual and conceptual way to model the data and its interactions
within a database system, facilitating a clear and organised approach to database design. The ER
model serves as a blueprint for creating a physical database schema, ensuring the data is
represented accurately and efficiently.
• Entities are the core components of the ER model, representing real-world objects or concepts
with distinct existence within the database. Each entity is characterised by its attributes, which
define its properties or characteristics. For example, in a university database, Student and
Course can be entities, with attributes like StudentID, Name, and CourseName describing their
respective properties. Entities can be simple or composite, and their attributes can include
primary keys uniquely identifying each entity instance.
• Relationships describe how entities interact with each other and establish connections
between them. The ER model depicts relationships as diamonds or lines connecting entities.
These relationships can be classified into various types based on their cardinality: one-to-one,
one-to-many, many-to-one, and many-to-many. For instance, a Student might be associated with
multiple Courses through an Enrolls relationship, while each course can have multiple students.
Relationships help understand how data is interrelated and facilitate efficient data retrieval and
manipulation.
• Keys are crucial elements in the ER model that ensure the uniqueness and integrity of data
within the database. The primary key uniquely identifies each record within an entity and
ensures that no two records are identical. In addition to primary keys, other types of keys
include candidate keys, which are potential primary keys, and foreign keys, which establish links
between related entities. Composite keys, consisting of multiple attributes, are used when a
single attribute is insufficient to identify a record uniquely. Keys are vital in maintaining data
accuracy and enabling effective data management.
• The Entity-Relationship model is an essential tool in database design that provides a structured
approach to representing data. Entities and attributes define the core components of the
database, relationships describe the interactions between entities, and keys ensure data

Unit: 4 – Introduction to the ER Model 24


DCA1208: Database Management System

integrity and uniqueness. Together, these elements form a comprehensive framework for
designing, implementing, and managing databases, ensuring that the data is well-organised,
accurate, and easily accessible.

Unit: 4 – Introduction to the ER Model 25


DCA1208: Database Management System

4. GLOSSARY

ER Model (Entity- A conceptual framework for data modelling that visually represents

Relationship - entities, their attributes, and their relationships. It is used to design and
Model) structure databases by defining the data and its interrelationships.

A database object or concept that can be distinctly identified and about


Entity - which data is stored. Examples include "Person," "Product," or "Course."

A property or characteristic of an entity that provides more detail about

Attribute - it. For instance, an "Employee" entity might have attributes like
"EmployeeID," "Name," and "HireDate."

A specific occurrence of an entity within a database. For example, a

Instance - particular employee named "John Doe" would be an instance of the


"Employee" entity.

DBMS (Database Software that facilitates database creation, management, and

Management - manipulation. It provides tools for storing, retrieving, and managing data
System) efficiently.

Specifies the kind of data that can be stored in an attribute, such as

Data Type - integers, strings, dates, or floating-point numbers. It determines the


nature of the data and its storage format.

An attribute or a set of attributes used to identify records in a table

Key - uniquely. Keys ensure that each record is distinct and can be accurately
retrieved or referenced.

Defines the number of instances of one entity that can be associated with

Cardinality - each instance of another entity. It specifies the maximum and minimum
number of occurrences in a relationship.

Unit: 4 – Introduction to the ER Model 26


DCA1208: Database Management System

5. SELF-ASSESSMENT QUESTIONS

SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 Which of the following is a primary purpose of the Entity-Relationship (ER) model?
A. To design the physical database schema
B. To provide a high-level conceptual framework for data modelling
C. To ensure data encryption
D. To manage database performance
2 What distinguishes a strong entity from a weak entity in the ER model?
A. Strong entities rely on other entities for identification, while weak entities do not.
B. Strong entities have a unique primary key, while weak entities do not have a unique
identifier on their own.
C. Strong entities can be abstract concepts, while weak entities are always physical objects.
D. Strong entities cannot participate in relationships, while weak entities can.
3 In an ER diagram, which term describes a collection of entities of the same type?
A. Entity Type
B. Entity Instance
C. Entity Set
D. Entity Attribute
4 What is an example of a derived attribute in the ER model?
A. Employee ID
B. Date of Birth
C. Age
D. Name
5 Which type of key uniquely identifies each instance of an entity within an entity type?
A. Foreign Key
B. Candidate Key
C. Primary Key
D. Composite Key
6 What does the term ‘relationship’ refer to in the ER model?

Unit: 4 – Introduction to the ER Model 27


DCA1208: Database Management System

A. A constraint that enforces data integrity


B. An association between entities that describes how they interact
C. A method for normalising database schemas
D. A specific attribute of an entity
7 Which of the following correctly describes a weak entity?
A. It has its own primary key and does not rely on any other entity.
B. It depends on a strong entity for its unique identification and lacks a primary key of its own.
C. It is always derived from another entity.
D. It must have a unique identifier that is independent of any other entity.
8 What is a ‘partial key’ in an ER model?
A. A unique key that combines multiple attributes to identify an entity.
B. A key uniquely identifies a weak entity in conjunction with a strong entity’s primary key.
C. A foreign key that refers to another entity.
D. A key used for defining relationships between entities.
9 Which term describes the set of possible values that an attribute can take?
A. Data Type
B. Domain
C. Instance
D. Entity Type
10 What role do attributes play in a database table?
A. They define the structure and relationships between tables.
B. They provide details about each entity and correspond to columns in a table.
C. They enforce constraints on the data stored in the database.
D. They represent the logical connections between different tables.
11 What is a simple attribute in the ER model?
A. An attribute that can be divided into smaller sub-parts
B. An attribute with multiple values for a particular entity
C. The smallest unit of data that cannot be subdivided
D. An attribute derived from other attributes
12 Which attribute type is represented by a double oval in an ER diagram?
A. Simple Attribute
B. Composite Attribute

Unit: 4 – Introduction to the ER Model 28


DCA1208: Database Management System

C. Single Attribute
D. Multi-Valued Attribute
13 How is a composite attribute different from a simple attribute?
A. A composite attribute cannot be further divided, while a simple attribute can be.
B. A composite attribute can be divided into smaller sub-parts, while a simple attribute cannot
be subdivided.
C. A composite attribute has only one value, while a simple attribute can have multiple values.
D. A composite attribute is always derived from other attributes.
14 Which of the following is an example of a derived attribute?
A. Employee ID
B. Phone Number
C. Total Marks
D. Address
15 What does a foreign key do in a database?
A. Uniquely identifies a record within its own table
B. Establishes a relationship between two tables
C. Enforces unique constraints on a table
D. Defines a composite key
16 Which key type is used to uniquely identify a record in a table and cannot be null?
A. Foreign Key
B. Composite Key
C. Primary Key
D. Candidate Key
17 What is a composite key in the context of database keys?
A. A key that is formed by a single attribute
B. A key that is used to identify records within its own table
C. A primary key composed of two or more attributes
D. A key used to reference another table's primary key
18 In an Is-a hierarchy, what does a subclass inherit from its superclass?
A. Only the attributes
B. Only the relationships
C. Both attributes and relationships

Unit: 4 – Introduction to the ER Model 29


DCA1208: Database Management System

D. None of the attributes or relationships


19 What does the cardinality notation (m,n) indicate in a relationship between two entities?
A. The minimum and maximum number of instances an entity can have.
B. The minimum and maximum number of times an instance in E1 can be associated with
instances in E2.
C. The number of attributes in an entity.
D. The exact number of instances required for each entity.
20 In a one-to-one relationship, how are the instances of the two entities related?
A. Each instance in one entity is associated with multiple instances in another.
B. Each instance in one entity is associated with exactly one instance in the other entity, and
vice versa.
C. Each instance in one entity can be linked to zero or more instances in the other entity.
D. Each instance in one entity can be linked to multiple instances in the other entity, but not
vice versa.
21 Which relationship type allows an instance in one entity to be associated with multiple
instances in another entity, but each instance in the latter is associated with at most one
instance in the former?
A. One-to-One Relationship
B. Many-to-Many Relationship
C. Many-to-One Relationship
D. One-to-Many Relationship
22 What characteristic defines a many-to-many relationship?
A. Each row in Table A can be linked to multiple rows in Table B, and each row in Table B can
be linked to multiple rows in Table A.
B. Each row in Table A is linked to only one row in Table B, and each row in Table B is linked to
multiple rows in Table A.
C. Each row in Table A can be linked to only one row in Table B, and each row in Table B is
linked to one row in Table A.
D. Each row in Table A is linked to exactly one row in Table B, and each row in Table B is linked
to exactly one row in Table A.
23 Which entity cannot exist without being associated with another entity?
A. Strong Entity

Unit: 4 – Introduction to the ER Model 30


DCA1208: Database Management System

B. Weak Entity
C. Derived Entity
D. Composite Entity
24 How is the identifying relationship between a weak entity and its owner entity represented in
an ER diagram?
A. By a single line
B. By a dashed line
C. By a double line
D. By a double oval
25 What role does the partial key play in a weak entity?
A. It uniquely identifies instances of the weak entity on its own.
B. It helps in linking the weak entity to the owner entity.
C. It defines the attributes of the weak entity.
D. It provides a unique identifier for the owner entity.

Unit: 4 – Introduction to the ER Model 31


DCA1208: Database Management System

6. TERMINAL QUESTIONS
1. Define an Entity.
2. What is an attribute?
3. List the characteristics of an Entity.
4. Classify the entity types.
5. Differentiate between a Strong entity vs. a Weak Entity.
6. Explain an Entity set.
7. How is an entity type different from an entity instance?
8. List the Characteristics of an attribute.
9. Classify the types of attributes.
10. What is a derived attribute?
11. Define a domain in DBMS.
12. Explain Is a hierarchies with an example.
13. Identify the importance of keys in DBMS
14. Classify the types of keys.
15. How is a foreign key different from a composite key?
16. Define cardinality.
17. Write a short note on types of cardinalities.
18. What is Existence Dependence?
19. List the Characteristics of Existence Dependence.
20. Why Existence Dependence Matters. Justify

Unit: 4 – Introduction to the ER Model 32


DCA1208: Database Management System

7. ANSWERS

7.1. Self -Self-Assessment Answers


1. B. To provide a high-level conceptual framework for data modelling
2. B. Strong entities have a unique primary key, while weak entities do not have a unique identifier
on their own.
3. C. Entity Set
4. C. Age
5. C. Primary Key
6. B. An association between entities that describes how they interact
7. B. It depends on a strong entity for its unique identification and lacks a primary key of its own.
8. B. A key that is used to uniquely identify a weak entity in conjunction with a strong entity’s
primary key.
9. B. Domain
10. B. They provide details about each entity and correspond to columns in a table.
11. C. The smallest unit of data that cannot be subdivided
12. D. Multi-Valued Attribute
13. B. A composite attribute can be divided into smaller sub-parts, while a simple attribute cannot
be subdivided.
14. C. Total Marks
15. B. Establishes a relationship between two tables
16. C. Primary Key
17. C. A primary key composed of two or more attributes
18. C. Both attributes and relationships
19. B. The minimum and maximum number of times an instance in E1 can be associated with
instances in E2.
20. B. Each instance in one entity is associated with exactly one instance in the other entity, and vice
versa.
21. C. Many-to-One Relationship
22. A. Each row in Table A can be linked to multiple rows in Table B, and each row in Table B can be
linked to multiple rows in Table A.
23. B. Weak Entity

Unit: 4 – Introduction to the ER Model 33


DCA1208: Database Management System

24. C. By a double line


25. B. It helps in linking the weak entity to the owner entity

7.2. Terminal Answers


Answer 1: An entity in a database is a distinct object or concept that can be uniquely identified. It
represents real-world objects, such as a person, place, or event, about which data can be stored. Refer
to section 2.1 for more details

Answer 2: An attribute is a property or characteristic of an entity that provides more information


about the entity. Attributes are used to describe the details of the entity, such as "Name," "Age," and
"Address" for a "Person" entity. Refer to section 2.1 for more details

Answer 3: The characteristics of an Entity.


• Uniqueness: Each entity must be distinguishable from others.
• Attributes: Entities have properties that describe their details.
• Relationships: Entities can participate in relationships with other entities.
• Integrity Constraints: Entities are subject to rules ensuring data accuracy. Refer to section 2.1
for more details

Answer 4: Entities can be classified into:


• Strong Entities: Have a unique identifier and do not depend on other entities.
• Weak Entities: Do not have a unique identifier and depend on strong entities for identification.
• Derived Entities: Attributes of these entities can be derived from other entities. Refer to section
2.1 for more details

Answer 5: A strong entity has a primary key that uniquely identifies each instance. In contrast, a weak
entity cannot be uniquely identified by its attributes alone and relies on a strong entity's primary key
for identification. For example, an "Order" (strong entity) can have multiple "OrderItems" (weak
entity) that depend on the "Order" for unique identification. Refer to section 2.1 for more details

Answer 6: An entity set is a collection of similar types of entities that share the same attributes. For
example, a set of all employees in a company forms an "Employee" entity set, where each employee
is an instance within the set. Refer to section 2.1 for more details

Answer 7: An entity type is a collection of entities that share the same attributes and define the
structure and properties of the entities. An entity instance is a specific, individual occurrence of an

Unit: 4 – Introduction to the ER Model 34


DCA1208: Database Management System

entity type. For instance, "Employee" is an entity type, while "John Doe" is an instance of the
"Employee" entity type. Refer to section 2.1 for more details

Answer 8: The Characteristics of an attribute.


• Name: Identifies the attribute.
• Data Type: Defines the kind of data stored (e.g., integer, string).
• Domain: The set of possible values an attribute can take.
• Atomicity: Whether the attribute is indivisible (simple) or composite.
• Multiplicity: Whether the attribute can have multiple values (multi-valued) or a single value.
Refer to section 2.1 for more details

Answer 9: The types of attributes.


• Simple Attributes: Indivisible, atomic attributes.
• Composite Attributes: Can be divided into smaller sub-parts.
• Single-Valued Attributes: Have a single value for each entity instance.
• Multi-Valued Attributes: Can have multiple values for each entity instance.
• Derived Attributes: Values can be derived from other attributes. Refer to section 2.1 for more
details

Answer 10: A derived attribute is an attribute whose value is calculated or derived from other
attributes. For example, "Age" can be derived from the "Date of Birth" attribute. Refer to section 2.1
for more details

Answer 11: A domain is the set of allowable values that an attribute can take. It defines the
permissible values for a given attribute, such as {Male, Female, Other} for a "Gender" attribute. Refer
to section 2.1 for more details

Answer 12: Is-a hierarchies represent inheritance relationships where a subclass inherits attributes
and relationships from a superclass. For example, in a database, a "Truck" is a subclass of
"Automobile" and inherits attributes like "Model" and "Serial Number" from the "Automobile"
superclass. Refer to section 2.2 for more details

Answer 13: Keys are crucial in DBMS for ensuring data integrity, establishing relationships between
tables, and enabling efficient data retrieval. They uniquely identify records, enforce uniqueness, and
maintain referential integrity across tables. Refer to section 2.3 for more details

Unit: 4 – Introduction to the ER Model 35


DCA1208: Database Management System

Answer 14: The types of keys.


• Primary Key: Uniquely identifies a record in a table.
• Candidate Key: A set of attributes that can uniquely identify a record; one is chosen as the
primary key.
• Super Key: A set of attributes, including the primary key, that can uniquely identify a record.
• Foreign Key: An attribute that creates a link between two tables.
• Composite Key: A primary key composed of multiple attributes. Refer to section 2.3 for more
details

Answer 15: A foreign key is an attribute or a set of attributes in one table referencing the primary
key in another table, establishing a relationship between the tables. A composite key is a primary key
composed of two or more attributes used to identify a record within the same table uniquely. Refer
to section 2.3 for more details

Answer 16: Cardinality specifies the number of instances of one entity that can or must be associated
with each instance of another entity in a relationship. It defines the minimum and maximum
occurrences of entities involved in a relationship. Refer to section 2.3 for more details

Answer 17: Types of cardinalities.


• One-to-One: Each instance of an entity relates to one instance of another entity.
• One-to-Many: One instance of an entity relates to multiple instances of another entity, but each
instance of the latter relates to only one instance of the former.
• Many-to-Many: Multiple instances of one entity relate to multiple instances of another entity
without restrictions. Refer to section 2.3 for more details

Answer 18: Existence dependence refers to a situation where one entity's existence relies on the
existence of another entity. This is typically seen in weak entities that cannot be uniquely identified
without a strong or owner entity. Refer to section 2.3 for more details

Answer 19: The Characteristics of Existence Dependence.


• Weak Entities: Cannot be uniquely identified by their attributes alone.
• Owner Entity: A strong entity that provides the primary key for the weak entity.
• Identifying Relationship: Connects the weak entity to the owner entity.
• Partial Key: A discriminator that uniquely identifies weak entity instances along with the
owner's primary key. Refer to section 2.3 for more details

Unit: 4 – Introduction to the ER Model 36


DCA1208: Database Management System

Answer 20: Existence dependence is important for maintaining data integrity, normalisation, and
referential integrity. It ensures that weak entities are properly linked to strong entities, reducing
redundancy and ensuring that dependent data is accurately represented and managed within the
database. Refer to section 2.3 for more details

Unit: 4 – Introduction to the ER Model 37


DCA1208: Database Management System

8. REFERENCES
1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.

2. Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,


McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631

3. Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,


McGraw-Hill Education, 2019, ISBN-10: 0073523321

Unit: 4 – Introduction to the ER Model 38


BACHELOR OF COMPUTER
DCA1208: Database Management System

APPLICATION
SEMESTER 2

DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 5 – ER Diagrams 1
DCA1208: Database Management System

Unit - 5
ER Diagrams

DCA324
KNOWLEDGE MANAGEMENT
Unit: 5 – ER Diagrams 2
DCA1208: Database Management System
TABLE OF CONTENTS

SL Fig No / Table SAQ /


Topic Page No
No / Graph Activity

1 Introduction - -
4-5
2.1 Objectives - -

2 The Entity-Relationship Model 1, 2 -


2.1 Components of ER Diagram - -
6-17
2.2 Symbols Used in the E-R Model - -
2.3 Some Examples of ER Diagrams - -
Converting ER Diagrams into Tables 3, 4, 5, 6, 7, 8,
3 - 18-25
9, 10, 11, 12
4 Summary - - 26-27

5 Glossary - - 28-29

6 Self-Assessment Question - 1 30-32

7 Terminal Questions - - 33

8 Answers - -
8.1 Self-Assessment Question - - 34-35
8.2 Terminal Questions - -

9 References - - 36

Unit: 5 – ER Diagrams 3
DCA1208: Database Management System

1. INTRODUCTION
In the previous unit, we studied the Entity-Relationship (ER) Model, exploring the fundamental
framework used in database design to represent data structures visually. We learnt how the ER model
helps define the data elements (entities) and their interrelationships, serving as a blueprint for
constructing databases. It simplifies the process of conceptualising and structuring data by using
diagrams to illustrate how different entities relate to one another. We also saw how this model is
especially useful during the initial stages of database design, as it provides a clear and organised way
to map out the data requirements before moving into more complex relational database schemas.

The ER model taught us that the primary components are crucial. Entities and Attributes refer to the
objects of interest in the database and their properties. Entities are usually nouns, such as "Student"
or "Course," while attributes provide additional details about these entities, like a student's name or
course credits. Relationships describe how entities interact with each other, indicating connections
like "Enrollment" between a student and a course. Keys are critical in the ER Model, with primary
keys uniquely identifying each entity instance, ensuring data integrity. Foreign keys are used to link
entities together, representing the relationships between them. Therefore, The ER Model provides a
clear, organised method for designing databases, ensuring the structure accurately reflects the real-
world scenario it intends to model.

In this unit, we will explore the foundational elements that make up an Entity-Relationship (ER)
diagram, which is a critical tool in database modelling. ER diagrams visually represent the structure
of a database by outlining entities, their attributes, and the relationships between them. We'll learn
the specifics of entities, objects or concepts that can be distinctly identified and their attributes, which
are properties or characteristics of these entities.

Moving on to The Entity-Relationship Model, this topic will cover the theoretical foundation of ER
diagrams, which includes the symbols used to represent entities, relationships, and attributes. We'll
learn about the standard notation for these components, such as rectangles for entities, diamonds for
relationships, and ovals for attributes. In the

section on Symbols Used in the E-R Model, we'll go deeper into the specific symbols and their
meanings, which are essential for correctly interpreting and creating ER diagrams. Following this, we

Unit: 5 – ER Diagrams 4
DCA1208: Database Management System

will study Some Examples of ER Diagrams to see how these concepts are applied in real-world
scenarios, providing practical insights into database design. Finally, the topic of Converting ER
Diagrams into Tables will teach us how to translate these visual representations into actual
database tables, a critical step in implementing a database design. This process involves mapping
entities to tables, attributes to columns, and relationships to foreign keys, ensuring that the database
structure accurately reflects the original ER diagram.

1.1. Objectives
By the end of the Unit 5, the learners should be able to
understand:
• Define the Entity-Relationship Model.
• Identify the Components of the ER Diagram.
• List the Symbols Used in the E-R Model.
• Explain how to Convert ER Diagrams into Tables

Unit: 5 – ER Diagrams 5
DCA1208: Database Management System

2. THE ENTITY-RELATIONSHIP MODEL


As we have discussed an introduction to the ER Model in the previous unit, let us quickly recap here.

The entity-relationship model is a tool for analysing an application's semantic features independent
of events. This approach includes a graphical notation depicting entity classes as rectangles,
relationships as diamonds, and attributes as circles or ovals. For a complex situation, a partial entity-
relationship diagram may present a summary of the entities and relationships that do not include the
details of the attributes.

The entity-relationship diagram provides a convenient method for visualisingthe interrelationships


among entities in a given application. This tool has proven helpful in transitioning from an
information application description to a formal database schema. The entity-relationship model is
used to describe the conceptual schema of an enterprise without

attention to the efficiency of the physical database design. The entity-relationship diagrams are then
turned into a logical schema in which the database is implemented.

2.1. Components of ER Diagram


1. Entity: An entity is a thing that exists and is distinguishable.
a) Entity instance. An instance is a particular occurrence of an entity. For example, each person
is an instance of an entity Person, each car is an instance of an entity Car, etc.
b) Entity class. A group of similar entities is called an entity class or entity type. An entity class
has common attributes.

2. Attributes: Attributes define a database's schema and are essential for ensuring data integrity,
retrieval, and manipulation within a DBMS. They describe the properties of entities and
relationships.
Types of Attributes in DBMS:
o Simple (Atomic) Attribute: An attribute that cannot be divided into smaller components.
Example: FirstName, Age, EmployeeID.
o Composite Attribute: An attribute that can be subdivided into smaller, meaningful
components, which themselves are attributes.
Example: Address might comprise Street, City, State, and ZipCode.

Unit: 5 – ER Diagrams 6
DCA1208: Database Management System

o Single-Valued Attribute: An attribute with a single value for each entity.


Example: SocialSecurityNumber, DateOfBirth.
o Multi-Valued Attribute: An attribute holding multiple values for each entity.
Example: Phone numbers for an employee might store multiple contact
numbers.
o Derived Attribute: An attribute whose value is calculated or derived from other attributes.
Example: Age could be derived from the DateOfBirth attribute.
o Domain: The conceptual definition of attributes is a named set of scalar values, all of the
same type, and a pool of possible values.
3. Relationships: A relationship is a connection between entities. For example, a relationship
between PERSONS and AUTOMOBILES could be an “OWNS” relationship. That is to say, people
own automobiles.
• Is-a hierarchies are a special type of relationship that allows attribute inheritance. For
example, to say that a truck is an automobile and an automobile has a model and serial
number implies that a truck also has a model and serial number.

4. Keys: A key is an identifier that uniquely differentiates one entity instance from all others in the
entity.
a) Primary Key: Identifier used to uniquely identify one particular instance of an entity.
A primary key
• can be one or more attributes (e.g., consider substituting asingle concatenated key
attribute for multiple attribute keys)
• must be unique within the domain (not just the current data set),
• its value should not change over time,
• must always have a value, and
• is created when no prominent attribute exists. Each instance hasa value.

b) Candidate Key: When multiple possible identifiers exist, each is a candidate key.
c) Concatenated Key: A key comprises parts that, when combined, become a unique
identifier. Multiple attribute keys are concatenated keys.
d) Borrowed Key Attributes: If an is-a relationship exists, the key of the more general entity
is also a key of the sub-entities. For example, if the serial number is a keyfor automobiles, it
would also be a key for trucks.

Unit: 5 – ER Diagrams 7
DCA1208: Database Management System

e) Foreign Keys. Foreign keys reference a related table through the primary key of that
related table.

An ER schema may identify certain constraints to which the data content must conform.

Two of the most important types of constraints are:


1. The mapping cardinality of a relationship indicates the number of instances in entity E1 that can
or must be associated with instances in entity E2:
a) One-One Relationship: For each entity instance in one entity, there is at most one
associated entity instance in the other. An example is a Person table linked to a passport
table, where each person has one unique passport, and each passport is assigned to one
person
b) Many-One Relationships: One entity instance in entity E2 is associated with zero or more
entity instances in entity E1. Still, at most, each entity instance in E1 is associated with one
entity instance inE2. For example, a woman may have many children, but a child has only
one birth mother.
c) Many-Many Relationships There are no restrictions on how many entity instances in
either entity are associated with a single entity instance in the other. An example of a many-
to-many relationship would be students taking classes. Each student takes many classes,
and each class has many students.
Mapping cardinality is derived from cardinality constraints. The cardinality constraint
between two entities E1 and E2, denoted by (m,n), specifies that an instance in E1 appears
in E2 at least m andat most n times. Mapping cardinality takes the maximum number of
cardinality constraints for each entity in a relationship.
2. Existence dependence: If the existence of an entity instance x depends on the existence of an
entity instance y, then x is said to be existence dependent on y. If y is deleted, so is x. For example,
loan_payment exists and is dependent on loan_number. If loan_number is deleted, so is
loan_payment.

Unit: 5 – ER Diagrams 8
DCA1208: Database Management System

2.2. Symbols Used in the ER Model


We will focus on the various graphical symbols fundamental to creating and understanding Entity-
Relationship (ER) diagrams. These symbols form the building blocks of ER models, allowing for a
transparent and standardised representation of the database structure.

Key symbols include:


• Entity: Represented by a rectangle, entities are objects or concepts within the domain of interest.
Each entity type has its rectangle; inside it, the entity's name is written.

• Weak Entity: Weak entities, which depend on another entity for their existence, are depicted
with double rectangles, and their identifying relationships are shown with double diamonds.

• Relationship: Relationships between entities are shown using diamonds. The diamond is
connected to the involved entities and describes the nature of the relationship.

• Identifying Relationship: This type of relationship, crucial for connecting weak entities to their
strong entities, is depicted using a double diamond. An identifying relationship indicates that
the weak entity's existence depends on the strong entity and that the weak entity's key is
partially or fully derived from the strong entity.

• Attributes: Attributes are properties or characteristics of entities and are depicted as ovals
connected to their respective entities. Variations include key attributes, which are underlined,
and multivalued attributes, which are depicted as double ovals.

• Key Attribute: A key attribute uniquely identifies each instance of an entity. It is underlined in
the diagram to distinguish it from other attributes.

Unit: 5 – ER Diagrams 9
DCA1208: Database Management System

• Multivalued Attribute: Represented by a double oval, a multivalued attribute indicates that an


entity can have multiple values for this attribute. For example, an entity "Person" might have a
multivalued attribute "Phone Numbers" if a person can have more than one phone number.

• Composite Attribute: A composite attribute consists of multiple components, which can be


broken down into more detailed attributes. It is depicted as an oval connected to smaller ovals,
each representing a component attribute.

• Derived Attribute: Represented by a dashed oval, a derived attribute is one whose value is
calculated or derived from other attributes.

• Cardinality: Cardinality is not always a separate symbol but is often represented alongside
relationships to indicate the nature of the relationship (e.g., one-to-one, one-to-many, or many-
to-many).

• Total Participation: Total participation indicates that every instance of an entity must
participate in a relationship. It is represented by a double line connecting the entity to the
relationship diamond.

• Structural Constraint: Structural constraints define the rules and limitations on the
relationships between entities, focusing on two main aspects: cardinality (which we’ve
discussed) and participation (total or partial). Cardinality constraints specify the maximum

Unit: 5 – ER Diagrams 10
DCA1208: Database Management System

number of times an entity can participate in a relationship, while participation constraints


indicate whether the participation is total or partial.

2.3. Some Examples of ER Model


Some Examples of ER Diagrams provide practical insights into how Entity-Relationship (ER)
models are used to design databases. These examples showcase how entities, attributes, relationships,
and constraints are organised to represent different scenarios or applications.

Here are a few illustrative examples:

1. Online Ticket & Event Management System

Fig 1. Online Ticket & Event Management System

Online Ticketing and Event Management Features

• User Registration: Allows users to create accounts and manage their profiles, and User
Authentication implements secure login mechanisms to authenticate users.
• Event Creation: This feature enables event organisers to create and manage events. It also
allows users to search for events based on criteria such as location, date, category, and keywords.

Unit: 5 – ER Diagrams 11
DCA1208: Database Management System

• Ticket Booking: This allows users to book tickets for events. Ticket Printing and Delivery also
provide users with options to print or receive tickets electronically.
• Payment Processing: Payments are processed securely for ticket bookings. Payment
Confirmation also confirms users' ticket purchases and payment status.
• Event Management: Provide tools for event organisers to manage event details, such as dates,
venues, and ticket prices.

Let us analyse the Entities and Attributes now:

1. Event: This table stores information about events.


• event_id: This is a unique identifier for each event, often an auto-incrementing integer value.
• name: The name of the event.
• description: A detailed event description, including its purpose, agenda, etc.
• date: The date on which the event occurs.
• time: The time of the event.
• location: The venue where the event occurs (a reference to the venue table using a foreign
key).
2. User: This table stores user information.
• user_id: This is a unique identifier for each user, often an auto-incrementing integer value.
• name: The user’s full name.
• email: A user's email address is used for login and communication purposes.
• password: The user’s password for secure login (needs to be hashed and stored securely).
• user_type: This specifies the type of user (e.g., organiser, attendee, administrator).
3. Venue: This table stores information about event venues.
• venue_id: This is a unique identifier for each venue, often an auto-incrementing integer value.
• name: The name of the venue.
• address: The full address of the venue.
• capacity: The maximum number of people the venue can hold.
• city: It is the name of the city where the venue is located.
• state: It is the name of the state where the venue is located.
• zip_code: The zip code of the venue’s location.
4. Ticket: This table stores information about individual event tickets.
• ticket_id: This is a unique identifier for each ticket, often an auto-incrementing integer value.

Unit: 5 – ER Diagrams 12
DCA1208: Database Management System

• event_id: The event this ticket is associated with (foreign key referencing the Event table).
• order_id: The order this ticket belongs to (foreign key referencing the Order table).
• price: The price paid for the ticket.
• type: It shows the type of ticket, for example, “General Admission” or “VIP.”
• seat_number: The seat number associated with this ticket (may be null for general
admission).
5. Order: This table stores information about user ticket orders.
• order_id: This is a unique identifier for each order, often an auto-incrementing integer value.
• user_id: The user who placed the order (foreign key referencing the User table).
• date: The date the order was placed.
• total_price: The total price of all tickets in the order.
• payment_id: The payment method used for the order (foreign key referencing the Payment
table).
6. Payment: This table stores information about payments made for orders.
• payment_id: This is a unique identifier for each payment, often an auto-incrementing integer
value.
• order_id: The order associated with the payment (foreign key referencing the Order table).
• payment_method: The method used for payment (e.g., “Credit Card”, “PayPal”).
• transaction_id: The unique identifier provided by the payment gateway for the transaction.
7. Speaker: This table stores information about speakers participating in events.
• speaker_id: This is a unique identifier for each speaker, often an auto-incrementing integer
value.
• name: The speaker’s full name.
• bio: A brief biography or description of the speaker.
• event_id: The event the speaker participates in (foreign key referencing the Event table).

Unit: 5 – ER Diagrams 13
DCA1208: Database Management System

2. ER diagram for an AIRLINE database schema.

This ER (Entity-Relationship) diagram represents a flight reservation system, showing how key
components such as airports, airplanes, flights, and reservations are interconnected. Let’s go step by
step to explain the entities, attributes, and relationships:

Entities and Attributes


1. Airport:
o Attributes include Airport_code, City, State, Name.
o Represents airports where flights depart and arrive.
o Linked to Flight_Leg through DEPARTURE_AIRPORT and ARRIVAL_AIRPORT.

2. Airplane_Type:
o Attributes include Type_name, Max_seats, Company.
o Represents a classification of aeroplanes based on capacity and manufacturer.

3. Airplane:
o Attributesinclude Airplane_id, Total_no_of_seats.
o Represents specific airplanes, each belonging to a particular Airplane_Type.

Unit: 5 – ER Diagrams 14
DCA1208: Database Management System

4. Flight:
o Attributes: Number, Airline, Weekdays.
o Represents an overarching flight, which may consist of multiple legs.

5. Flight_Leg:
o Attributes: Leg_no, Scheduled_dep_time, Scheduled_arr_time.
o Represents a segment of a flight, such as one portion of a journey from one airport to another.

6. Fare:
o Attributes: Code, Amount, Restrictions.
o Represents the cost and restrictions of tickets for flights.

7. Leg_Instance:
o Attributes: Date, No_of_avail_seats.
o Represents a specific instance of a flight leg on a given date.

8. Reservation:
o Attributes: Customer_name, Cphone.
o Represents the details of a booking made by a customer.

9. Seat:
o Attributes: Seat_no.
o Represents a specific seat assigned to a customer as part of their reservation.

Unit: 5 – ER Diagrams 15
DCA1208: Database Management System

3. ER Diagram of Hospital Management System

This ER Diagram represents a Hospital Management System and explains the relationships between
different entities in managing hospital operations, such as patients, doctors, nurses, rooms, bills, and
records. Let’s go step by step:

Fig 2. ER Diagram of Hospital Management System

Entities and Attributes


1. Patient:
o Attributes: P-ID (Patient ID), Name, DOB, Gender, Age, Mob-No.
o Represents patients in the hospital.

2. Doctor:
o Attributes: Dept (Department), Qualification.
o Represents doctors who consult patients.

3. Nurse:
o Inherits from Employee (has attributes E-ID, Name, Salary, Sex, Mob-No, Address, State, City,
Pin-no).
o Represents nurses who govern rooms and assist doctors.

4. Receptionist:

Unit: 5 – ER Diagrams 16
DCA1208: Database Management System

o Also inherits from Employee.


o Represents receptionists who maintain patient records.

5. Rooms:
o Attributes: R-ID (Room ID), Type, Capacity, Availability.
o Represents rooms available in the hospital, assigned to patients.

6. Bills:
o Attributes: B-ID (Bill ID), Amount.
o Represents bills generated for patients.

7. Test Report:
o Attributes: Result, Test-Type, P-ID (Patient ID), R-ID (Room ID).
o Represents diagnostic or medical test results for patients.

8. Records:
o Attributes: Record-no, App-no.
o Represents records maintained by receptionists for patients’ history, appointments, and
other hospital-related activities.

Unit: 5 – ER Diagrams 17
DCA1208: Database Management System

3. CONVERTING ER DIAGRAMS INTO TABLES

Each diagram component is systematically mapped to database tables and columns when converting
an ER diagram into a relational database schema.

Here’s how the mapping is typically done, step by step:

Mapping of Mapping of Binary


Mapping of Weak
Regular Entity 1:1 Relationship
Entity Types
Types Types

Mapping of Mapping of Binary Mapping of Binary


Multivalued M Relationship 1 Relationship
Attributes Types Types

Mapping of N-ary
Relationship
Types

Fig4. Steps in Converting ER Diagrams into Tables

Step 1: Mapping of Regular Entity Types


• Entities in the ER diagram are mapped to tables in the relational database.
• Each entity's attribute becomes a column in the table, with the key attribute becoming the
table's primary key.
• Example: If you have an entity Customer with attributes CustomerID, Name, and Address, you
would create a table named Customer with these columns, and CustomerID would be the
primary key.

Step 2: Mapping of Weak Entity Types


• Weak entities cannot be uniquely identified by their own attributes alone and rely on a related
strong entity.

Unit: 5 – ER Diagrams 18
DCA1208: Database Management System

• A table is created for the weak entity, including its attributes and the primary key of the related
strong entity.
• The primary key of the weak entity table is a combination of its own attributes (if any) and the
primary key of the strong entity.

• Example: If you have a weak entity Dependent related to Employee, you create a Dependent
table with attributes like DependentName and EmployeeID, where EmployeeID comes from the
Employee table. The primary key would be a combination of DependentName and EmployeeID.

Step 3: Mapping of Binary 1:1 Relationship Types


• In a 1:1 relationship, each entity instance is associated with at most one instance of another
entity.
• The relationship can be mapped by adding the primary key of one entity as a foreign key in
the table of the other entity.
• If the relationship is total (i.e., all instances must participate), combining the two entities into a
single table may be better.
• Example: If there is a 1:1 relationship between an Employee and an Office, you could add
OfficeID as a foreign key in the Employee table or vice versa, depending on the scenario.

Step 4: Mapping of Binary 1 Relationship Types


• In a 1 relationship, one entity is associated with multiple instances of another entity.
• The primary key of the "one" side entity is added as a foreign key in the table corresponding to
the "many" side entity.
• Example: If the Department and Employee are in a 1

relationship (a department has many employees), add DeptID as a foreign key in the Employee table.

Step 5: Mapping of Binary M Relationship Types


• For M relationships, where multiple instances of one entity are associated with multiple
instances of another, a new table (called a junction table or associative entity) is created.
• This new table includes the primary keys of the related entities as foreign keys and may also
include other attributes specific to the relationship.
• The combination of these foreign keys often forms the primary key of the junction table.
• Example: To establish a relationship between a Student and a Course, a StudentCourse table is
created with StudentID and CourseID as foreign keys and a primary key made up of both.

Unit: 5 – ER Diagrams 19
DCA1208: Database Management System

Step 6: Mapping of Multivalued Attributes


• Multivalued attributes can have multiple values for a single entity instance.
• A new table is created for the multivalued attribute, including the primary key of the original
entity and the multivalued attribute itself.
• The primary key of this new table is usually a combination of the original entity's primary key
and the multivalued attribute.
• Example: If Customer has a multivalued attribute PhoneNumber, create a CustomerPhone table
with CustomerID and PhoneNumber. The primary key would be the combination of CustomerID
and PhoneNumber.

Step 7: Mapping of N-ary Relationship Types


• For N-ary relationships (relationships involving more than two entities), a new table is created
to represent the relationship.
• This table includes the primary keys of all participating entities as foreign keys.
• The combination of these foreign keys typically forms the table's primary key.
• Example: In a relationship involving Supplier, Part, and Project, a new table (e.g.,
SupplierPartProject) would be created with foreign keys SupplierID, PartID, and ProjectID.
Additional attributes specific to the relationship can also be added to this table.

By following these steps, you can systematically map an ER diagram into a set of relational tables that
accurately represent the structure and constraints of the original model.

Let us understand better with an example:

Consider the ER conceptual schema diagram for the COMPANY database as shown.

Unit: 5 – ER Diagrams 20
DCA1208: Database Management System

Fig 5. ER conceptual schema diagram for the COMPANY database

Step 1: Mapping of Regular Entity Types: we create the relations EMPLOYEE, DEPARTMENT, and
PROJECT in Figure 9.2 to correspond to the regular entity types EMPLOYEE, DEPARTMENT, and
PROJECT in Figure. The foreign key and relationship attributes, if any, are not included yet; they will
be added during subsequent steps. These include the attributes Super_ssn and Dno of EMPLOYEE,
Mgr_ssn and Mgr_start_date of DEPARTMENT, and Dnum of PROJECT. In our example, we choose Ssn,
Dnumber, and

Pnumber as primary keys for the relations EMPLOYEE, DEPARTMENT, and PROJECT, respectively.
The knowledge that the Dname of the DEPARTMENT and the Pname of the PROJECT are secondary
keys are kept for possible use later in the design.

The relations created from mapping entity types are sometimes called entity relations because each
tuple represents an entity instance. The result after this mapping step is shown below.

Unit: 5 – ER Diagrams 21
DCA1208: Database Management System

EMPLOYEE

DEPARTMENT

PROJECT

Fig6. Entity relations after step 1.

Step 2: Mapping of Weak Entity Types:

In this step, we create the relation DEPENDENT to correspond to the weak entity type DEPENDENT.
We include the primary key Ssn of the EMPLOYEE relation—which corresponds to the owner entity
type—as a foreign key attribute of DEPENDENT; we rename it Essn, although this is not necessary.

The primary key of the DEPENDENT relation is the combination {Essn, Dependent_name} because
Dependent_name is the partial key of DEPENDENT.

DEPARTMENT

Fig7. Additional weak entity relation after step 2.

Unit: 5 – ER Diagrams 22
DCA1208: Database Management System

Fig 8: Result of mapping the COMPANY ER schema into a relational database schema.

Step 3: Mapping of Binary 1:1 Relationship Types:


3.1 Foreign key approach: In our example, we map the 1:1 relationship type MANAGES from Figure
9.1 by choosing the participating entity type DEPARTMENT to serve in the role of S because its
participation in the MANAGES relationship type is total (every department has a manager). We
include the primary key of the EMPLOYEE relation as a foreign key in the DEPARTMENT relation and
rename it Mgr_ssn. We also include the simple attribute Start_date of the MANAGES relationship type
in the DEPARTMENT relation and rename it Mgr_start_date, as seen in Figure 8 above.

Step 4: Mapping of Binary 1: N Relationship Types:


Our example maps the 1: N relationship types of WORKS_FOR, CONTROLS, and SUPERVISION. For
WORKS_FOR, we include the primary key Dnumber of the DEPARTMENT relation as a foreign key in
the EMPLOYEE relation and call it Dno. For SUPERVISION, we include the primary key of the
EMPLOYEE relation as a foreign key in the EMPLOYEE relation itself—because the relationship is
recursive— and call it Super_ssn. The CONTROLS relationship is mapped to the foreign key attribute
Dnum of PROJECT, which references the primary key Dnumber of the DEPARTMENT relation as
depicted in Fig 8.

Unit: 5 – ER Diagrams 23
DCA1208: Database Management System

Step 5: Mapping of Binary M: N Relationship Types:


In our example, we map the M: N relationship type WORKS_ON by creating the relation WORKS_ON
in Figure 8. We include the primary keys of the PROJECT and EMPLOYEE relations as foreign keys in
WORKS_ON and rename them Pno and Essn, respectively. We also include an attribute Hours in
WORKS_ON to represent the Hours attribute of the relationship type. The primary key of the
WORKS_ON relation is the combination of the foreign key attributes {Essn, Pno}.

This relationship relation is shown below:

Fig 9: Relationship relation after step 5.

Step 6: Mapping of Multivalued Attributes:

In our example, we create a relation DEPT_LOCATIONS. The attribute Dlocation represents the
multivalued attribute LOCATIONS of DEPARTMENT, while Dnumber—as foreign key—represents
the primary key of the DEPARTMENT relation. The primary key of DEPT_LOCATIONS is the
combination of {Dnumber, Dlocation}. A separate tuple will exist in DEPT_LOCATIONS for each
location that a department has.

Fig 10: Relation representing multivalued attribute after step 6.

Step 7: Mapping of N-ary Relationship Types:


For example, consider the relationship type SUPPLY in Figure 11. This can be mapped to the relation
SUPPLY shown in Figure 12, whose primary key combines the three foreign keys {Sname, Part_no,
Proj_name}.

Unit: 5 – ER Diagrams 24
DCA1208: Database Management System

Fig 11. The SUPPLY relationship.

Fig 12. Mapping the n-ary relationship type SUPPLY

Unit: 5 – ER Diagrams 25
DCA1208: Database Management System

4. SUMMARY

The Entity-Relationship (ER) Model is a fundamental conceptual framework used in database design
to visually represent data and their relationships within a system. At its core, the ER model helps
structure and define entities (real-world objects or concepts) and their relationships.

This model is widely used in the early stages of database design, providing a clear and concise way to
map out the data requirements and relationships in a system before implementation. The ER model
typically includes entities, attributes (characteristics of entities), and relationships, which define how
entities are connected. This model serves as a blueprint for creating a relational database schema.

ER diagrams are visual representations of the ER model with several vital components. The primary
components include entities, depicted as rectangles, and attributes, represented as ovals connected
to their respective entities.

Relationships between entities are shown as diamonds, with lines connecting them to the involved
entities. Relationships can have their own attributes, known as descriptive attributes. They may also
have cardinality constraints, which define the nature of the relationship (e.g., one-to-one, one-to-
many, or many-to-many). Other components include

primary keys, uniquely identifying each entity instance, and foreign keys, which establish connections
between different entities.

The ER model uses a set of standard symbols to represent various components and their interactions.
Rectangles usually symbolise entities, while attributes are shown as ovals attached to their respective
entities.

Diamonds represent relationships between entities, with lines connecting them to those involved.
Crow’s foot notation is often used to denote cardinality constraints, where a single line represents a
one-to-one relationship, a line with a crow’s foot denotes a one-to-many relationship, and a crow’s
foot at both ends indicates a many-to-many relationship. Double rectangles and double ovals
represent weak entities and multivalued attributes, respectively.

Once an ER diagram is completed, the next step in database design is converting the ER model into a
relational schema, where entities, attributes, and relationships are transformed into tables, columns,
and foreign keys. Entities in the ER diagram typically become tables, with their attributes as columns.

Unit: 5 – ER Diagrams 26
DCA1208: Database Management System

Relationships are converted into foreign keys, linking related tables or, in some cases, into separate
tables to handle many-to-many relationships. This process involves ensuring that all entities and
relationships are appropriately normalised to avoid redundancy and ensure data integrity.
Converting ER diagrams into tables is crucial in transitioning from conceptual design to actual
database implementation.

Unit: 5 – ER Diagrams 27
DCA1208: Database Management System

5. GLOSSARY

The Entity-Relationship (ER) Model is a conceptual framework used in


database design to represent data objects (entities), attributes, and
ER Model - relationships. It provides a visual diagram (ER diagram) to map out the

data structure before implementing a relational database.

Is-a hierarchies represent inheritance relationships in database


modelling, where a subtype entity inherits attributes and relationships
Is-a Hierarchies - from a supertype entity. This is commonly used when entities share

common characteristics and distinct attributes.

Constraints in databases are rules that enforce data integrity and


consistency. They define limitations on the data types, relationships, and

Constraints - operations allowed on the data, such as primary key constraints ensuring
unique identification or foreign key constraints maintaining referential
integrity.

Relationship types define the associations between entities in an ER

Relationship model, characterised by cardinality (e.g., one-to-one, one-to-many, many-


- to-many) and the nature of the interaction. These relationships are
Types
represented by diamonds in ER diagrams, linking related entities.

Multivalued attributes are attributes in an ER model that can have

Multivalued multiple values for a single entity instance. For example, a Person entity
- might have a multivalued attribute like Phone Numbers, where one
Attributes
person can have several phone numbers.

Regular entity types, also known as strong entities, can exist

Regular Entity independently of other entities in a database. They have a primary key to
- identify each instance uniquely and are typically represented by a single
Types
rectangle in ER diagrams.

Unit: 5 – ER Diagrams 28
DCA1208: Database Management System

A primary key is a unique identifier for each record in a database table,

Primary Key - ensuring that no two records have the same key value. It is essential for
establishing relationships between tables and enforcing entity integrity.

A foreign key is an attribute in one table that links to the primary key of
another table, creating a relationship between the two tables. It ensures
Foreign Key - referential integrity by ensuring that the value in the foreign key column

exists in the referenced table.

Unit: 5 – ER Diagrams 29
DCA1208: Database Management System

6. SELF-ASSESSMENT QUESTIONS

SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 What is the primary function of an Entity-Relationship (ER) model in database design?
A) To define the physical storage structure of the database
B) To analyse an application's semantic features independently of events
C) To optimise query performance
D) To manage user access control
2 In an ER diagram, what symbol represents relationships between entities?
A) Rectangle
B) Oval
C) Diamond
D) Circle
3 Which of the following best describes a composite attribute?
A) An attribute that has multiple values for a single entity instance
B) An attribute that cannot be further subdivided
C) An attribute that is atomic and has no internal structure
D) An attribute that can be subdivided into smaller parts, such as an address
4 Which of the following is true about a primary key in an ER model?
A) It can have duplicate values within the same domain
B) It must always have a value and be unique within its domain
C) It can be composed of multiple values from different entities
D) It is only used to link to other tables in the database
5 What is a "borrowed key" attribute in the context of is-a hierarchies?
A) A key borrowed from another table to create a foreign key relationship
B) A key used to reference external databases
C) A key that is inherited from a more general entity in an is-a relationship
D) A key that is used temporarily during database transactions
6 What does mapping cardinality in an ER model indicate?
A) The types of attributes in an entity

Unit: 5 – ER Diagrams 30
DCA1208: Database Management System

B) The number of instances in one entity associated with instances in another entity
C) The total number of entities in the database
D) The hierarchical relationship between entities
7 Which of the following best describes a one-to-one relationship?
A) One entity instance in E1 is associated with many instances in E2
B) Many instances in E1 are associated with one instance in E2
C) Each instance in one entity is associated with at most one instance in another entity
D) Many instances in E1 are associated with many instances in E2
8 How is a weak entity represented in an ER diagram?
A) By a single rectangle
B) By a double rectangle
C) By a diamond
D) By a double diamond
9 What is the purpose of identifying relationships in an ER model?
A) To indicate a derived attribute
B) To connect multivalued attributes to an entity
C) To link weak entities to their corresponding strong entities
D) To establish a one-to-many relationship between two entities
10 Which symbol in an ER diagram represents a multivalued attribute?
A) Oval
B) Double oval
C) Rectangle
D) Dashed oval
11 What is the first step in converting an ER diagram into a relational database schema?
A) Mapping of Binary 1:1 Relationship Types
B) Mapping of Weak Entity Types
C) Mapping of Regular Entity Types
D) Mapping of Multivalued Attributes
12 How are weak entities typically mapped in a relational database?
A) By creating a table with its own primary key
B) By combining it with a strong entity into a single table
C) By creating a table without any foreign key references

Unit: 5 – ER Diagrams 31
DCA1208: Database Management System

D) By creating a table that includes the primary key of the related strong entity as part of its
primary key
13 What approach maps a binary many-to-many (M) relationship in a relational database?
A) Creating two tables, one for each entity involved in the relationship
B) Adding a foreign key in one of the existing tables
C) Creating a new table that includes the primary keys of the related entities as foreign keys
D) Using a single table with composite primary keys
14 When mapping an ER diagram to a relational schema, which step involves creating a new table
for multivalued attributes?
A) Step 3: Mapping of Binary 1:1 Relationship Types
B) Step 5: Mapping of Binary M Relationship Types
C) Step 6: Mapping of Multivalued Attributes
D) Step 7: Mapping of N-ary Relationship Types
15 When mapping a binary one-to-many (1) relationship, where is the primary key of the "one"
side entity added?
A) In the table corresponding to the "many" sides entity as a foreign key
B) In the table corresponding to the "one" side entity as a foreign key
C) In a separate junction table
D) As a composite key in both related tables

Unit: 5 – ER Diagrams 32
DCA1208: Database Management System

7. TERMINAL QUESTIONS

1. Define an Entity-Relationship Model.


2. List the Components of the ER Diagram.
3. Differentiate between Simple and composite attributes.
4. What are Is-a hierarchies?
5. Define Primary Keys.
6. Explain Borrowed Key Attributes.
7. Differentiate between Candidate Key and Concatenated Key.
8. Describe a few important Symbols Used in the ER Model.
9. Identify the steps in Converting ER Diagrams into Tables.
10. Consider an ER conceptual schema diagram for the COMPANY database convert into tables with
specific steps.

Unit: 5 – ER Diagrams 33
DCA1208: Database Management System

8. ANSWERS

8.1. Self-Assessment Answers:


1. B) To analyse an application's semantic features independently of events
2. C) Diamond
3. D) An attribute that can be subdivided into smaller parts, such as an address
4. B) It must always have a value and be unique within its domain
5. C) A key that is inherited from a more general entity in an is-a relationship
6. B) The number of instances in one entity associated with instances in another entity
7. C) Each instance in one entity is associated with at most one instance in another entity
8. B) By a double rectangle
9. C) To link weak entities to their corresponding strong entities
10. B) Double oval
11. C) Mapping of Regular Entity Types
12. D) By creating a table that includes the primary key of the related strong entity as part of its
primary key
13. C) Creating a new table that includes the primary keys of the related entities as foreign keys
14. C) Step 6: Mapping of Multivalued Attributes
15. A) In the table corresponding to the "many" side entity as a foreign key

8.2. Terminal Answers:


Answer 1: An Entity-Relationship (ER) Model is a conceptual framework used in database design to
represent a database's structure visually. It shows entities, their attributes, and their relationships,
helping map out data requirements before implementing the database. Refer to section 2 for more
details.

Answer 2: The key components of an ER diagram include entities (represented by rectangles),


attributes (ovals), relationships (diamonds), and various constraints such as primary keys and
foreign keys, as well as cardinality and participation constraints. Refer to section 2.1 for more
details.

Unit: 5 – ER Diagrams 34
DCA1208: Database Management System

Answer 3: Simple attributes are indivisible, atomic values like a person's first name, while composite
attributes can be broken down into smaller components, such as an address, subdivided into street,
city, and zip code. Refer to section 2.1 for more details.

Answer 4: Is-a hierarchies represent inheritance relationships in databases, where a subtype entity
inherits attributes and relationships from a supertype entity. This hierarchy allows for shared
attributes among related entities, modelling a generalised-to-specific relationship. Refer to section
2.1 for more details.

Answer 5: Primary keys are unique identifiers for each record in a database table. They ensure that
each record can be uniquely distinguished from all others, are mandatory, and should never change
over time. Refer to section 2.1 for more details.

Answer 6: Borrowed key attributes refer to primary keys from a strong entity inherited by a weak
entity in an "is-a" relationship. These borrowed keys help uniquely identify instances of the weak
entity by combining them with its own attributes. Refer to section 2.1 for more details.

Answer 7: A candidate key is any attribute or combination of attributes that can uniquely identify a
record in a table, while a concatenated key specifically refers to a key formed by combining multiple
attributes to create a unique identifier. Refer to section 2.1 for more details.

Answer 8: Key symbols in an ER model include rectangles for entities, ovals for attributes, diamonds
for relationships, double rectangles for weak entities, double ovals for multivalued attributes, and
underlined ovals for key attributes. Refer to section 2.2 for more details.

Answer 9: Steps include mapping regular entities to tables, mapping weak entities by including
foreign keys from strong entities, mapping binary 1:1 and 1 relationships by adding foreign keys,
mapping M relationships using junction tables, and handling multivalued and N-ary relationships
with separate tables. Refer to section 3 for more details.

Answer 10: To convert an ER diagram for the COMPANY database into tables, map each entity to a
table with attributes as columns. Include primary keys, map weak entities by adding foreign keys,
handle 1:1 and 1 relationships by adding foreign keys, create junction tables for M relationships, and
map multivalued attributes into separate tables. Refer to section 3 for more details.

Unit: 5 – ER Diagrams 35
DCA1208: Database Management System

9. REFERENCES

1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
2. Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,
McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631
3. Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,
McGraw-Hill Education, 2019, ISBN-10: 0073523321

Unit: 5 – ER Diagrams 36
BACHELOR OF COMPUTER APPLICATION
DCA1208: Database Management System

SEMESTER 2

DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 6 – Introduction to the Relational Model 1
DCA1208: Database Management System

Unit - 6
Introduction to the Relational Model

DCA324
KNOWLEDGE MANAGEMENT
Unit: 6 – Introduction to the Relational Model 2
DCA1208: Database Management System

TABLE OF CONTENTS

SL Fig No / Table SAQ /


Topic Page No
No / Graph Activity

1 Introduction - -
4-5
1.1 Objectives - -

2 Concepts of Relational Model - -


2.1 Components of a Relation - -
2.2 Example of a Relation - -
2.3 Representation of a Relation - -
6-15
2.4 Properties of a Relation - -
2.5 Schema and Instance - -
2.6 Integrity Constraints - -
2.7 Advantages of Relational Model - -

3 Terminologies - - 16

4 Codd's Rules - - 17-20

5 Keys in the Relational Model - - 21-24

6 Summary - - 25

7 Glossary - - 26

8 Terminal Questions - - 27-29

9 Self-Assessment Questions - 1 30

10 Answers - -
10.1 Self-Assessment Questions - - 31-33
10.2 Terminal Questions - -

11 References - - 34

Unit: 6 – Introduction to the Relational Model 3


DCA1208: Database Management System

1. INTRODUCTION
In the previous unit, we explored the Entity-Relationship (ER) Model, a conceptual framework used
for database design to represent data and its relationships. We studied the key components of ER
diagrams, including entities (objects), attributes (properties of entities), and relationships
(associations between entities). The symbols used in the ER model were discussed, such as
rectangles for entities, diamonds for relationships, ovals for attributes, and lines to connect them.
Practical examples of ER diagrams demonstrated their application in real-world scenarios, like
designing a student database or inventory system. Finally, we learned how to convert ER diagrams
into relational tables, a critical step for implementing a database by mapping entities to tables,
attributes to columns, and relationships to foreign keys, ensuring the logical structure aligns with the
relational model. This unit provides a foundation for effective database design and implementation.

In the current unit, we see the relational model concepts foundational to database management
systems. We begin by exploring the components of a relation, including attributes, tuples, domains,
and their roles in structuring data in a table format. Using examples, we illustrate how relations
represent real-world entities and examine a relation's mathematical and logical representation. Key
properties of a relation, such as uniqueness, atomicity, and order irrelevance, are discussed to
understand their role in maintaining data consistency. Additionally, the unit covers schema and
instance, emphasising the distinction between the structural definition of a relation and its dynamic
content. A detailed exploration of integrity constraints, including entity and referential integrity,
highlights how databases enforce data accuracy. Finally, the advantages of the relational model, such
as simplicity, data independence, and scalability, are outlined to emphasise its effectiveness in
modern database systems.

We also study essential terminologies like attributes, cardinality, and degree, which are critical for
understanding relational databases. The unit introduces Codd's 12 Rules, a set of guidelines for
evaluating relational database systems, focusing on data accessibility, integrity, and independence.
The discussion on keys in the relational model highlights their importance in uniquely identifying
data and maintaining relationships between tables. Types of keys, including a primary, foreign,
candidate, and composite keys, are explained with examples to show their role in ensuring data

Unit: 6 – Introduction to the Relational Model 4


DCA1208: Database Management System

integrity. Through these topics, the unit provides a comprehensive understanding of the relational
model's structure, rules, and benefits, forming the basis for effectively managing data in a database.

1.1. Objectives:
• Discuss the Concepts of a relational model
• List the advantages of a relational model
• Explain the Codd's Rules
• Classify the different keys in a relational model

Unit: 6 – Introduction to the Relational Model 5


DCA1208: Database Management System

2. CONCEPTS OF RELATIONAL MODEL


In the relational model, a relation is a foundational concept that organises data into a tabular structure,
making it intuitive and easy to manipulate. It is a systematic way of storing and representing data,
where rows and columns correspond to the data's entities and their attributes, respectively.

A relation is a table-like structure that organises data into rows and columns:
1. Rows (Tuples): Represent individual records or instances of data.
2. Columns (Attributes): Represent the data's properties, features, or characteristics.
3. Domain: Specifies the permissible set of values for each attribute in the table.

2.1. Components of a Relation


• Attributes (Columns): Each column in a table is called an attribute. Attributes define the table's
structure and specify the type of data stored.

o Example:
▪ In the table Student, the attributes are StudentID, Name, Age, and Major.
▪ These represent the properties of the student entity.
• Tuples (Rows): Each row in a table is called a tuple. Tuples represent individual data entries or
instances.

o Example:
▪ In the Student table, the first tuple represents one student's record (101, Alice, 20,
Computer Sci).
• Domain: The domain of an attribute specifies the range of permissible values for that attribute.

o Example:
▪ StudentID: Must be an integer.
▪ Name: A string with a maximum of 50 characters.
▪ Age: An integer between 18 and 25.

Unit: 6 – Introduction to the Relational Model 6


DCA1208: Database Management System

2.2. Example of a Relation


Consider the Student relation:

1. Attributes:
o The table contains four attributes:
▪ StudentID: A unique identifier for each student.
▪ Name: The student's name.
▪ Age: The student's age.
▪ Major: The student's area of study.

2. Tuples:
o Each row represents a tuple:
▪ (101, Alice, 20, Computer Sci)
▪ (102, Bob, 21, Mathematics)
▪ (103, Carol, 22, Physics)

3. Domain:
o StudentID: Integer values.
o Name: Strings of up to 50 characters.
o Age: Integers in the range of 18 to 25.
o Major: Strings from a predefined set of values, e.g., {"Computer Sci", "Mathematics",
"Physics"}.

2.3. Representation of a Relation:


A relation can also be expressed mathematically as a set of tuples.

Using the Student relation:


Relation Schema: R(StudentID,Name,Age,Major)

Unit: 6 – Introduction to the Relational Model 7


DCA1208: Database Management System

Instance (set of tuples at a particular time):


R= {(101,Alice,20,Computer Sci),(102,Bob,21,Mathematics),(103,Carol,22,Physics)}

2.4. Properties of a Relation


In the relational model, relations must adhere to specific properties to ensure data integrity,
consistency, and effective organisation. These properties make relations robust and reliable for
managing structured data in databases.

The key properties include Uniqueness, Atomicity, and Order Irrelevance, explained as follows:

Properties of a
Relation

Uniqueness Atomicity Order Irrelevance

Fig1. Properties of a Relation

1. Uniqueness: This property ensures that no two tuples (rows) in a relation can have identical
values across all attributes. Each tuple must be distinguishable from others, typically enforced
through a primary key or unique constraints.
o A primary key uniquely identifies each tuple in a relation.
o Duplicate rows in a table violate the relational model's rules, leading to redundancy and
inconsistency.

Example: Consider the following Student relation:

Unit: 6 – Introduction to the Relational Model 8


DCA1208: Database Management System

This example repeats the tuple (101, Alice, 20, Computer Science), which violates the uniqueness
property because StudentID, the primary key, must uniquely identify each student. To resolve this,
duplicate rows must be removed, or the data must be corrected.

2. Atomicity: The atomicity property dictates that each relationship value must be indivisible. In
other words, each attribute should hold a single atomic value rather than a composite or nested
set of values. This ensures data consistency and simplifies query processing.
o Atomic values mean data cannot be broken down further for meaningful interpretation.
o The use of non-atomic values (e.g., storing multiple names in a single cell) violates this
property and complicates database operations like searching and filtering.

Example: Consider the following Student relation:

In this case, the second tuple violates the atomicity property because the Name attribute contains a
composite value (Bob and Carol). To adhere to atomicity, the data should be split into separate rows
or managed using relationships across multiple tables:

Correct Representation:

3. Order Irrelevance: The relational model ignores the order of tuples (rows) and attributes
(columns) in a relation. The data's meaning remains unchanged regardless of how rows or
columns are arranged.
o Tuples in a relation represent unordered sets of data.
o Columns (attributes) are logically defined, and their order in the table does not affect the
relation's semantics.

Unit: 6 – Introduction to the Relational Model 9


DCA1208: Database Management System

o This property aligns with the relational model's mathematical foundation, where sets are
inherently unordered.

Example: Consider the following two Student tables:

Although the order of rows and columns differs between Table A and Table B, the relations are
equivalent. Query results and the underlying data remain unchanged.

2.5. Schema and Instance


In the relational model, the concepts of schema and instance are crucial for understanding how data
is structured and managed over time. These two concepts define the framework and the actual data
in a database.

1. Schema: The schema is the blueprint or structure of a relation.


It defines:
1. The name of the relation.
2. The attributes (columns) within the relation.
3. The data types for each attribute.

A schema is fixed for a database unless explicitly modified. It provides the metadata describing the
table's format and constraints, serving as a guide for storing and managing data.

Key Features of a Schema:


o Static Nature: The schema is relatively stable and changes only when the structure of the
relation needs to be altered (e.g., adding or removing columns).
o Defines Constraints: Includes details such as whether an attribute can be null, uniqueness
constraints, and primary/foreign key specifications.

Unit: 6 – Introduction to the Relational Model 10


DCA1208: Database Management System

o Basis for Operations: Queries and transactions rely on the schema to understand the data
structure.

Example of a Schema: For the Student relation:

Relation Name: Student

Attributes and Data Types:


• StudentID: INTEGER (numeric values only, no decimals).
• Name: STRING (text up to a predefined limit, such as 50 characters).
• Age: INTEGER (whole numbers only, within a valid range of 18–25).
• Major: STRING (specific predefined options such as "Mathematics" or "Computer Science").
2. Instance: An instance is the actual content of a relation at a specific moment in time. The real,
dynamic data stored in the table can change as rows are inserted, updated, or deleted. The
instance is a snapshot of the data in the relation and is directly affected by user and system
operations.
Key Features of an Instance include:
o Dynamic Nature: The instance changes frequently as data is modified.
o Represents Current State: Reflects the data in the database at a given time.
o Adheres to Schema: All data in the instance must comply with the schema's structure and
constraints.

Let us see an Example of an Instance:


For the Student relation, the instance at a particular moment might look like this:

In this instance, StudentID values are unique (per the primary key constraint). Age values are valid
integers within the specified range. Major values are strings chosen from the predefined domain.

Unit: 6 – Introduction to the Relational Model 11


DCA1208: Database Management System

Schema and Instance in Querying:


When writing SQL queries, the schema determines which attributes are available and their data types,
while the instance determines the actual output of the query.

Example:

• Schema ensures that Name, Major, and Age exist as attributes.


• Instance provides the specific rows (e.g., Bob and Carol) that satisfy the condition.

Let us see a Comparison between a Schema and an Instance:

Aspect Schema Instance

Definition Structure of the relation. Actual data in the relation.

Nature Static (rarely changes). Dynamic (changes with data


operations).

Example Student (StudentID: The rows of data are currently in


INTEGER, Name: STRING) the student table.

Purpose Defines how data is Represents the current content of


organised and stored. the data.

2.6. Integrity Constraints


Integrity constraints ensure data accuracy, consistency, and reliability in a relational database. They
are crucial in maintaining data integrity by defining relationships and dependencies between
database values.

The two most fundamental types of integrity constraints are Entity Integrity and Referential
Integrity.

Unit: 6 – Introduction to the Relational Model 12


DCA1208: Database Management System

Types of integrity
constraints

Referential
Entity Integrity
Integrity

Fig.2 Types of Integrity Constraints

1. Entity Integrity: The Entity Integrity Constraint ensures that every table (relation) in a
database has a primary key, and the values of this primary key must be:
1. Unique: No two rows can have the same value for the primary key.
2. Not NULL: Every row must have a value for the primary key.

This constraint guarantees that each row is uniquely identifiable, ensuring the uniqueness of records
in the table.

Let us see an Example:

Consider the Student table:

Primary Key: The StudentID column is defined as the primary key.

o Unique: Each StudentID (101, 102, 103) is distinct.


o Not NULL: No row has a missing value for StudentID.

Violating Entity Integrity:


If we attempt to insert a row like (NULL, David, 23, Biology), it will violate the Not NULL rule, as the
primary key must not be null.

Unit: 6 – Introduction to the Relational Model 13


DCA1208: Database Management System

Another invalid row would be (101, Eve, 24, Chemistry) because the primary key value 101 already
exists in the table, violating the Unique rule.

2. Referential Integrity: The Referential Integrity Constraint ensures that a foreign key in one
table corresponds to a primary key in another table, thereby enforcing the validity of
relationships between them.

Let us see the Features:


• A foreign key in a child table must match a primary key value in the parent table.
• If a foreign key is null, the relationship does not apply to that row.

An Example:

Consider two tables: Student and Enrollment.

• Student Table:

StudentID is the primary key.

• Enrollment Table:

StudentID in the Enrollment table is the foreign key referencing the primary key in the
Student table.

Unit: 6 – Introduction to the Relational Model 14


DCA1208: Database Management System

Let us see how Referential Integrity Works:

1. Every StudentID in the Enrollment table must exist in the Student table.
o For example:
▪ StudentID = 101 in the Enrollment table exists in the Student table, so the relationship
is valid.
2. If an invalid StudentID (e.g., 104) is inserted into the Enrollment table, it would violate the
referential integrity constraint because StudentID = 104 does not exist in the Student table.

Violating Referential Integrity:


• Insert Violation:
o Attempting to insert a row like (4, 105, CS102) into the Enrollment table will fail because
StudentID = 105 does not exist in the Student table.
• Delete Violation:
o If we try to delete StudentID = 101 from the Student table, the row in the Enrollment table
referencing 101 will be left dangling. This violates referential integrity unless cascading
delete rules are applied.

2.7. Advantages of the Relational Model:


1. Simplicity and Logical Structure: The relational model organises data in a tabular format,
making it easy to understand and work with. Users can intuitively grasp the relationships
between data entities.
2. Data Integrity: Integrity constraints like primary keys (entity integrity) and foreign keys
(referential integrity) ensure data consistency and accuracy, preventing invalid or duplicate
records.
3. Ease of Data Manipulation: SQL provides a powerful and flexible way to query and manipulate
data. Complex operations like filtering, joining, and aggregating data can be performed
effortlessly.
4. Data Independence Changes in the physical storage or schema do not disrupt existing queries
or applications, ensuring seamless database evolution and maintenance.
5. Scalability and Flexibility: The relational model easily adapts to changes in data requirements
by allowing new tables and attributes to be added. It can efficiently handle growing datasets and
user demands.

Unit: 6 – Introduction to the Relational Model 15


DCA1208: Database Management System

3. TERMINOLOGIES
The relational model uses precise terminologies to describe its structure and components.
Understanding these terms is essential for effectively designing, managing, and querying relational
databases. This section expands on basic and advanced terms with examples and explanations.
• Attribute: A property or characteristic of an entity represented as a column in a table. Each
attribute describes specific data for the entity.
• Domain: The set of valid values that an attribute can hold. For example, the domain of a Gender
attribute could be {Male, Female, Other}.
• Cardinality: A table's total number of rows (tuples). It represents the count of data records in a
relation.
• Degree: A table's total number of columns (attributes). It indicates the structure or schema of
the relation.
• NULL Value: A placeholder indicating missing, unknown, or inapplicable data in a table. It is not
equivalent to zero or an empty string.

Unit: 6 – Introduction to the Relational Model 16


DCA1208: Database Management System

4. CODD'S RULES
Edgar F. Codd, the founder of the relational database model, proposed 12 rules in 1985 to evaluate a
Database Management System's (DBMS) adherence to the principles of the relational model. These
rules serve as benchmarks to assess whether a DBMS can be considered "relational" and provide a
framework for maintaining data consistency, integrity, and usability.

High-Level
View Updating
Insert, Update,
Comprehensive Rule
and Delete
Data Physical Data
Sublanguage Independence
Rule

Dynamic Logical Data


Online Catalog Independence

Systematic
Integrity
Treatment of
Independence
NULL Values

Guaranteed Distribution
Access Rule Independence

Information Non Subversion


Rule ruke

Fig.2 Codd's Rules

1. Information Rule: All data in a database must be stored in tables (relations), with rows
representing tuples and columns representing attributes. This ensures a clear and consistent
structure for data representation.

An Example: A student table with rows as student records and columns such as StudentID, Name,
Age, and Major. Non-tabular data like XML or JSON would violate this rule in its raw form.

2. Guaranteed Access Rule: All data in a relational database must be accessible using a table name,
primary key, and attribute name. There should be no ambiguous or "hidden" data.

An Example: To retrieve the student's name with StudentID = 101, use a query like.

Unit: 6 – Introduction to the Relational Model 17


DCA1208: Database Management System

3. Systematic Treatment of NULL Values: The relational model must support NULL values
representing missing or unknown information and treat them consistently. NULL values must
not affect data integrity or produce ambiguities during operations.

An Example: A NULL in the Phone column of a Contact table indicates the phone number is
unavailable or unknown.

4. Dynamic Online Catalog: Metadata (data about the database structure) must be stored in
tables, allowing it to be queried and manipulated using SQL.

An Example: Querying metadata:

This retrieves the names of all tables in the database.

5. Comprehensive Data Sublanguage Rule: The DBMS must support a comprehensive and
universal language for data operations, including data definition (DDL), manipulation (DML),
and control (DCL). SQL is the most widely used sublanguage.

An Example: A single SQL language allows:

▪ DDL: CREATE TABLE, ALTER TABLE.


▪ DML: INSERT, SELECT, UPDATE, DELETE.
▪ DCL: GRANT, REVOKE.
6. View Updating Rule: Views (virtual tables created from one or more base tables) must be
updatable if they are derived from updatable tables. This allows users to interact with complex
data structures through more straightforward representations.

An Example: If a view is created as:

Unit: 6 – Introduction to the Relational Model 18


DCA1208: Database Management System

Updating the view:

7. High-Level Insert, Update, and Delete: The DBMS must support high-level, set-based
operations for inserting, updating, and deleting data rather than operating on individual records.

An Example: Updating multiple rows in a single command:

8. Physical Data Independence: Changes to the physical storage structure or mechanisms (e.g.,
moving data files) must not affect the logical schema or applications accessing the database.

For example, moving data files from one storage device to another should not impact queries or
schema definitions.

9. Logical Data Independence: Changes to the logical schema (e.g., adding new columns) must
not disrupt existing applications or queries.

For example, Adding a GPA column to the Student table should not affect queries retrieving
StudentID or Name.

10. Integrity Independence: The DBMS must define and enforce all integrity constraints, such as
primary key, foreign key, and domain constraints, not at the application level.

An Example: Defining a primary key at the table level

11. Distribution Independence: A DBMS must provide a uniform interface for accessing data,
regardless of whether the data is stored locally or distributed across multiple locations.

An example is a query like

Unit: 6 – Introduction to the Relational Model 19


DCA1208: Database Management System

It should return consistent results whether the data is stored on a single server or distributed across
multiple nodes.

12. Non-Subversion Rule: There must be no way to bypass the relational integrity constraints
defined within the DBMS, even for low-level operations.

For example, Even administrators should be unable to insert duplicate primary key values using
backend tools or system commands.

Unit: 6 – Introduction to the Relational Model 20


DCA1208: Database Management System

5. KEYS IN THE RELATIONAL MODEL


Keys are fundamental components of the relational model, ensuring data integrity and enabling
effective data retrieval. They uniquely identify tuples (rows) in a relation (table) and help establish
relationships between different tables.

A key is an attribute or a set of attributes uniquely identifying a tuple within a relation.

5.1. Types of Keys:

Types of Keys

Primary Key Candidate Key Super Key Foreign Key Composite Key

Fig.1 Types of Keys

1. Primary Key: A Primary Key is a unique, non-null attribute (or set of attributes) uniquely
identifying each tuple in a relation. Every table must have exactly one primary key.

Rules to be followed:
o Must be unique for all tuples in the table.
o Cannot contain null values.
o There can be only one primary key per table.

Syntax:

Unit: 6 – Introduction to the Relational Model 21


DCA1208: Database Management System

Let us see an example:

EmployeeID is the primary key, ensuring each employee has a unique identifier.

2. Candidate Key: A Candidate Key is an attribute or a set of attributes that can serve as a primary
key. It satisfies the uniqueness and non-null constraints. A table can have multiple candidate
keys, but only one is chosen as the primary key.

Example Scenarios: Consider the Student table:

Candidate Keys: StudentID, Email, Phone.

Any of these attributes can uniquely identify a student, but only one is selected as the primary key.

3. Super Key: A Super Key is any attribute that uniquely identifies a tuple. It may include
additional attributes beyond what is necessary.

All candidate keys and the primary key are subsets of the super key.

Example: Consider the Employee table:

EmployeeID, {EmployeeID, Name}, and {EmployeeID, DepartmentID} are all super keys, as they
uniquely identify tuples. However, only EmployeeID is minimal and qualifies as a candidate key.

Unit: 6 – Introduction to the Relational Model 22


DCA1208: Database Management System

4. Foreign Key: A Foreign Key is an attribute in one table that refers to the primary key in another
table, establishing a relationship between the two. It enforces referential integrity, ensuring the
foreign key value exists in the parent table.

Syntax:

An Example: | Department Table |

DepartmentID in the Employee table is a foreign key referencing the DepartmentID primary key in
the Department table.

5. Composite Key: A Composite Key is a primary key of two or more attributes uniquely
identifying a tuple. It is used when a single attribute is insufficient to ensure uniqueness.

Syntax:

Unit: 6 – Introduction to the Relational Model 23


DCA1208: Database Management System

An Example:

The combination of StudentID and CourseID uniquely identifies each enrollment record.

Unit: 6 – Introduction to the Relational Model 24


DCA1208: Database Management System

6. SUMMARY
The Relational Model is a cornerstone of modern database systems, organising data into structured
relations (tables) with rows and columns. The components of a relation include attributes
(columns), tuples (rows), and domains (permissible values for attributes), which form the building
blocks of a relation. Using a practical example of a relation, we examined how entities like students
or employees can be represented in tables, while the representation of a relation involves its
mathematical notation, defining its structure and content. We also discussed the properties of a
relation, such as uniqueness, atomicity, and order irrelevance, which ensure consistent and logical
data representation. Lastly, the distinction between schema (the blueprint of a table) and instance
(the data at a specific point) was introduced to clarify relational databases' dynamic and static aspects.

Integrity constraints play a vital role in maintaining data consistency and correctness. These include
entity integrity, ensuring primary key uniqueness and non-null values, referential integrity, and
enforcing valid relationships between tables via foreign keys. The advantages of the relational
model were highlighted, such as its simplicity, scalability, data independence, and support for
powerful query languages like SQL. These features collectively make the relational model a widely
adopted standard for designing, implementing, and managing databases.

Critical terminologies in the relational model, such as attributes, tuples, domains, cardinality, and
degree, were explained, along with advanced concepts like primary, candidate, composite, and foreign
keys. These terminologies provide a foundation for understanding how relations are structured and
how relationships between tables are established. NULL values were also discussed to handle
missing or unknown information in a database.

Finally, Codd's 12 rules were reviewed as benchmarks for evaluating relational database
management systems, emphasising data independence, integrity enforcement, and comprehensive
support for SQL operations. Keys in the relational model, including primary, candidate, super,
composite, and foreign keys, were detailed to explain how they uniquely identify tuples and establish
inter-table relationships. These topics provide a comprehensive understanding of the relational
model, its structure, advantages, and principles, forming the foundation for effective database design
and management.

Unit: 6 – Introduction to the Relational Model 25


DCA1208: Database Management System

7. GLOSSARY

A table in a relational database consisting of rows (tuples) and columns

Relation - (attributes), where each row represents a record, and each column
represents a property.

An object or concept in a database with attributes to describe it, such as a


Entity - "Student" or "Employee."

A database model that organises data into tables (relations) with

Relational Model - predefined schemas, supporting operations using relational algebra and
ensuring consistency.

The smallest indivisible data unit in a database, such as a single number,


Atomic Value - string, or date, ensuring each field contains only one value.

Definitions specifying the kind of data an attribute can hold, such as


Data Types - integers, strings, dates, or floating-point numbers.

Integrity Rules that maintain the accuracy and consistency of data, such as primary
- keys, foreign keys, and unique or null constraints.
Constraint

The summarising or combining of data values, often using functions like


Aggregating Data - SUM, AVG, COUNT, MIN, or MAX.

A standard or point of reference against which a database or system's


Benchmark - performance, reliability, or capability is measured.

A subset of a programming language, like SQL, explicitly designed for


Data Sublanguage defining, querying, and manipulating data in a database.

The structural blueprint of a database, defining its tables, attributes, data


Schema types, and relationships between tables.

Unit: 6 – Introduction to the Relational Model 26


DCA1208: Database Management System

8. SELF-ASSESSMENT QUESTIONS

SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 Which of the following is NOT a component of a relation?
a) Attribute
b) Tuple
c) Domain
d) File
2 What does a tuple in a relation represent?
a) A column in the table
b) A property of the entity
c) A single row or record in the table
d) A set of permissible values for an attribute
3 Which of the following best defines a domain in a relational model?
a) The number of rows in a table
b) The number of attributes in a table
c) The set of permissible values for an attribute
d) The primary key of the table
4 What is the primary difference between schema and instance in a relational model?
a) Schema refers to the structure, and instance refers to the actual data.
b) Schema changes frequently, and instance is fixed.
c) Schema refers to rows, and instance refers to columns.
d) Schema is dynamic, and instance is static.
5 Which property of a relation ensures that all attribute values are indivisible?
a) Uniqueness
b) Cardinality
c) Atomicity
d) Integrity
6 What is the purpose of a primary key in a relation?

Unit: 6 – Introduction to the Relational Model 27


DCA1208: Database Management System

a) To allow NULL values in a table


b) To uniquely identify each tuple in a relation
c) To define the data type of attributes
d) To store multiple values in a column
7 Which of the following is an example of an integrity constraint?
a) Selecting rows based on a condition
b) Ensuring that foreign keys reference primary keys in another table
c) Sorting rows in a relation
d) Representing data in tabular format
8 Which of the following is a benefit of the relational model?
a) Complex data manipulation
b) Redundant data storage
c) Logical data independence
d) Fixed table structures
9 How is data typically represented in the relational model?
a) As a set of graphs
b) As a key-value pair
c) As a hierarchical tree structure
d) As a set of tables with rows and columns
10 Which of the following is TRUE about the relational model?
a) Tuples in a relation are always ordered.
b) A schema defines the actual data stored in the table.
c) Every table must have a primary key to ensure uniqueness.
d) Attributes in a relation can have multiple values.
11 What does the following query do?

a) Retrieves all students studying mathematics


b) Updates the major of students to mathematics
c) Deletes all students except those studying mathematics
d) Displays only the Major column of students
12 Which SQL query finds the number of rows in a relation?

Unit: 6 – Introduction to the Relational Model 28


DCA1208: Database Management System

a) SELECT * FROM relation;


b) COUNT(*)
c) SELECT COUNT(*) FROM relation;
d) SELECT SUM(*) FROM relation;
13 If the following schema is defined, what is the data type of StudentID?

a) Integer
b) String
c) Float
d) Date
14 Which of the following is one of Codd's 12 Rules for a relational database?
a) Data must be stored in a hierarchical format.
b) A foreign key must always reference a composite key.
c) Data must be stored in tables, and metadata should be accessible as tables.
d) Each table must contain at least one NULL value.
15 Which of the following keys uniquely identifies each tuple in a relation?
a) Foreign Key
b) Primary Key
c) Composite Key
d) Candidate Key

Unit: 6 – Introduction to the Relational Model 29


DCA1208: Database Management System

9. TERMINAL QUESTIONS
1. Define a relation
2. List the key components of a relation in the relational model.
3. How is a relation mathematically represented?
4. Outline the properties of a relation.
5. Differentiate between schema and instance in the relational model.
6. Define integrity constraints in the relational model.
7. Identify the advantages of the relational model.
8. Define an attribute in the relational model.
9. Discuss a domain in the relational model.
10. List the Codd’s 12 rules with examples
11. Elaborate on the keys used in a relational model
12. Design a table for an Employee database with attributes and specify the primary key.
13. Using SQL, create a table with a composite key.
14. Write a query to find all students studying 'Mathematics'.
15. Write an SQL query to retrieve all tuples from a table where the Age attribute is greater than 20.

Unit: 6 – Introduction to the Relational Model 30


DCA1208: Database Management System

10. ANSWERS

10.1. Self-Assessment Answers


1. d) File
2. c) A single row or record in the table
3. c) The set of permissible values for an attribute
4. a) Schema refers to the structure, and instance refers to the actual data.
5. c) Atomicity
6. b) To uniquely identify each tuple in a relation
7. b) Ensuring that foreign keys reference primary keys in another table
8. c) Logical data independence
9. d) As a set of tables with rows and columns
10. c) Every table must have a primary key to ensure uniqueness.
11. a) Retrieves all students studying mathematics
12. c) SELECT COUNT(*) FROM relation;
13. a) Integer
14. c) Data must be stored in tables, and metadata should be accessible as tables.
15. b) Primary Key

10.2. Terminal Answers


Answer 1: A relation is a table in the relational model that organises data into rows and columns.
Each row, called a tuple, represents a record, and each column, called an attribute, represents a record
property. A relation is defined by its schema and holds data adhering to that structure. Refer to
section 2 for more details.

Answer 2: The key components are:


• Attributes: Columns that describe the properties of data.
• Tuples: Rows that represent individual records.
• Domains: The set of permissible values for each attribute. Refer to section 2.1 for more details.

Answer 3: A relation R is represented as R (A1, A2,..., An), where A1, A2,..., An are the attributes. The
schema defines the structure, while the instance is the set of tuples representing the data. For example,
Student(StudentID, Name, Age). Refer to section 2.2 for more details.

Unit: 6 – Introduction to the Relational Model 31


DCA1208: Database Management System

Answer 4: Relations have the following properties:


• Uniqueness: No two tuples in a relation can be identical.
• Atomicity: Each attribute contains indivisible values.
• Order Irrelevance: The order of rows and columns does not affect the data. Refer to section
2.4 for more details.

Answer 5: The schema defines a table's structure, including its attributes and data types. The
instance is the actual data stored in the table at a specific moment. While the schema is static, the
instance is dynamic and changes with data operations. Refer to section 2.5 for more details.

Answer 6: Integrity constraints are rules that ensure the accuracy and consistency of data in a
database. Examples include:
• Entity Integrity: Primary keys must be unique and not null.
• Referential Integrity: Foreign keys must match primary keys in related tables. Refer to
section 2.6 for more details.

Answer 7: The relational model offers:


• Simplicity: Tabular structure is easy to understand.
• Data Integrity: Enforces constraints to maintain consistency.
• Data Independence: Logical and physical changes do not affect each other.
• Ease of Manipulation: SQL enables complex operations. Refer to section 2.7 for more details.

Answer 8: An attribute is a column in a table representing a data property. For example, student ID,
Name, and Age are attributes in a Student table. Each attribute has a specific domain defining
permissible values. Refer to section 3 for more details.

Answer 9: A domain is a set of valid values an attribute can hold. For example, the domain of an Age
attribute could be integers between 18 and 25. It ensures data consistency and validity within the
relation. Refer to section 3 for more details.

Answer 10: Information Rule: Data must be stored in tables. An example is a Student table, which
organises data. Guaranteed Access: Access data using the table name and primary key. Systematic
Null Handling: NULL values must be handled consistently. Refer to section 4 for more details.

Answer 11: Primary Key: Uniquely identifies each tuple. Example: StudentID. Foreign Key: Links
two tables. Example: DepartmentID in Employee table. Composite Key: Combines multiple

Unit: 6 – Introduction to the Relational Model 32


DCA1208: Database Management System

attributes for unique identification. Example: {StudentID, CourseID} in Enrollment. Refer to section
5 for more details.

Answer 12:

Answer 13:

Answer 14:

Answer 15:

Unit: 6 – Introduction to the Relational Model 33


DCA1208: Database Management System

11. REFERENCES
1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
2. Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,
McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631
3. Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,
McGraw-Hill Education, 2019, ISBN-10: 0073523321.

Unit: 6 – Introduction to the Relational Model 34


BACHELOR OF COMPUTER APPLICATION
DCA1208: Database Management System

SEMESTER 2

DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 7 – Relational Algebra 1
DCA1208: Database Management System

Unit - 7
Relational Algebra

DCA324
KNOWLEDGE MANAGEMENT
Unit: 7 – Relational Algebra 2
DCA1208: Database Management System
TABLE OF CONTENTS

SL Fig No / Table SAQ /


Topic Page No
No / Graph Activity

1 Introduction - -
4-5
1.1 Objectives - -

2 Overview - - 6
Basic Operations of Relational Algebra 1, 2, 3, 4, 5, 6,
3 -
7, 8
3.1 Select - -
3.2 Project - -
7 – 20
3.3 Union - -
3.4 Set Difference - -
3.5 Cartesian Product - -

4 Derived Operations of Relational Algebra 9, 10, 11, 12 -


4.1 Division - -
21 - 29
4.2 Intersection - -
4.3 Join - -

5 Summary - - 30

6 Glossary - - 31

7 Self-Assessment Questions - 1 32

8 Terminal Questions - - 33 - 34

9 Answers - -
9.1 Self-Assessment Questions - - 35 – 38
9.2 Terminal Questions - -

10 References - - 39

Unit: 7 – Relational Algebra 3


DCA1208: Database Management System

1. INTRODUCTION
In the previous unit, we saw the foundational concepts of the relational model, a framework for
managing and organising data using relations, commonly known as tables. The relational model was
first introduced by Edgar F. Codd in 1970 and is based on the mathematical concept of a relation,
where data is represented in tuples (rows) within a schema (table structure). Each table consists of
columns that define the attributes of the relation, and each tuple within a table represents a unique
record. The simplicity and mathematical rigour of the relational model make it a powerful tool for
database management, ensuring data integrity and reducing redundancy.

We also explored the terminologies and key concepts integral to understanding the relational
model, including Codd's 12 rules defining a relational database. These rules emphasise the
importance of data independence, integrity, and a systematic approach to database management. The
concept of keys in the relational model was also discussed, including primary keys, which uniquely
identify tuples in a table, and foreign keys, which establish relationships between tables. Keys play a
crucial role in maintaining the relational integrity of the database by ensuring that each record can
be accurately retrieved and associated with other relevant data.

In this unit, we will learn the foundational operations of relational algebra, a critical component of
relational database management systems (RDBMS). The basic operations we'll cover include the
Select operation, which filters rows from a relation based on a specified predicate, and the Project
operation, which creates a new relation by selecting certain columns from an existing relation. We
will also explore the Union operation, which combines the results of two relations into a single
relation, and the Set Difference operation, which yields tuples present in one relation but not in
another. Finally, the Cartesian Product will be examined, where every tuple in one relation is paired
with every tuple in another, forming a new relation.

Moving beyond the basic operations, we will study the derived operations of relational algebra, built
upon the foundational concepts. These include the division operation, which is helpful for queries
that involve "all" conditions, and the intersection operation, which returns the common tuples
between two relations. Also, we will cover the Join operation, one of the most powerful and frequently
used operations in relational databases, which combines tuples from two relations based on a related

Unit: 7 – Relational Algebra 4


DCA1208: Database Management System

attribute. Together, these derived operations expand the expressive power of relational algebra,
allowing for more complex and nuanced queries to be formulated and executed within an RDBMS

1.1. Objectives
By the end of this Unit, the learners should be able to
understand:
• Understand Relational algebra and its history
• Discuss the basic operations of relational algebra
• Classify the derived Operations in relational
algebra

Unit: 7 – Relational Algebra 5


DCA1208: Database Management System

2. OVERVIEW
Relational algebra received little attention until the publication of E.F. Codd's relational model of data
in 1970. Codd proposed such an algebra as a basis for database query languages. The first query
language to be basedon Codd's algebra was ISBL, and this pioneering work has been acclaimed by
many authorities as having shown how to make Codd's idea into a functional language. Even the query
language of SQL is loosely based on relational algebra. However, the operands in SQL (tables) are not
exactly relations, and several useful theorems about relational algebra do not hold in the SQL
counterpart (arguably to the detriment of optimisers and/or users).

Relational algebra is a procedural language. It specifies the operations performed on existing


relations to derive result relations. Furthermore, it defines the complete scheme for each of the result
relations. The relational algebraic operations can be divided into basic set-oriented and relational-
oriented operations. The former are traditional set operations, while the latter are those for
performing joins, selection, projection, and division.

Unit: 7 – Relational Algebra 6


DCA1208: Database Management System

3. BASIC OPERATIONS
Relational algebra is a formal language used to query and manipulate the data stored in relational
databases. It provides a set of operations that take one or more relations (tables) as input and produce
a new relation as output. These operations are fundamental to database management and form the
theoretical foundation for most database query languages, including SQL.

The basic operations of relational algebra include:

Basic Operations

CARTESIAN
SELECT PROJECT UNION SET DIFFERENCE
PRODUCT

Fig1. Types of Operations on Relational Algebra

SELECT, PROJECT, UNION, SET DIFFERENCE, and CARTESIAN PRODUCT serve a distinct purpose in
data retrieval and manipulation.

Basic operations are the traditional set operations: union, difference, intersection, and Cartesian
product. Three basic operations- union, intersection, and difference – require that operand relations
be union-compatible. Two relations are union-compatible if they have the same parity and one-to-
one correspondence of the attributes with the corresponding attributes defined over the same
domain. The Cartesian product can be defined on any two relations. P and Q are union compatible if
both P and Q are of the same degree n and the domain of the corresponding n attributes are identical,
i.e., if P= { P1,..,Pn } and Q={ Q1,...Qn } then Dom(Pi) = Dom(Qi) for i = { 1,2...n } where Dom(Pi)
Id Name represents the domain of the attribute Pi.
101 Jones
Example 1
103 Smith
In the following examples, we utilise two relations, P and
104 Lalonde
Q, given in Figure 2. R is a computed result relation. We
107 Evan
assume that the relations P and Q in Figure 2 represent
110 Drew
112 Smith

Unit: 7 – Relational Algebra 7


DCA1208: Database Management System

employees working on developing software application packages J1 (say) and J2 (say), respectively.

Id Name
103 Smith
104 Lalonde
106 Byron
110 Drew

Fig 2: Union-compatible relations

3.1. Select Operation():


The SELECT operation chooses a subset of the tuples from a relation that satisfies a selection
condition. One can consider the SELECT operation to be a filter that keeps only those tuples that
satisfy a qualifying condition. Alternatively, we can consider the SELECT operation to restrict the
tuples in relation to only those tuples that satisfy the condition.

The SELECT operation can also be visualised as a horizontal partition of the relation into two sets of
tuples—those tuples that satisfy the condition and are selected and those tuples that do not satisfy
the condition and are discarded.

In general, the SELECT operation is denoted by:

Where the symbol σ (sigma) is used to denote the SELECT operator and the selection condition is a
Boolean expression (condition) specified on the attributes of relation R. Notice that R is generally a
relational algebra expression whose result is a relation—the simplest expression is just the name of a
database relation. The relation resulting from the SELECT operation has the same attributes as R.

The Boolean expression specified in <selection condition> is made up of a number of clauses of the
form:

<attribute name> <comparison op> <constant value>


or

<attribute name> <comparison op> <attribute name>

Unit: 7 – Relational Algebra 8


DCA1208: Database Management System

where <attribute name> is the name of an attribute of R, <comparison op> is normally one of the
operators {=, <, ≤, >, ≥, ≠}, and <constant value> is a constant value from the attribute domain. Clauses
can be connected by the standard Boolean operators and, or, and not to form a general selection
condition.

For example, to select the tuples for all employees who either work in department 4 and make over
$25,000 per year or work in department 5 and make over $30,000, we can specify the following
SELECT operation:

Notice that all the comparison operators in the set {=, <, ≤, >, ≥, ≠} can apply to attributes whose
domains are ordered values, such as numeric or date domains. Domains of strings of characters are
also considered to be ordered based on the collating sequence of the characters. If the domain of an
attribute is a set of unordered values, then only the comparison operators in the set {=, ≠} can be used.

The Characteristics are as follows:


• Unary Operation: SELECT operates on a single relation.
• Horizontal Subset: It extracts a subset of rows from the relation.
• Condition-Based: The result is determined by a logical expression or condition applied to each
row.
• No Change in Structure: The resulting relation has the same number of attributes as the
original but potentially fewer tuples (rows).

The SELECT operator is unary; it is applied to a single relation. Moreover, the selection operation is
applied to each tuple individually; hence, selection conditions cannot involve more than one tuple. The
degree of the relation resulting from a SELECT operation—its number of attributes—is the same as
the degree of R. The number of tuples in the resulting relation is always less than or equal to the
number of tuples in R. That is, |σc (R)| ≤ |R| for any condition C. The fraction of tuples selected by a
selection condition is referred to as the selectivity of the condition.

Notice that the SELECT operation is commutative; that is,

Unit: 7 – Relational Algebra 9


DCA1208: Database Management System

Hence, a sequence of SELECTs can be applied in any order. In addition, we can always combine a
cascade (or sequence) of SELECT operations into a single SELECT operation with a conjunctive (AND)
condition; that is,

In SQL, the SELECT condition is typically specified in the WHERE clause of a query.

For example, the following operation:

would correspond to the following SQL query:

SELECT *
FROM EMPLOYEE
WHERE Dno=4 AND
Salary>25000

Suppose we want to find those employees in the relation PERSONNEL of Figure 3 with an ID less than
105. This operation selects only some of the tuples in the relation. Such an operation is known asa
selection operation. The projection operation yields a vertical subset of a relation. The action is
defined over a subset of the attribute names but the tuples in the relation overall. The action is defined
over a subset of the attribute names but the tuples in the relation overall.

The selection operation, however, yields a horizontal subset of a given relation, i.e., the action is
defined over a complete set of attribute names. Still, only a subset of the tuples are included in the
result. To have a tuple included in the result relation, it must satisfy the specified selection conditions
or predicates. The selection operation is sometimes known as the restriction operation.

Results of Selection
PERSONNEL

Unit: 7 – Relational Algebra 10


DCA1208: Database Management System

Fig 3: Result of Selection over PERSONNEL for Id< 105

The selection operation may specify any finite number of predicates connected by Boolean operators.
The predicates may define acomparison between two domain-compatible attributes or between an
attribute and a constant value; if the comparison is between attribute A1 andconstant c1, then c1
belongs to Dom (A1).

Given a relation P and a predicate expression B, the selections of those tuples of relation P that satisfy
the predicate B is a relation R written as:

R= B (P)

The above expression could be read as "select those tuples t from P in which the predicate B(t) is true".
The set of tuples with R is, in this case, defined as follows:

R= { t | t to P B (t) }

3.2. Project Operation ():


The PROJECT operation in relational algebra selects specific columns (attributes) from a relation,
effectively creating a new relation that contains only the desired attributes. This operation is essential
for narrowing down the data to focus on aspects such as retrieving only the relevant information from
a table.

Key Characteristics of the PROJECT Operation:


1. Unary Operation: PROJECT operates on a single relation. Unlike operations like UNION or JOIN,
multiple relations are not required for input.
2. Vertical Subset: PROJECT selects specific columns from the relation, resulting in a vertical slice
of the original table. For instance, if a relation Employee has attributes such as EmployeeID,

Unit: 7 – Relational Algebra 11


DCA1208: Database Management System

Name, Department, and Salary, applying PROJECT to select Name and Department will yield
a new relation with just these two columns.
3. Removes Duplicates: PROJECT automatically eliminates duplicate tuples in the resulting
relation. This is because, in relational algebra, the result of any operation is a set, and sets do not
contain duplicate elements.
4. Changes Relation Structure: Depending on the projected number of columns, the resulting
relation may have fewer attributes than the original. However, the number of rows may remain
the same or decrease if duplicates are removed.
5. Attribute Renaming: In some cases, the PROJECT operation can be combined with RENAME to
give new names to the attributes in the resulting relation, improving clarity or avoiding name
conflicts in complex queries.

The general form of the PROJECT operation is:

where π (pi) is the symbol used to represent the PROJECT operation, and <attribute list> is the desired
sublist of attributes from the attributes of relation R. Again, notice that R is, in general, a relational
algebra expression whose result is a relation, which in the simplest case is just the name of a database
relation. The result of the PROJECT operation has only the attributes specified in <attribute list> in
the same order as they appear in the list. Hence, its degree equals the number of attributes in
<attribute list>.

The projection of a relation is defined as a projection of all its tuples over some set of attributes, i.e.,
it yields a vertical subset of the relation. The projection operation is used to either reduce the
number of attributes in the resultant relation or to reorder attributes. The relation's arity (or
degree) is reduced in the first case. The projection operation is shown graphically in Figure 4. The
figure below shows the projection of the relation PERSONNEL to the attribute Name. The cardinality
of the result relation is also reduced due to the deletion of duplicate tuples.

Unit: 7 – Relational Algebra 12


DCA1208: Database Management System

Id Name Name
101 Jones Jones
103 Smith Smith
104 Lalonde Lalonde
106 Byron Byron
107 Evan Evan
110 Drew Drew
112 Smith Smith
Fig 4: Projection of relation PERSONNEL over attribute Name

We defined the projection of a tuple ti over the attribute A, denoted ti[A] or A(ti), as (a), where a
is the value of tuple ti over the attribute A. Similarly, we define the projection of a relation T, denoted
by T[A] or A(ti), on the attribute A. This is defined in terms of the projection for each tuple in ti
belonging to T on the attribute A as:

T[A] = { ai | ti [A] = ai ti  T}

where T[A] is a single attribute relation and | T[A] | £ T. The cardinality T[A] may be less than the
cardinality |T| because of the deletion of any duplicates in the result. A case in point is illustrated in
Figure 4.

Similarly, we can define the projection of a relation on a set of attribute names, X, as a concatenation
of the projections for each attribute A in X for every tuple in the relation.

T[X] = { ti[A] | ti  T }

Here, A belongs to X where ti[A] represents the concatenation of all ti[A] for all A  X (A belongs to X).
Simply stated, the projection of a relation P on the set of attribute names Y belongs to P is the
projection of each tuple of the relation P on the set of attribute names Y.

Note that the projection operation reduces the arity if the number of attributes in X is less than the
arity of the relation. The projection operation may also reduce the cardinality of the result relation
since duplicate tuples are removed. (Note that the projection operation produces a relation as the
result. By definition, a relation cannot have duplicate tuples. In most commercial implementations of
the relational model, however, the duplicates would still be present in the result).

Unit: 7 – Relational Algebra 13


DCA1208: Database Management System

Example:
Imagine a relation called Employees that has the following
attributes:

Suppose you are only interested in the Name and Department of each employee. You would use the
Project operation to select these columns:
Projection Query:

Result:

Here, the Project operation π reduces the original table by selecting only the Name and Department
columns. Note that even though Alice and Dana both work in the IT department, their entries remain
distinct in this projected view since other attributes (like EmployeeID and Salary) that might
differentiate them are not considered in this operation. If you needed a unique set of names and
departments, you could incorporate a distinct operation if the relational algebra system supports it.

Unit: 7 – Relational Algebra 14


DCA1208: Database Management System

3.3. Union
The UNION operation in relational algebra combines the tuples (rows) of two relations into a single
relation. It is a set-based operation that merges two sets of tuples and automatically removes
duplicates, ensuring that the result remains a set of unique tuples.

Key Characteristics of the UNION Operation:


1. Binary Operation: UNION requires two relations as input. These relations must be compatible,
meaning they have the same number of attributes, and the corresponding attributes must have
the same data types.
2. Set-Based: UNION does not allow duplicate tuples in the resulting relation as a set-based
operation. If a tuple appears in both input relations, it will appear only once in the result.
3. Combines Data: The UNION operation merges all unique tuples from both relations. The
resulting relation contains every tuple that appears in the input relations.
4. Commutative: The UNION operation is commutative, meaning that the order of the relations
does not affect the result. For example, A UNION B will produce the same result as B UNION A.
5. Associative: The UNION operation is associative, so the grouping of operations does not affect
the result. For example, (A UNION B) UNION C is equivalent to A UNION (B UNION C).

If we assume that P and Q are two union-compatible relations, then P and Q are the set-theoretic
union of P and Q shown in Fig 5.

P: Q:

Fig 5: Projection of relation P & Q

The resultant relation, R = P U Q, has tuples drawn from P and Q such that

R = { t | P v t  Q }

Unit: 7 – Relational Algebra 15


DCA1208: Database Management System

The result relation R contains tuples in either P or Q or both. The duplicate tuples are eliminated.
Remember that from our definition of union compatibility, the degree of the relations P and R is the
same. The cardinality of the resultant relation depends on the duplication of tuples in P and Q. From
the above expression, we can see that if all the tuples in Q were contained in P, then |R| = |P| and R=
P, while if the tuples in P and Q were disjoint, then |R| = |P|+|Q|.

Example 2
R, the union of P and Q, is given in the figure below. R represents employees working on the packages
J1 or J2, or both. Since a relation does not have duplicate tuples, an employee working on both J1 and
J2 will appear in the relation R only once.

R:

Id Name
101 Jones
103 Smith
104 Lalonde
106 Byron
107 Evan
110 Drew
112 Smith
Fig 6.: P Q

3.4. Set Difference ( - )


The SET DIFFERENCE operation in relational algebra finds the difference between two relations. It
retrieves the tuples present in the first relation but not in the second. This operation is particularly
useful for identifying data unique to one relation compared to another.

Key Characteristics of the SET DIFFERENCE Operation:


1. Binary Operation: SET DIFFERENCE requires two relations as input. These relations must be
compatible, meaning they must have the same number of attributes, and the corresponding
attributes must have the same data types.

Unit: 7 – Relational Algebra 16


DCA1208: Database Management System

2. Set-Based: Like other set operations in relational algebra, SET DIFFERENCE operates on sets of
tuples. The result is a set of tuples in the first relation but not the second.
3. Asymmetric: The SET DIFFERENCE operation is not commutative, meaning that the order of
the relations matters.
For example, A − B will not yield the same result as B − A.
4. Finds Unique Tuples: The resulting relation contains only those unique tuples to the first
relation. It effectively filters out any tuples that are also present in the second relation.
5. Compatibility Requirement: Both relations involved in the SET DIFFERENCE operation must
be union-compatible, meaning they must have the same structure (same number of attributes
and corresponding data types).

Example 3
R, the result of P-Q, gives employees working only on package J1. Employees working on both
packages J1 and J2 have been removed.

R:

Id Name
101 Jones
107 Evan
112 Smith
Fig7. P-Q

The difference operation removes common tuples from the first relation.

R = P-Q such that, R = { t | t  P  t / Q

3.5. Cartesian Product


The CARTESIAN PRODUCT (the cross-product) is a fundamental operation in relational algebra that
combines tuples from two relations in all possible ways. The result is a new relation where each tuple
from the first is paired with every tuple from the second. This operation is often an intermediate step
in more complex queries, particularly when performing join operations.

It is also known as CROSS JOIN—which is denoted by X.

Unit: 7 – Relational Algebra 17


DCA1208: Database Management System

Key Characteristics of the Cartesian Product Operation:


1. Binary Operation: The CARTESIAN PRODUCT requires two relations as input. It is a binary
operation that produces a new relation.
2. Pairwise Combination: The operation combines each tuple from the first relation with every
tuple from the second relation, resulting in all possible combinations of tuples.
3. Resulting Relation: The resulting relation has attributes from both input relations. If the first
relation has m attributes and n tuples, and the second relation has p attributes and q tuples, the
resulting relation will have m + p attributes and n × q tuples.
4. Large Output: Because it generates all possible combinations, the CARTESIAN PRODUCT can
produce a large relation, especially if the input relations have many tuples. If not handled
properly, this can lead to inefficiencies.
5. Foundation for Joins: Although the CARTESIAN PRODUCT itself may not yield meaningful
results; it is often used as the basis for more meaningful operations like joins. By applying a
SELECT operation after the CARTESIAN PRODUCT, you can filter the combined tuples to match
specific criteria, effectively performing a join between the relations.

The extended Cartesian, or simply the Cartesian product of two relations, is the concatenation of
tuples belonging to the two relations. A new resultant relation scheme is created consisting of all
possible combinations of the tuples.

R=PxQ

where a tuple r R is given by { t1|| t2 | t1  P  t2  Q }, i.e. the result relation is obtained by


concatenating each tuple in relation P with each tuple in relation Q. Here, we represent the
concatenation operation.

The scheme of the result relation is given by R = P || Q

The degree of the result relation is given by: |R| = |P| + |Q|

The cardinality of the result relation is given by: |R| = |P|*|Q|

Example 5
The Cartesian product of the PERSONNEL relation and SOFTWARE_PACKAGE relations of Figure 8(a)
is shown in Figure 8(b). Note that the relations P and Q from Figure 2 of Example 1 are a subset of the
PERSONNEL relation.

Unit: 7 – Relational Algebra 18


DCA1208: Database Management System

PERSONNEL:

Id Name
101 Jones
103 Smith
104 Lalonde
107 Evan
110 Drew
112 Smith
Fig 8. Software Packages:

S
J1
J2
Fig.8(a)

PERSONNEL:

Id P. Name S
101 Jones J1
101 Jones J2
103 Smith J1
103 Smith J2
104 Lalonde J1
104 Lalonde J2
106 Byron J1
106 Byron J2
107 Evan J1
107 Evan J2
110 Drew J1
110 Drew J2
112 Smith J1

Unit: 7 – Relational Algebra 19


DCA1208: Database Management System

112 Smith J2
Fig.8(b)

Fig 8: (a) PERSONNEL (Emp#, Name) and SOFTWARE_PACKAGE(S) represent employees and
software packages, respectively; (b) the Cartesianproduct of PERSONNEL and
SOFTWARE_PACKAGES.

The union and intersection operations are associative and commutative;therefore, given relations
R, S, T:

R  (S  T) = (R  S)  T = (S  R)  T = T  (S  R)=...R (S  T) =(R  S)  T=...


The difference between operations, in general, is that they are non-commutative and non-associative.

R-S  S-R non-commutative


R-(S-T)  (R-S)-T non-associative

Unit: 7 – Relational Algebra 20


DCA1208: Database Management System

4. DERIVED OPERATIONS IN RELATIONAL ALGEBRA


Derived Operations, also known as extended operations, can be derived from basic operations and
are hence named Derived Operations.

Derived operations enhance the expressiveness and efficiency of relational queries by allowing more
intuitive and straightforward formulations of complex data retrieval tasks. They are built on the
fundamental operations but provide additional functionality and flexibility, making it easier to handle
various database operations without repeatedly combining the basic operations.

Three derived operations are:

Derived
Operations

Division Intersection Join

Fig.9 Types of Derived Operations

Division operations, Intersection operations, and Join Operations.

4.1. Division()
The DIVISION operation is one of the derived operations in relational algebra. It retrieves tuples from
one relation that are associated with all tuples in another relation. It is particularly useful when
dealing with queries that require identifying entities related to a comprehensive set of values, such
as finding all students who have taken all the courses offered in a department.

The DIVISION operation, denoted by ÷, is helpful for a special query that sometimes occurs in
database applications.

Unit: 7 – Relational Algebra 21


DCA1208: Database Management System

Characteristics of the DIVISION Operation:


1. Binary Operation: DIVISION requires two relations as input. The first relation (often called the
dividend) should contain all the attributes of the second relation (the divisor) plus possibly
some others.
2. Input Relations: The second relation (divisor) usually contains a single attribute or a set of
attributes that the operation is based upon. In contrast, the first relation (dividend) contains this
set of attributes and one or more additional attributes.
3. Resulting Relation: The result of the DIVISION operation is a relation that contains the
attributes of the dividend that are not in the divisor. The tuples in this resulting relation
represent those entities from the dividend associated with every tuple in the divisor.
4. Use Case: DIVISION is often used when we need to query for tuples with a certain relationship
with all elements of another set. For example, it is important to find all suppliers who supply all
parts listed in a particular catalogue.

Let us consider an example 8:


Given the relation P and Q, as shown in Figure 10(a), the result of dividing P by Q is the relation R,
and it has two tuples. For each tuple in R, its product with the tuples of Q must be in P. In our example
(a1,b1) and (a2,b2) must both be tuples in P; the same is true for (a5,b1) and (a5,b2).

Simply stated, the cartesian product of Q and R is a subset of P. In Figure 10(b), the result relation R
has four tuples; the cartesian product of R and Q gives a resulting relation, again a subset of P. In
Figure 10(c), since there are no tuples in P with a value b3 for the attribute B (i.e., selecting B=b3(P) =
0), we have an empty relation R, which has a cardinality of zero.
P: Q: R (result): Q: then R is:

Unit: 7 – Relational Algebra 22


DCA1208: Database Management System

Q: then R is:

(a) then R is: (b)


Q:

Fig 10: Examples of the division operation. (a) R = P / Q; (b) R = P / Q(P is same in part i); (c)
R=P/Q (P is same as in part i); (d) R= P/Q (P is same as in part i)

In Figure 10(d), the relation Q is empty. The result relation can be definedas the projection of P on
the attributes in P-Q. However, it is usual to disallow division by an empty relation.

Finally, if relation P is an empty relation, then relation R is also an empty relation. Let us treat the Q
as representing one set of properties (the properties are defined on the Q, each tuple in Q
representing an instance of these properties) and the relation r as representing entities with these
properties (entities are defined on P-Q, and the properties are, as before, defined on Q); note that P
Q must be equal to P. Each tuple in P represents an object with some given property.

The resultant relation R is the set of entities that possesses all the properties specified in Q. The two
entities a1 and a5 possess all the properties, i.e., b1 and b2. The other entities in P, a2, a3, and a4, only
possess one, not both, of the properties. The divisionoperation is useful when a query involves the
phrase "for all objects having all the specified properties." Note that both P-Q and Q generally
representa set of attributes. It should be clear that Q is not a subset of P.

4.2. Intersection (∩)


The INTERSECTION operation in relational algebra is a derived operation used to find common
tuples between two relations. It returns a new relation containing only the tuples present in both
input relations. INTERSECTION is useful when you must identify overlapping data between two
datasets, such as finding records that satisfy multiple criteria simultaneously.

Unit: 7 – Relational Algebra 23


DCA1208: Database Management System

Key Characteristics of the INTERSECTION Operation:


1. Binary Operation: INTERSECTION requires two relations as input. Both relations must be
compatible, meaning they must have the same number of attributes, and the corresponding
attributes must have the same data types.
2. Set-Based: Like other set operations, INTERSECTION operates on sets of tuples. The result is a
set of tuples common to both input relations, with duplicates automatically removed.
3. Compatibility Requirement: Both relations involved in the INTERSECTION operation must be
union-compatible. This means they must have identical structures for the operation to be
performed.
4. Derived from Basic Operations: Although INTERSECTION is not a basic operation in relational
algebra, it can be derived using the SET DIFFERENCE operation. Specifically, the INTERSECTION
of two relations, A and B, can be defined as A ∩ B = A − (A − B).

The intersection operation selects the common tuples from the two relations.

R = P  Q where R = {t |  t P  t  Q}
Example 4
The resultant relation P Q is the set of all employees working on both packages. (figure 8.2(c) of
example 2).

The intersection operation is unnecessary. It can be very simplyexpressed as:

P  Q= P-(P-Q)
It is, however, more convenient to write an expression with a single intersection operation than one
involving a pair of difference operations.

Note that in these examples, the operand and the result relation schemes, including the attribute
names, are identical, i.e. P=Q=R. If the attribute names of compatible relations are not identical, the
naming of the attributes of the result relation will have to be resolved.

4.3. JOIN (⨝)


The JOIN operation is one of relational algebra's most important derived operations. It combines
related tuples from two different relations based on a common attribute, effectively merging them
into a single relation. The JOIN operation is fundamental in relational databases because it allows the
integration of data spread across multiple tables, enabling more complex and meaningful queries.

Unit: 7 – Relational Algebra 24


DCA1208: Database Management System

Key Characteristics of the JOIN Operation:


1. Combines Relations: JOIN takes two relations as input and produces a single relation by
combining tuples that satisfy a specific condition, often involving matching values in
corresponding attributes.
2. Condition-Based: The most common form of JOIN is the equi-join, where the join condition is
based on the equality of attribute values in the two relations. Other types of joins include natural
join (an equijoin that automatically eliminates duplicate attributes) and theta joins (where the
condition can be any comparison operator, not just equality).
3. Common Attributes: JOIN typically operates on relations with one or more common attributes.
These attributes match tuples from the different relations, effectively "joining" related data into
a single tuple in the result.
4. Resulting Relation: The resulting relation contains attributes from both input relations.
Depending on the type of join, it may include all or only some of the tuples from the input
relations.

As the name suggests, the join operator allows the combining of two relations to form a new one. The
tuples from the operand relations that participate in the operation and contribute to the result are
related. The join operation allows the processing of relationships existing between the operand
relations.

Example 6
In Figure 11, we encounter the following relations:
ASSIGNMENT (Emp#, Prod#, Job#)

JOB_FUNCTION (Job#, Title)

EMPLOYEE PRODUCT:

Fig.11a)

Unit: 7 – Relational Algebra 25


DCA1208: Database Management System

JOB-FUNCTION ASSIGNMENT

Fig.11b)

Fig 11: (a) Relation schemes for employee roles in development teams, b) Sample relation

Suppose we want to respond to the query “Get product number of assignments whose development
teams have a chief programmer”. This requires first computing the Cartesian product of the
ASSIGNMENT and JOB_FUNCTION relations. Let us name this product relation TEMP. This is followed
by selecting those tuples of TEMP where the attribute Title has the value chief programmer and the
value of the attribute Job# in ASSIGNMENT and JOB_FUNCTION are the same. The required result,
shown below, is obtained by projecting these tuples to the attribute Prod#. The operations are
specified below.

TEMP = (ASSIGNMENT X JOB_FUNCTION)


Prod#( Title = 'chief programmer'  ASSIGNMENT.Job# (TEMP))
Prod#
HEAP1
BINS9
In another method of responding to this query, we can first select those tuples from the
JOB_FUNCTION relation so that the value of the attribute Title is a chief programmer. Let us call this
set of tuples the relation TEMP1. We then compute the Cartesian product of TEMP1 and ASSIGNMENT,
calling the product TEMP2. This is followed by a projection on Prod# over TEMP2 to give us the
required response. These operations are specified below:

TEMP1 = ( Title = 'chief programmer' (JOB_FUNCTION))


TEMP2 = ( ASSIGNMENT.Job# = JOB_FUNCTION.Job# (ASSIGNMENTX TEMP1))
Prod# (TEMP2) gives the required result.

Unit: 7 – Relational Algebra 26


DCA1208: Database Management System

Notice that in the selection operation that follows the Cartesian product, we take only those tuples
where the value of the attributes is ASSIGNMENT.Job#and JOB_FUNCTION.Job# are the same. These
combined operations of the cartesian product followed by selection are the join operations. Note that
we have qualified the identically named attributes by the name of the corresponding relation to
distinguish them.

If a relation is joined with itself, we would need to rename either the attributes of one of the copies of
the relation or the relation’s name itself. We illustrate this in example 7.

The join condition may have more than one term, necessitating using the subscript in the comparison
operator. Now, we shall define thedifferent types of join operations.

In these discussions, we use P, Q, R, and so on to represent both the relation scheme and the collection
or bag of underlying domains of the attributes. We call it a bag of domains because multiple attributes
may be defined on the same domain.

Typically, P С Q may be null, guaranteeing the uniqueness of attribute names in the result relation.
When the same attribute name occurs in the two schemes, we use qualified names.

Two common and useful variants of the join are the equi-join and the natural join. In the equi-join,
the comparison operator theta (i=1,2,...,n) is always the equality operator (=). Similarly, the
comparison operator is always the equality operator in the natural join. However, only one of the two
sets of domain-compatible attributes involved in the natural join is Ai from P and Bi from Q, for i =
1,...n, the natural join predicate is a conjunction of terms of the following form:

( t1[Ai] = t2[Bi] ) for i = 1,2...n

Domain compatibility requires that the domains of Ai and Bi be compatible, and for this reason,
relation schemes P and Q have attributes defined on common domains, i.e., P С Q¹. Therefore, join
attributes have common domains in the relation schemes P and Q. Consequently, only one set of the
join attributes on these domains must be preserved in theresult relation. This is achieved by taking
a projection after the join operation, eliminating duplicate attributes. If the relations P and Q have
attributes with the same domain but different attribute names, then renaming or projection may be
specified.

Unit: 7 – Relational Algebra 27


DCA1208: Database Management System

Example 7
Given the EMPLOYEE and SALARY relations of Figure 12(i), if we must find the salary of employees
by name, we join the tuples in the relation EMPLOYEE with those in SALARY such that the value of
the attribute ID in EMPLOYEE is the same as that in SALARY. The natural join takes the predicate
expression to be EMPLOYEE.Id = SALARY.Id. The result of the natural join is shown in Figure 12(ii).
We do not need to specify this predicate when using the natural join. The expression to specify the
operation of finding employees' salary by name is as follows. Here, we project the result of the natural
join operation on the attributes Name and Salary:

(Name. Salary)(EMPLOYEE |x| SALARY)

EMPLOYEE |X| SALARY

Id Name Salary
101 Jones 67
103 Smith 55
104 Lalonde 75
107 Evan 80

ASSIGNMENT.Emp# COASSIGN.Emp#
107 107
107 101
107 103
101 107
101 101
101 103

Unit: 7 – Relational Algebra 28


DCA1208: Database Management System

110 101
103 107
103 101
103 103
101 110
Fig 12: i) The natural join of EMPLOYEE and SALARY relations; ii) The joint of ASSIGNMENT
with the renamed copy

In relational algebra, joins are crucial operations that combine tuples from two relations based on
a specified condition. Here’s a rundown of the main types of joins:
1. Natural Join (⨝): This join pairs tuples from two relations that have the same value in all
attributes with common names. It automatically eliminates duplicate columns.
2. Theta Join (⨝_θ): This join combines tuples from two relations based on a specific condition
(θ) that involves attributes of the relations. It’s more flexible than a natural join because it
allows for conditions other than equality.
3. Equi Join: A special case of the theta join where the condition is strictly equality between
attributes. It's often used interchangeably with natural join when the context is clear.
4. Semi Join (⋉ or ⋊): This join returns tuples from the first relation for which there is at least
one matching tuple in the second relation, but the result includes only the attributes of the first
relation. It’s useful for queries where you only need to verify the existence of related tuples in
another relation without returning their values.

Unit: 7 – Relational Algebra 29


DCA1208: Database Management System

5. SUMMARY
Relational Algebra is a fundamental concept in database theory, providing the mathematical
framework for manipulating and querying data stored in relational databases. It consists of a set of
basic and derived operations that can be combined to perform complex queries. The basic operations
in Relational Algebra include Select, Project, Union, Set Difference, and Cartesian Product. These
operations are crucial for filtering data, selecting specific columns, combining datasets, and
generating all possible combinations of tuples from two relations.

The Select operation filters rows from a relation based on a given predicate, retrieving only those
tuples that satisfy the condition. This operation is crucial for narrowing down large datasets to more
relevant subsets of data. The Project operation, on the other hand, selects specific columns from a
relation, effectively reducing the number of attributes in the result set. Combining the Select and
Project operations allows users to create powerful queries that retrieve only the necessary data in a
specific format.

The Union operation in Relational Algebra combines the tuples of two relations, eliminating
duplicates and generating a relation that contains all tuples from both input relations. The Set
Difference operation finds tuples present in one relation but not in the other, making it useful for
identifying distinct elements. The Cartesian Product operation generates all possible pairs of tuples
from two relations, often used as an intermediate step in more complex queries, such as Joins.

In addition to the basic operations, Relational Algebra includes several derived operations that
provide more advanced data manipulation capabilities. The division operation finds tuples in one
relation that are associated with all tuples in another. The Intersection operation returns only the
tuples common to both input relations.

Finally, the Join operation is a powerful tool for combining related tuples from different relations
based on a specified condition, enabling complex queries across multiple tables. Together, these
operations form the backbone of query processing in relational databases, allowing users to extract
meaningful information from structured data.

Unit: 7 – Relational Algebra 30


DCA1208: Database Management System

6. GLOSSARY

A procedural query language that provides a set of operations to


Relational
- manipulate and retrieve data from relational databases using
Algebra mathematical principles.

An operation that returns all possible pairs of tuples from two relations,

Cartesian Product - forming a relation that combines each tuple from the first relation with
every tuple from the second.

Boolean A logical statement that can be evaluated as either true or false, often used
- in filtering conditions in queries and algorithms.
Expression

Degree of the The number of attributes (columns) in a relation, representing the


- structure or arity of the table.
Relation

Arity of the Another term for the degree of a relation, indicating the number of
- attributes or columns in a relational table.
Relation

A type of join operation that combines tuples from two relations based on
Equi-Join - an equality condition between specified attributes.

Unit: 7 – Relational Algebra 31


DCA1208: Database Management System

7. SELF-ASSESSMENT QUESTIONS

SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 In Relational Algebra, the Basic operations are the traditional ________________ orations.
2 The Cartesian product can be defined on any __________________relations.
3 If we assume that P and Q are two union-compatible relations, then the union of P and Q is set-
theoretic __________________of P and Q.
4 The difference operation removes _________________tuples from the first relation.
5 The operation selects the common tuples from the two relations.
6 The Cartesian product of two relations is the ___________________of tuples belonging to the two
relations.
7 The union and intersection operations are associative and __________________.
8 The _________________operation, in general, is non-commutative and non- associative.
9 The projection of a relation yields a ______________________subset of the relation.
10 The projection operation may _________the cardinality of the result relation since duplicate
tuples are removed.
11 The selection operation yields a _____________________subset of a given relation.
12 The ______________operator allows the combining of two relations to form a single new relation.
13 The join operation allows the processing of ____________________existing between the operand
relations.
14 Two common and useful variants of the join are the equi-join and the _______________.
15 In the _______________ the comparison operator theta (i=1,2,...,n) is always the equality operator
(=).
16 In the natural join, the _______________ operator is always the equality operator.

Unit: 7 – Relational Algebra 32


DCA1208: Database Management System

8. TERMINAL QUESTIONS
1. Classify the types of operations in relational Algebra.
2. Explain the select operation.
3. Identify the characteristics of select operations in relational algebra.
4. List the Key Characteristics of the PROJECT Operation.
5. Explain the UNION operation with an example.
6. Define the SET DIFFERENCE operation.
7. What are the Key Characteristics of the SET DIFFERENCE Operation?
8. Discuss the CARTESIAN PRODUCT with an example.
9. Classify the derived operations in relational algebra.
10. Compare and contrast the Intersection vs. Join operations in relational algebra.
11. Given two relations R1 and R2, where R1 contains N1 tuples, R2 contains N2 tuples, and N2 > N1
> 0, give the minimum and maximum possible sizes (in tuples) for the result relation produced
by each of the following relational algebra expressions. In each case, state any assumptions
about the schemas for R1 and R2 that are needed to makethe expression meaningful:

R1 υ R2
R1 ∩ R2
R1− R2
R1 x R2
 a=5 (R1)
a (R1)
R1/R2
12. Consider the following schema:
Suppliers(sid: integer, sname: string, address: string)
Parts(pid: integer, pname: string, color: string)
Catalog(sid: integer, pid: integer, cost: real)

The key fields are underlined, and the domain of each field is listed after the field name. Thus, sid is
the key for Suppliers, pid is the key for Parts, and sidand pid together form the key for the Catalog. The
Catalog relation lists the prices charged for parts by Suppliers. Write the following queries in
relational algebra.

Unit: 7 – Relational Algebra 33


DCA1208: Database Management System

1. Find the names of suppliers who supply some red parts.


2. Find the sids of suppliers who supply some red or green parts.
3. Find the sids of suppliers who supply some red parts or are at 221Packer Ave.
4. Find the sids of suppliers who supply some red parts and some green parts.
5. Find the sids of suppliers who supply every part.
6. Find the sids of suppliers who supply every red part.
7. Find the sids of suppliers who supply every red or green part.
8. Find the sids of suppliers who supply every red part or supply everygreen part.

Unit: 7 – Relational Algebra 34


DCA1208: Database Management System

9. ANSWERS

9.1. Self-Assessment Answers


1. Set
2. Two
3. Union
4. Common
5. Intersection
6. Concatenation
7. Commutative Difference
8. Join
9. Vertical
10. Reduce
11. Horizontal
12. Join
13. Relationships
14. natural join
15. Equi-join
16. comparison

9.2. Terminal Answers


Answer 1: Operations in Relational Algebra are classified into two main types: Basic Operations and
Derived Operations.
Basic Operations: These include Select, Project, Union, Set Difference, and Cartesian Product. These
operations are fundamental and serve as the building blocks for query processing.

Derived Operations: These include operations like Division, Intersection, and Join. Derived
operations are combinations or extensions of basic operations used to perform more complex queries.
Refer to section 3 for details.

Answer 2: The Select operation in Relational Algebra is used to filter rows from a relation based on
a specific condition or predicate. The operation is denoted by the Greek letter sigma (σ) followed by

Unit: 7 – Relational Algebra 35


DCA1208: Database Management System

the predicate in parentheses. It retrieves all tuples from a relation that satisfies the given condition,
effectively narrowing down the dataset to only the relevant data.
Example: Suppose we have a relationship between students with attributes such as name, age, and
grade. If we want to select students who are older than 18, we will use the Select operation as follows:
σAge>18(Students)\sigma_{Age > 18}(\text{Students})σAge>18(Students) This operation would
return all tuples (rows) from the Students relation where the Age attribute is greater than 18. Refer
to section 3.1 for details.

Answer 3: The Select operation has several key characteristics:


• Unary Operation: It operates on a single relation.
• Condition-Based: It filters tuples based on a specified condition or predicate.
• Non-Destructive: The original relation is not modified; a new relation is produced.
• Subset Creation: It returns a subset of the original relation's tuples.
• Order Independence: The order of tuples in the result is not guaranteed to be the same as in
the original relation. Refer to section 3.1 for details.

Answer 4: The Project operation has the following key characteristics:


• Unary Operation: Operates on a single relation.
• Column Selection: Select specific columns (attributes) from the relation.
• Eliminates Duplicates: The result is a set, so duplicate tuples are removed.
• Reduces Arity: It reduces the number of attributes in the resulting relation.
• Non-Destructive: Does not alter the original relation but creates a new one with the selected
attributes. Refer to section 3.2 for details.

Answer 5: The Union operation in Relational Algebra combines the tuples of two relations into a
single relation, eliminating any duplicate tuples. The two relations must be union-compatible,
meaning they must have the same number of attributes, and the corresponding attributes must have
the same domain.
Example: Suppose we have two relations, A and B, with the same attributes:

A: { (1, 'John'), (2, 'Alice') }

B: { (2, 'Alice'), (3, 'Bob') }

The Union operation would be: A∪BA \cup BA∪B Result: { (1, 'John'), (2, 'Alice'), (3, 'Bob') } Refer to
section 3.3 for details.

Unit: 7 – Relational Algebra 36


DCA1208: Database Management System

Answer 6: The Set Difference operation in Relational Algebra returns the tuples that are present in
one relation but not in another. The operation is denoted by the minus sign (-). Like the Union
operation, the two relations involved must be union-compatible.

Example: If we have two relations, A and B:

A: { (1, 'John'), (2, 'Alice') }

B: { (2, 'Alice'), (3, 'Bob') }

The Set Difference operation A−BA - BA−B would result in Result: { (1, 'John') } Refer to section 3.4
for details.

Answer 7: The Set Difference operation has the following key characteristics:
• Binary Operation: Requires two relations as input.
• Union Compatibility: Both input relations must be union-compatible.
• Result Subtraction: Returns tuples from the first relation that are not present in the second
relation.
• Non-Destructive: Does not modify the original relations but produces a new relation.
• Asymmetric: The operation A−BA - BA−B is not the same as B−AB - AB−A. Refer to section 3.4
for details.

Answer 8: The Cartesian Product operation, denoted by a cross (×), combines two relations by
pairing every tuple from the first relation with every tuple from the second relation. The result is a
relation that contains all possible combinations of tuples from the two input relations.
Example: Suppose we have two relations, R and S:

R: { (1, 'John'), (2, 'Alice') }

S: { ('X', 'New York'), ('Y', 'Los Angeles') }

The Cartesian Product R×SR \times SR×S would result in: Result: { (1, 'John', 'X', 'New York'), (1, 'John',
'Y', 'Los Angeles'), (2, 'Alice', 'X', 'New York'), (2, 'Alice', 'Y', 'Los Angeles') } Refer to section 3.5 for
detail.

Answer 9: Derived operations in Relational Algebra include:


• Division: Used to find tuples in one relation that are associated with all tuples in another
relation.

Unit: 7 – Relational Algebra 37


DCA1208: Database Management System

• Intersection: Returns only the tuples that are common to both input relations.
• Join: Combines related tuples from different relations based on a specified condition. Refer to
section 4 for details.

Answer 10: Intersection and Join operations serve different purposes:


Intersection: It is a set-based operation that returns the common tuples between two relations. The
relations must be union-compatible, and the result is simply the common tuples.

Join: This operation combines tuples from two relations based on a specific condition. It is more
versatile than Intersection, as it can link tuples from different relations using a key or a condition.
There are various types of Joins, such as Inner Join, Outer Join, and Natural Join, each serving different
needs in query processing. Refer to section 4.2 for details.

Answer 11: Refer to section 2 for details.

Answer 12: Refer to sections 2 and 3 for detail.

Unit: 7 – Relational Algebra 38


DCA1208: Database Management System

10. REFERENCES
1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
2. Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,
McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631
3. Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,
McGraw-Hill Education, 2019, ISBN-10: 0073523321

Unit: 7 – Relational Algebra 39


DCA1208 : Database Management System
BACHELOR OF COMPUTER APPLICATION
SEMESTER 2

DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 8 - Introduction to SQL 1
DCA1208 : Database Management System

Unit - 8
Introduction to SQL

DCA324
KNOWLEDGE MANAGEMENT
Unit: 8 - Introduction to SQL 2
DCA1208 : Database Management System

TABLE OF CONTENTS

SL Fig No / Table SAQ /


Topic Page No
No / Graph Activity

1 Introduction - -
4-5
1.1 Objectives - -

2 SQL Overview - -

2.1 Definition - -
6-8
2.2 Process 1 -

2.3 Features - -

3 SQL Syntax - - 9-10

4 Categories of SQL Commands 2 - 11-13

5 Data Types - - 14-15

6 Basic SQL Commands - -

6.1 DDL Commands - -

6.2 DML Commands - - 16-23


6.3 DCL Commands - -

6.4 TCL Commands - -

7 Summary - - 24-25

8 Glossary - - 26

9 Self-Assessment Questions - - 27-30

10 Terminal Questions - - 31

11 Answers - - 32-35

12 References - - 36

Unit: 8 - Introduction to SQL 3


DCA1208 : Database Management System

1. INTRODUCTION
In the previous unit, we studied the Basic Operations of Relational Algebra, which forms the
foundation for manipulating relational database data. We saw how the Select operation works for
filtering rows based on conditions, the Project operation for retrieving specific columns, the Union
operation for combining results from two relations, the Set Difference operation for identifying rows
present in one relation but not the other, and the Cartesian Product for pairing rows from two
relations.

Additionally, we looked into the Derived Operations of Relational Algebra, including Division for
queries involving "all" conditions, Intersection for finding standard rows between relations, and Join
for combining related tuples from two relations based on a condition, thereby providing a robust
toolkit for database query formulation.

In the current unit, we will begin by exploring an SQL Overview and its definition as the standard
language for managing and interacting with relational databases. We'll understand the SQL process,
which involves writing, executing, and retrieving results from queries on database servers.
Additionally, we'll examine the key features of SQL, such as its simplicity, flexibility, and compatibility
across different database systems. Moving on to SQL Syntax, we'll study the rules and structure of
writing SQL queries, including using keywords, clauses, operators, and statements to perform various
database operations effectively.

We will also classify and explore the Categories of SQL Commands, which include Data Definition
Language (DDL) for defining database structures, Data Manipulation Language (DML) for modifying
data, Data Control Language (DCL) for managing permissions, and Transaction Control Language
(TCL) for handling transactions. Furthermore, we will study Data Types to understand how SQL
defines the kind of data stored in database tables, such as integers, strings, and dates. Finally, we will
analyse the Basic SQL Commands within these categories, examining their syntax and practical usage.
Topics include DDL commands like CREATE and ALTER, DML commands like INSERT and UPDATE,
DCL commands like GRANT and REVOKE, and TCL commands like COMMIT and ROLLBACK,
providing a comprehensive understanding of database management.

Unit: 8 - Introduction to SQL 4


DCA1208 : Database Management System

1.1 Objectives
After studying this unit, you should be able to:

• Define SQL
• Describe the SQL Process
• List the Features of SQL
• Explain the SQL syntaxes
• Classify the Categories of SQL Commands

Unit: 8 - Introduction to SQL 5


DCA1208 : Database Management System

2. SQL (STRUCTURED QUERY LANGUAGE)

2.1 Definition
SQL (Structured Query Language) is a standardised programming language for managing and
manipulating relational databases. In a Database Management System (DBMS), SQL serves as the
primary language for Data Retrieval, Manipulation, Definition, and Control.

SQL is widely used in DBMSs due to its high level of abstraction, making it easy to use and understand.
It allows users to perform complex queries, manage data relationships, and enforce data integrity
with a simple syntax.

2.2 SQL Process


The figure illustrates the various stages and components of processing an SQL query, from input to
accessing the data in the physical database. Each element has a specific role to ensure the SQL query
is efficiently and accurately executed.

Figure 1: SQL Process

SQL Query: The process starts with an SQL query written by the user, which could be for data
retrieval (e.g., SELECT), updating, or modifying the database.

Query Language Processor: The SQL query is sent to the Query Language Processor. This
component translates the high-level SQL query into an intermediate representation that the DBMS
can understand.

Unit: 8 - Introduction to SQL 6


DCA1208 : Database Management System

Parser + Optimiser: Within the Query Language Processor, the Parser checks the SQL query syntax
to ensure it is correct. After parsing, the Optimiser takes the query and determines the most efficient
way to execute it, considering factors like available indexes, data distribution, and join methods.

DBMS Engine: The DBMS Engine is the core component executing the query. It coordinates with the
File Manager and Transaction Manager to ensure that the query is executed efficiently and that any
changes are appropriately managed.

File Manager + Transaction Manager: These components work alongside the DBMS Engine:

• File Manager handles the actual storage of data on disk and manages data retrieval.
• Transaction Manager ensures data integrity, handling concurrency, and maintaining
transactions' ACID (Atomicity, Consistency, Isolation, Durability) properties.

Physical Database: This is where the data is physically stored. With the help of the File and
Transaction Managers, the DBMS Engine interacts with the Physical Database to retrieve, update, or
delete data as per the SQL query.

2.3 Features of SQL


• Simplicity: SQL's syntax is simple and easy to understand, making it accessible to users with
minimal technical knowledge.
• Data Retrieval: SQL allows efficient data querying with the SELECT statement, supporting
filters, sorting, grouping, and more for customised results.
• Data Manipulation: SQL includes commands like INSERT, UPDATE, and DELETE to manage
and modify data within the database.
• Data Definition: With DDL commands like CREATE, ALTER, and DROP, SQL allows the creation
and modification of database structures.
• Data Control: SQL offers data security features with DCL commands like GRANT and REVOKE
to manage user permissions and access control.
• Transaction Control: SQL supports transactional operations using commands like COMMIT,
ROLLBACK, and SAVEPOINT to ensure data consistency.
• Functions and Expressions: SQL includes built-in functions (like aggregate and string
functions) to simplify data manipulation and analysis.
• Portability: SQL can be used across different platforms and DBMS software, enhancing its
adaptability in various environments.

Unit: 8 - Introduction to SQL 7


DCA1208 : Database Management System

• Data Integrity: SQL supports integrity constraints (PRIMARY KEY, FOREIGN KEY, and UNIQUE)
to maintain data accuracy and consistency.

Unit: 8 - Introduction to SQL 8


DCA1208 : Database Management System

3. SQL SYNTAX

SQL (Structured Query Language) syntax is the rules and guidelines for writing commands that
interact with relational databases. SQL syntax follows a specific format to define and manipulate data
and the database structure.

Here's a breakdown of crucial SQL syntax elements:

1. Keywords: SQL commands comprise keywords and reserved words in SQL. Examples include
SELECT, FROM, WHERE, INSERT, UPDATE, DELETE, CREATE, and DROP. These keywords are
case-insensitive, meaning SELECT is the same as select.
Example:

2. Clauses: SQL statements are often divided into clauses, such as SELECT, FROM, WHERE, GROUP
BY, ORDER BY, and HAVING. Each clause has a specific role in a query.
Example:

3. Operators: SQL operators include comparison operators (e.g., =, >, <, >=, <=, !=), logical
operators (e.g., AND, OR, NOT), and special operators (e.g., IN, BETWEEN, LIKE). They are used
in WHERE clauses to filter results.
Example:

4. Statements: SQL statements are complete instructions that operate on the database. They end
with a semicolon (;). Common statements include SELECT (retrieves data), INSERT (adds data),
UPDATE (modifies data), DELETE (removes data), and DDL statements like CREATE TABLE or
DROP TABLE.
Example:

Unit: 8 - Introduction to SQL 9


DCA1208 : Database Management System

5. Functions: SQL includes various functions for performing calculations on data, such as
aggregate functions (e.g., SUM(), AVG(), MAX(), MIN(), COUNT()) and string functions (e.g.,
UPPER(), LOWER(), CONCAT()).

• Comments: Comments can be added to SQL code to increase readability. They are ignored
during execution. In SQL, comments can be single-line (--) or multi-line (/* ... */).

Unit: 8 - Introduction to SQL 10


DCA1208 : Database Management System

4. CATEGORIES OF SQL COMMANDS

SQL commands are organised into several categories based on their specific functions in managing
and manipulating data within a database.

These categories include:

Categories of SQL
Commands

Data Definition Data Manipulation Data Control Transaction Control


Language (DDL) Language (DML) Language (DCL) Language (TCL)

Figure 2: Categories of SQL Commands

a. Data Definition Language (DDL) is used to define and modify the structure of a database. It
includes commands such as CREATE to create new database objects like tables, views, or
indexes; ALTER to change the structure of existing objects; and DROP to remove objects from
the database. These commands work on the schema level, meaning they affect the organisation
and metadata of the database. DDL commands are auto-committed, meaning their effects are
permanent and cannot be rolled back. This makes them critical for setting up and managing the
backbone of the database system.
b. Data Manipulation Language (DML) manages and manipulates the data stored within
database tables. Key commands include INSERT to add new records, UPDATE to modify existing
data and DELETE to remove records. DML commands work on the data level and directly impact
the rows in a table without changing the table's structure. Unlike DDL, DML changes are not
auto-committed, allowing users to roll back changes and ensuring flexibility and control over
data handling.
c. Data Control Language (DCL) focuses on managing access to the database and ensuring data
security. The primary commands are GRANT, which provides specific privileges to users, and
REVOKE, which removes those privileges. DCL is essential for controlling who can perform
Unit: 8 - Introduction to SQL 11
DCA1208 : Database Management System

actions like querying, inserting, or modifying data. These commands are vital in implementing
role-based access control, ensuring only authorised users can interact with sensitive data and
maintaining database integrity and security.
d. Transaction Control Language (TCL) ensures the consistency and integrity of data by
managing database transactions. The COMMIT command saves all the changes made during a
transaction, making them permanent. ROLLBACK undoes the changes in the current transaction,
reverting the database to its previous state. SAVEPOINT allows the creation of checkpoints
within a transaction, enabling partial rollbacks to specific points. TCL commands are crucial for
maintaining data reliability, particularly in environments where multiple transactions occur
simultaneously.

Each category uniquely contributes to a well-rounded approach to managing databases and ensuring
data security, integrity, and accessibility within SQL-based systems.

• The Data Definition Statements

To construct and administer the database, there are two significant DDL statements – CREATE and
DROP, which form the backbone of many commands:

CREATE DATABASE to create a database DROP DATABASE to remove a database CREATE TABLE to
create a table DROP TABLE to drop a table CREATE INDEX to create an index on a column DROP
INDEX to drop an index CREATE VIEW to create a view DROP VIEW to drop a view.

Some additional ones, such as ALTER TABLE or MODIFY DATABASE, may be vendor-specific.

• The Data Manipulation Statements

To manipulate data in tables directly or through views, we use the four standard DML statements:

o SELECT
o DELETE
o INSERT
o UPDATE

These statements and their functionality are now universally accepted, although the degree to which
these commands support this functionality varies somewhat between products. Compare the
functionality of different implementations of UPDATE, for example.

Unit: 8 - Introduction to SQL 12


DCA1208 : Database Management System

• Data Control

This deals with three issues:

a) Recovery and Concurrency

Concurrency is concerned with how multiple users operate upon the database.

Each user can reflect a transaction's updates using COMMIT or cancel all of them using ROLLBACK.

b) Security

Security has two aspects. The first is the VIEW mechanism. A view of a relation can be created, which
hides sensitive information and defines only the part of the relation that should be visible. A user can
then access this view.

CREATE VIEW LOCAL AS

SELECT * FROM SUPPLIER

WHERE SUPPLIER.CITY = 'Delhi'

The above view only reveals Delhi's suppliers.

The second is by using the GRANT operation. This grants one or moreaccess rights to perform the
data manipulative operations on the relations.

c) Integrity Constraints

While DCL does not define integrity constraints, it plays a crucial role in enforcing security policies
that indirectly support data integrity. For instance, by granting or revoking permissions, a database
administrator can control which users can insert, update, or delete data, thereby helping maintain the
data's overall integrity.

For example, one can specify that an attribute of a relation will not take onnull values.

Unit: 8 - Introduction to SQL 13


DCA1208 : Database Management System

5. DATA TYPES IN SQL

In SQL, data types define the data that can be stored in each table column. They help ensure that data
is stored efficiently and organised. These are the basic data types used in SQL. Choosing the
appropriate data type for each column ensures efficient storage and data integrity in the database.

Here are some of the basic SQL data types:

SQL data types

Numeric Character Date and Time Boolean

1. Numeric Data Types: These are used for storing numbers, both integers and decimals.

o INT: Stores whole numbers (integers) without decimal points. Commonly used for age,
quantity, etc.

▪ Example: INT (e.g., 10, 250)

o FLOAT: Stores numbers with decimal points using floating-point precision.

▪ Example: FLOAT (e.g., 10.5, 3.14)

o DECIMAL (p, s): Used for fixed-point numbers, where p is the precision (total number
of digits) and s is the scale (number of digits after the decimal).

▪ Example: DECIMAL (5, 2) (e.g., 123.45)

2. Character Data Types: These store strings, including text and alphanumeric characters.

o CHAR(n): Fixed-length character type, where n specifies the length. If the entered
string is shorter than n, it’s padded with spaces.

▪ Example: CHAR (5) (e.g., 'ABC ' for a 5-character length)

Unit: 8 - Introduction to SQL 14


DCA1208 : Database Management System

o VARCHAR(n): Variable-length character type, where n specifies the maximum length.


It only uses as much storage as needed for the string length.

▪ Example: VARCHAR (10) (e.g., 'Hello' takes 5 spaces)

o TEXT: Used for large strings, typically for paragraphs or blocks of text.

3. Date and Time Data Types: These store dates, times, and timestamps.

o DATE: Stores dates in the format YYYY-MM-DD.

▪ Example: DATE (e.g., 2024-11-14)

o TIME: Stores time values in the HH:MM: SS format.

▪ Example: TIME (e.g., 14:30:00)

o DATETIME: Stores both date and time in the format YYYY-MM-DD HH:MM: SS.

▪ Example: DATETIME (e.g., 2024-11-14 14:30:00)

o TIMESTAMP: Like DATETIME but also records the time zone or server’s time.

4. Boolean Data Type: Stores logical values, typically TRUE or FALSE.

o Example: BOOLEAN (e.g., TRUE, FALSE)

Unit: 8 - Introduction to SQL 15


DCA1208 : Database Management System

6. BASIC SQL COMMANDS

6.1 Data Definition Language (DDL)


Data Definition Language (DDL) in SQL is used to define, modify, and manage the structure of
database objects such as tables, indexes, views, and schemas. DDL commands are essential for
building a database's foundational structure and do not involve data manipulation. When executed,
these commands make structural changes to the database, and an automatic commit usually follows
each operation.

DDL commands are crucial for creating, organising, and maintaining a database's structure. They help
set up the foundational layout, define relationships between tables, and ensure data storage
alignment with the application's requirements.

Let us look at the main DDL commands with examples:

DDL
Commands

CREATE ALTER DROP

1. CREATE: This command creates new database objects like tables, indexes, or views.
• Example (Creating a Table):

Unit: 8 - Introduction to SQL 16


DCA1208 : Database Management System

This command creates a table named Employees with columns for EmployeeID, FirstName, LastName,
Department, and Salary.

2. ALTER: The ALTER command modifies the structure of an existing database object, such as
adding, deleting, or modifying columns in a table.
• Example (Adding a Column):

This adds a new column called DateOfBirth to the Employees table.


• Example (Modifying a Column):

This modifies the Salary column in the Employees table to increase the precision to 12 digits.

• Example (Deleting a Column):

The ALTER command in SQL is used to modify the structure of an existing database table. If
you want to delete a column from a table using SQL, use the ALTER TABLE command followed
by the DROP COLUMN statement.

Here is an example of how to use the ALTER TABLE command to delete a column:

For instance, if you have a table named Employees and you want to delete a column named
Age, you would write:

3. DROP: This command deletes database objects like tables, indexes, or views. Note that
dropping a table permanently removes all the associated data and structure.
• Example (Dropping a Table):

Unit: 8 - Introduction to SQL 17


DCA1208 : Database Management System

This command deletes the Employees table along with all its data.

Importance of Data Definition Language (DDL)

Data Definition Language (DDL) is crucial in database management systems (DBMS) as it defines,
modifies, and manages the structure of database objects, including tables, indexes, views, and
schemas. Here’s why DDL plays an essential role in DBMS:

o Establishes Database Structure: DDL is fundamental in creating the structure of a database


and setting up tables, indexes, views, and other objects. Commands like CREATE and ALTER
define the backbone of the database, specifying data types, constraints, and relationships,
enabling a structured and organised data storage system.

o Enforces Data Integrity: DDL commands allow the application of constraints (e.g., primary
key, foreign key, unique) on tables to enforce data integrity and consistency. This ensures that
only valid and accurate data enters the database, reducing the likelihood of data anomalies and
errors, particularly in large or critical systems.

o Facilitates Database Maintenance: With DDL, database administrators can modify the
schema over time to adapt to evolving requirements. ALTER and DROP commands make it
possible to manage changes without disturbing the entire database, enabling smooth upgrades
and maintenance of the database structure.

o Enhances Query Performance: Creating indexes through DDL improves data retrieval speed,
which can significantly enhance the performance of database queries. Well-designed indexes
allow faster data access, making DDL essential for optimising database performance, especially
when dealing with large datasets.

o Supports Schema Evolution: DDL enables database schema evolution, allowing structures to
change as business needs grow. Adding new tables, columns, or constraints and adjusting data
relationships over time ensures that the database remains relevant and adaptable without
requiring a complete rebuild.

o Auto-Commit Nature for Security: DDL commands are auto-committed, meaning changes
are immediately saved to the database. This is particularly important for ensuring the database
structure is secure and stable, as schema modifications cannot be rolled back, providing a
controlled environment for structural changes.

Unit: 8 - Introduction to SQL 18


DCA1208 : Database Management System

6.2 Data Manipulation Language (DML) Commands


Data Manipulation Language (DML) in a Database Management System (DBMS) interacts with data
stored within database tables. DML commands allow users to retrieve, insert, update, and delete data,
making them essential for managing and modifying database content.

DML
Commands

INSERT SELECT UPDATE DELETE

We will look into these commands in depth in Unit -10.

6.3 Data Control Language (DCL)


Data Control Language (DCL) in Database Management Systems (DBMS) is a subset of SQL used to
control access to data stored in the database. DCL commands primarily focus on granting and
revoking permissions to ensure database security and proper user management. It helps database
administrators regulate who can perform specific operations, thereby maintaining data integrity and
confidentiality.

Key DCL Commands include:

DCL
Commands

GRANT REVOKE

Unit: 8 - Introduction to SQL 19


DCA1208 : Database Management System

1. GRANT Command:

o The GRANT command provides specific privileges to a user or role. These privileges
include SELECT, INSERT, UPDATE, DELETE, or other database actions.

Syntax:

Example:

This example grants the user "John" the ability to SELECT and INSERT records in the "Employees"
table.

2. REVOKE Command:

• The REVOKE command is used to withdraw previously granted privileges from a user or role.

Syntax:

Example:

This example revokes the INSERT privilege on the "Employees" table from the user "John."

Importance of DCL:

• Access Control: Ensures that only authorised users can perform specific operations, such as
reading or modifying sensitive data.

• Data Security: Protects sensitive information from unauthorised access and potential
breaches.

Unit: 8 - Introduction to SQL 20


DCA1208 : Database Management System

• Role-Based Permissions: Allows the creation of roles with specific privileges, simplifying
user management in large systems.

• Auditing and Compliance: This helps track who has access to what, making it easier to
comply with data protection regulations.

Practical Use Cases:

• Limiting access to sensitive customer data in a database.

• Allow managers to update salary details only while enabling employees to view their records.

• Setting up read-only roles for reporting tools or analytics teams.

6.4 Transaction Control Language (TCL)


Transaction Control Language (TCL) commands in DBMS manage transactions in a database.
Transactions are a sequence of operations executed as a single unit of work. TCL commands ensure
data consistency and integrity, particularly in multi-user and concurrent environments.

• Key TCL Commands:

TCL
Commands

COMMIT ROLLBACK SAVEPOINT

1. COMMIT:

• The COMMIT command permanently saves all the changes made during the current
transaction in the database. Once a COMMIT is executed, the changes cannot be undone,
making it essential to ensure data correctness before committing.

Syntax:

Unit: 8 - Introduction to SQL 21


DCA1208 : Database Management System

Example:

Here, the salary increase for HR employees is saved permanently.

2. ROLLBACK:

• The ROLLBACK command undoes all changes made during the current transaction, reverting
the database to its previous state. It is used in cases where errors occur or the user decides not
to save the changes made during the transaction.

Syntax:

Example:

This example undoes the deletion of employees in the "Sales" department.

3. SAVEPOINT:

• The SAVEPOINT command is used to create intermediate points within a transaction. This
allows rolling back to a specific savepoint instead of undoing the entire transaction.

Syntax:

Example:

Unit: 8 - Introduction to SQL 22


DCA1208 : Database Management System

In this example, changes made to the "Marketing" department are preserved, while changes to the
"IT" department are undone.

Importance of Transaction Control Language (TCL)

Transaction Control Language (TCL) commands are critical in managing database transactions and
ensuring data integrity, consistency, and reliability in a Database Management System (DBMS).

Below are the key reasons why TCL is important:

• Ensures Data Consistency: TCL commands like COMMIT and ROLLBACK help ensure that
the database remains consistent by saving valid changes or reverting to a previous stable state
in case of errors. This is crucial for maintaining accurate and reliable data.
• Supports Atomicity in Transactions: TCL enforces the "all-or-nothing" principle of
transactions. All operations within a transaction are completed successfully (COMMIT), or
none are applied (ROLLBACK), thereby supporting the atomicity property of ACID compliance.
• Facilitates Error Recovery: TCL commands enable users to recover from errors by rolling
back incomplete or incorrect changes made during a transaction. This ensures the database is
not left unstable or partially updated after an error.
• Enables Fine-Grained Control with SAVEPOINT: The SAVEPOINT command allows the
creation of intermediate checkpoints within a transaction. Users can roll back to specific
savepoints rather than undoing the entire transaction, providing greater flexibility in
managing complex database operations.
• Improves Data Reliability in Concurrent Environments: In multi-user environments, TCL
commands help manage simultaneous transactions, preventing issues like data corruption or
conflicts. They ensure that operations are executed in a controlled manner, maintaining
database reliability.

Unit: 8 - Introduction to SQL 23


DCA1208 : Database Management System

7. SUMMARY
• SQL (Structured Query Language) is a standard programming language used to interact with
relational databases. It enables users to define, manipulate, query, and control database data.
SQL typically involves writing queries that retrieve or modify data from database tables, with
the database engine interpreting and executing these queries. SQL's versatility and ease of use
make it essential for database management and data-driven applications. Key features of SQL
include its standardisation, ability to handle large volumes of data, support for complex
queries, and compatibility with various database systems.
• SQL syntax provides the structural rules for writing SQL queries. Commands are typically
written in clauses such as SELECT, FROM, WHERE, and ORDER BY, with a semicolon (;) used
to terminate each query. Keywords are case-insensitive, and operators like =, <, and > help
filter or compare data. Functions and aggregate operators (e.g., SUM(), AVG()) are integral to
SQL syntax for performing calculations and analysing data. Comments can also be added to
increase query readability. Properly understanding SQL syntax is crucial for writing accurate
and efficient database queries.
• SQL commands are divided into four main categories: Data Definition Language (DDL), Data
Manipulation Language (DML), Data Control Language (DCL), and Transaction Control
Language (TCL). DDL defines and manages database structure using commands like CREATE,
ALTER, and DROP. DML handles data within tables using commands like INSERT, UPDATE,
and DELETE. DCL controls permissions and user access with GRANT and REVOKE, while TCL
ensures transaction integrity and data consistency using commands like COMMIT, ROLLBACK,
and SAVEPOINT.
• Data types in SQL define the nature of data that can be stored in a column, ensuring consistency
and accuracy. Common data types include INT for integers, VARCHAR for variable-length
strings, DATE for date values, and FLOAT for decimals. Each database management system
may have variations or additional data types. Choosing the correct data type is essential for
optimising storage and ensuring efficient data processing, especially in large databases.
• SQL commands are the building blocks of database operations, categorised based on their
functionality. DDL commands define the structure and schema, such as creating tables. DML
commands focus on data manipulation, enabling users to add, update, or delete records. DCL
commands manage user permissions to safeguard database security. TCL commands ensure

Unit: 8 - Introduction to SQL 24


DCA1208 : Database Management System

transaction consistency and data reliability, allowing users to save or revert changes during a
transaction. Understanding and applying these commands is critical for efficiently managing
databases and ensuring data integrity.

Unit: 8 - Introduction to SQL 25


DCA1208 : Database Management System

8. GLOSSARY

The SQL process involves interpreting and executing SQL queries to


SQL Process - interact with the database through components like query processors,
transaction managers, and DBMS engines.
This refers to fetching data from a database using queries, primarily
Data Retrieval - through the SELECT statement, often with filtering and sorting for
precise results.

Modifying data in a database using commands such as INSERT, UPDATE,


Data Manipulation -
and DELETE, enabling changes to stored records.

Query Language A component of DBMS that interprets and translates SQL queries into
-
Processor low-level instructions for the database engine to execute.

The process of analysing SQL queries for syntax correctness and


Parsing -
preparing them for execution by the database engine.

Transaction A DBMS component that manages transactions, ensuring they adhere to


-
Manager ACID properties for consistency, isolation, and durability.

The core software component of a DBMS that handles database storage,


DBMS Engine -
retrieval, and execution of SQL commands.

A mode in SQL where each command is automatically committed to the


Auto-Commit - database, making permanent changes without explicit COMMIT
commands.

A TCL command that creates intermediate checkpoints in a transaction,


SAVEPOINT -
allowing partial rollbacks to specific states.

The assurance of data accuracy, consistency, and validity throughout its


Data Integrity -
lifecycle within a database.

Data Storage Organising data in physical storage to optimise space utilisation and
-
Alignment improve retrieval efficiency.

Any definable structure within a database, such as tables, views, indexes,


Database Object -
or stored procedures, that stores or interacts with data.

Unit: 8 - Introduction to SQL 26


DCA1208 : Database Management System

9. SELF-ASSESSMENT QUESTIONS

Multiple Choice Questions:


1 What is SQL primarily used for in a Database Management System (DBMS)?
a) Performing arithmetic calculations
b) Managing and manipulating relational databases
c) Designing user interfaces
d) Encrypting data
2 Which component of the SQL process ensures the query syntax is correct?
a) DBMS Engine
b) Optimiser
c) Parser
d) File Manager
3 What does the Transaction Manager in the SQL process handle?
a) Data storage on disk
b) Translation of SQL queries
c) Concurrency and maintaining ACID properties
d) Execution of SQL queries
4 Which of the following is NOT a function of the File Manager in the SQL process?
a) Managing query optimisation
b) Retrieving data for the DBMS Engine
c) Handling data storage on disk
d) Supporting data retrieval processes
5 Which of the following features of SQL ensures data consistency during transactional
operations?
a) Data Retrieval
b) Transaction Control
c) Data Integrity
d) Portability
6 Which SQL commands manage user permissions and access control?
a) SELECT and INSERT

Unit: 8 - Introduction to SQL 27


DCA1208 : Database Management System

b) GRANT and REVOKE


c) COMMIT and ROLLBACK
d) CREATE and ALTER
7 What is the purpose of the WHERE clause in SQL?
a) To group data based on a specific column
b) To filter results based on specific conditions
c) To sort data in ascending or descending order
d) To define the database structure
8 Which of the following is NOT a valid SQL operator?
a) =
b) BETWEEN
c) CONCAT
d) LIKE
9 Which of the following SQL command categories is responsible for defining and modifying the
structure of a database?
a) Data Manipulation Language (DML)
b) Data Definition Language (DDL)
c) Data Control Language (DCL)
d) Transaction Control Language (TCL)
10 What is the primary function of the ROLLBACK command in SQL?
a) Save changes permanently in the database
b) Revert the database to its previous state within a transaction
c) Grant specific privileges to a user
d) Create a new table in the database
11 Which of the following statements is a Data Definition Language (DDL) command?
a) SELECT
b) INSERT
c) CREATE TABLE
d) UPDATE
12 What does the COMMIT command in Data Control Language (DCL) do?
a) Creates a new database
b) Grants access rights to a user

Unit: 8 - Introduction to SQL 28


DCA1208 : Database Management System

c) Reflects the updates of a transaction permanently


d) Rolls back a transaction
13 Which of the following SQL data types is best suited for storing a fixed-length string of exactly
8 characters?
a) CHAR(8)
b) VARCHAR(8)
c) TEXT
d) BOOLEAN
14 What is the primary difference between SQL's FLOAT and DECIMAL data types?
a) FLOAT is used for integers, while DECIMAL is used for text data.
b) FLOAT stores numbers with fixed-point precision, whereas DECIMAL stores numbers
with floating-point precision.
c) FLOAT stores numbers with floating-point precision, whereas DECIMAL stores
numbers with fixed-point precision.
d) FLOAT and DECIMAL are identical and can be used interchangeably.
15 Which of the following statements about Data Definition Language (DDL) in SQL is TRUE?
a) DDL is used to manipulate data in database tables.
b) DDL commands are not auto-committed.
c) DDL is used to define, modify, and manage the structure of database objects.
d) DDL commands are primarily focused on querying data from tables.
16 Which DDL command is used to delete a database object and its associated data permanently?
a) CREATE
b) ALTER
c) DROP
d) DELETE
17 Which of the following commands in Data Control Language (DCL) is used to grant specific
privileges to a user or role?
a) SELECT
b) GRANT
c) REVOKE
d) INSERT
18 What is the primary purpose of the REVOKE command in Data Control Language (DCL)?

Unit: 8 - Introduction to SQL 29


DCA1208 : Database Management System

a) To assign roles to users


b) To provide privileges to a user or role
c) To withdraw previously granted privileges
d) To modify the structure of a table
19 Which of the following statements about the COMMIT command is true?
a) The COMMIT command undoes all changes made during the current transaction.
b) Changes saved using COMMIT can be undone later.
c) The COMMIT command permanently saves all changes made during the current
transaction.
d) COMMIT allows rolling back to specific savepoints within a transaction.
20 What is the primary purpose of the SAVEPOINT command in a transaction?
a) To permanently save all changes made during the transaction.
b) To undo all changes made during the transaction.
c) To create intermediate checkpoints within a transaction for partial rollbacks.
d) To ensure that all operations within a transaction are completed successfully.

Unit: 8 - Introduction to SQL 30


DCA1208 : Database Management System

10. TERMINAL QUESTIONS


1. Define SQL.
2. Describe the SQL Process.
3. List the Features of SQL.
4. Explain the syntax for i) Keywords, ii) Clauses, iii)Operators, iv)Operators, v)Statements.
5. Classify the Categories of SQL Commands.
6. Identify the Data Types in SQL.
7. Under Data Definition Language (DDL), explain the syntax & an example for CREATE() and
ALTER() Commands.
8. List the Importance of Data Definition Language (DDL).
9. What are the Data Manipulation Language (DML) Commands?
10. With an example explain GRANT() & REVOKE() Command under DCL.
11. Outline the Practical Use Cases of DCL.
12. Define Transaction Control Language (TCL).
13. Explain the COMMIT() & ROLLBACK() Command with an example.
14. Justify the Importance of Transaction Control Language (TCL)
15. Write a SQL query to create a table named "Students" with the following columns:
• StudentID: Integer, Primary Key
• Name: Variable character (maximum length 50)
• Age: Integer
• EnrollmentDate: Date

Unit: 8 - Introduction to SQL 31


DCA1208 : Database Management System

11. ANSWERS
Self-Assessment Questions
1. (b) Managing and manipulating relational databases
2. (c) Parser
3. (c) Concurrency and maintaining ACID properties
4. (a) Managing query optimisation
5. (b) Transaction Control
6. (b) GRANT and REVOKE
7. (b) To filter results based on specific conditions
8. (c) CONCAT
9. (b) Data Definition Language (DDL)
10. (b) Revert the database to its previous state within a transaction
11. (c) CREATE TABLE
12. (c) Reflects the updates of a transaction permanently
13. (a) CHAR(8)
14. (c) FLOAT stores numbers with floating-point precision, whereas DECIMAL stores numbers
with fixed-point precision.
15. (c) DDL is used to define, modify, and manage the structure of database objects.
16. (c) DROP
17. (b) GRANT
18. (c) To withdraw previously granted privileges
19. (c) The COMMIT command permanently saves all changes made during the current
transaction.
20. (c) To create intermediate checkpoints within a transaction for partial rollbacks.

Terminal Questions Answers


Answer 1: SQL (Structured Query Language) is a standard language for interacting with relational
databases. It allows users to define, manipulate, retrieve, and manage data efficiently. Refer to
section 2 for more details.

Unit: 8 - Introduction to SQL 32


DCA1208 : Database Management System

Answer 2: The SQL process involves writing SQL queries that are sent to a database server. The
server interprets and executes the queries and returns the results to the client application. Refer to
section 2.2 for more details.

Answer 3: SQL is simple and user-friendly, supports data manipulation and definition, provides data
integrity and security, and is platform-independent, making it versatile for database management.
Refer to section 2.3 for more details.

Answer 4:

• Keywords: Reserved words used to perform specific tasks (e.g., SELECT, INSERT).
• Clauses: Components of statements (e.g., WHERE, GROUP BY).
• Operators: Symbols for comparisons (e.g., =, <, >).
• Statements: Full commands (e.g., SELECT column_name FROM table_name). Refer to section
2.4 for more details.

Answer 5:

Data Definition Language (DDL): CREATE, ALTER.

• Data Manipulation Language (DML): INSERT, UPDATE.

• Data Control Language (DCL): GRANT, REVOKE.

• Transaction Control Language (TCL): COMMIT, ROLLBACK. Refer to section 4 for more
details.

Answer 6: Common data types include INT, VARCHAR, CHAR, DATE, FLOAT, and BOOLEAN. They
define the kind of data that can be stored in a column. Refer to section 5 for more details.

Answer 7:

CREATE Syntax: CREATE TABLE table_name (column_name datatype);

Example: CREATE TABLE Employee (ID INT, Name VARCHAR(50));

ALTER Syntax: ALTER TABLE table_name ADD column_name datatype;

Example: ALTER TABLE Employee ADD Salary FLOAT; Refer to section 6 for more details.

Unit: 8 - Introduction to SQL 33


DCA1208 : Database Management System

Answer 8: DDL commands define and modify the structure of database objects, ensure data integrity
through constraints, create relationships between tables, and manage schema effectively. Refer to
section 6 for more details.

Answer 9: DML commands are used to manipulate data:

• INSERT: Adds data to tables.

• UPDATE: Modifies existing data.

• DELETE: Removes data from tables. Refer to section 6 for more details.

Answer 10:

• GRANT Syntax: GRANT permission ON table_name TO user;


Example: GRANT SELECT ON Employee TO User1;

• REVOKE Syntax: REVOKE permission ON table_name FROM user;


Example: REVOKE SELECT ON Employee FROM User1; Refer to section 6 for more details.

Answer 11: DCL commands manage user access and privileges, ensure data security, restrict
unauthorised data manipulation, and define roles for multi-user environments. Refer to section 6
for more details.

Answer 12: TCL commands manage transactions in a database, ensuring atomicity, consistency, and
reliability. They include COMMIT, ROLLBACK, and SAVEPOINT. Refer to section 6 for more details.

Answer 13: COMMIT: Saves all changes permanently. Example: UPDATE Employee SET Salary = 5000
WHERE ID = 1; COMMIT;

• ROLLBACK: Reverts changes. Example: DELETE FROM Employee WHERE ID = 2; ROLLBACK;


Refer to section 6 for more details.

Answer 14: TCL ensures data integrity and consistency, supports error recovery, enforces atomicity,
and manages concurrent transactions in multi-user environments. Refer to section 6 for more
details.

Unit: 8 - Introduction to SQL 34


DCA1208 : Database Management System

Answer 15:

Unit: 8 - Introduction to SQL 35


DCA1208 : Database Management System

12. To Calculate Relative Frequency, count occurrences of each colour (Red, Blue, Green, Yellow)
12. REFERENCES
and calculate the percentage of each colour relative to the total number of responses. You can derive

• Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
• Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,
McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631
• Database System Concepts, Abraham Silberschatz, Henry F. Korth, S.Sudarshan,7th Edition,
McGraw-Hill Education, 2019, ISBN-10: 0073523321

Unit: 8 - Introduction to SQL 36

You might also like