Database Management System.
Database Management System.
DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 1 - Introduction to DBMS 1
DCA1208: Database Management System
Unit - 1
Introduction to DBMS
DCA324
KNOWLEDGE MANAGEMENT
Unit: 1 - Introduction to DBMS 2
DCA1208: Database Management System
TABLE OF CONTENTS
1 Introduction - -
4
1.1 Objectives - -
4 What is DBMS - -
9-11
4.1 Importance of DBMS - -
5 Overview of DBMS - -
5.1 Components of DBMS 2 -
12-16
5.2 Key Concepts in DBMS - -
5.3 Database System Applications - -
6 Summary - - 17-18
7 Glossary - - 19-20
9 Terminal Questions - - 25
10 Answers - -
10.1 Self-Assessment Questions - - 26-27
10.2 Terminal Questions - -
11 References - - 28
1. INTRODUCTION
In this unit, we will introduce the basic concepts of DBMS. Database technology development has
evolved rapidly over the past three decades since relational database systems' rise and eventual
dominance. While many specialised database systems (spatial, object-oriented, multimedia, etc.) have
found substantial user communities in the science and engineering sections, relational systems
remain the preferred database technology for business enterprises.
A database is a collection of related information stored to be available to several users for various
purposes. The content of a database is obtained by combining data from all the different sources in
an organisation so that data are available to all users and replicated data can be minimised or
eliminated. A computer database gives us an electronic filing system, which has many ways of cross-
referencing and allows the user several ways to retrieve and reorganise data.
1.1. Objectives
By the end of Unit 1, the learners should be able to:
• Define Database management system
• List the characteristics of the Database Approach
• Discuss the importance of Database Management
Systems
• Classify the components of DBMS
• Identify the Applications of Database Systems
2. CONCEPT OF DATABASE
The database is a collection of related data. A data item is the smallest identified unit of data that has
value in the real world – for example, last name, first name, street address, ID number, or political
party – and is the fundamental component of a file in a file system. A group of related data items
considered a single unit by an application is called a record. Examples of types of records are
salesperson, customer, order, product,and department. A file is a collection of several records of a
single type.
Fig.1 Database
A database is a more complex object; it is a collection of interrelated stored data that facilitates the
requirements of several users within one or more organisations, that is, interrelated collections of
many different types oftables. The encouragement for using databases rather than files includesless
redundancy of data, greater availability to a diverse set of users, and integration of data for easier
access to and updating complextransactions.
Database System: It consists of the DBMS software and the data itself.Sometimes, the applications
are also included.
Database management systems software examples include Oracle, SQL Server, MS Access, DB2,
SYBASE, and INFORMIX.
A database can handle accounting, filing, and business inventory and use the information in its files
to prepare summaries, estimates, and other reports. There can be a database that stores books,
newspaper articles, magazines, and comics. There is already a well-defined market for specialised
knowledge for a small number of customers on practically every issue.
The management of data in a database system is handled by a database management system, which
is a general-purpose software programme. The database management system is the major software
component of a database system. Therefore, a database management system is a combination of
hardware and software that can be used to set up and monitor a database and can manage the
retrieval and updating of the database that has been stored in it. Most database management systems
have the following capabilities:
• Creating a file, addition to data, modification of data, deletion of data,creation, addition, and
deletion of files.
• Retrieving data selectively or collectively.
• The data can be sorted or indexed at the user's discretion and direction.
• The system can generate several reports. These may be standardised reports or generated
explicitly according to user requirements.
• Mathematical functions can be performed on data stored in the database and manipulated
using these functions to perform the desired calculations.
• Restriction of Unauthorised Access: Not all users have the same access privileges in a
database system. Some users have read-only access, while others have read and write privileges.
DBMSs provide a security subsystem to create and control different user accounts, restricting
unauthorised access.
• Data Independence: A significant advantage of DBMSs is data independence. System data
descriptions (metadata) are separated from application programs, and changes to data
structures are managed by the DBMS rather than embedded in the programs.
• Transaction Processing: DBMSs include concurrency control subsystems to ensure data
consistency and validity during transaction processing, even when multiple users update the
same information simultaneously.
• Provision for Multiple Views of Data: DBMSs allow multiple users to access the database
individually or simultaneously without knowing how and where the data is stored.
• Backup and Recovery Facilities: Backup and recovery methods protect data from loss. DBMSs
provide processes for backing up and recovering data independently of network backups.
Recovery subsystems ensure the database can be restored to its original state if a system fails
during a complex update process.
These characteristics and features highlight the advantages of the database approach over traditional
file-based systems
Defining a database involves specifying the data types, structures, and constraints of the data to be
stored in the database. The DBMS also stores the database definition or descriptive information as a
database catalogue or dictionary called meta-data.
Constructing the database is the process of storing the data on a storage medium controlled by the
DBMS.
Manipulating a database includes functions such as querying it to retrieve specific data, updating it
to reflect changes in the mini world, and generating reports from the data.
Sharing a database allows multiple users and programs to access the database simultaneously.
Database management systems software examples include Oracle, SQL Server, MS Access, DB2,
SYBASE, and INFORMIX.
A database can handle accounting, filing, and business inventory and use the information in its files
to prepare summaries, estimates, and other reports. There can be a database that stores books,
newspaper articles, magazines, and comics. There is already a well-defined market for specialised
knowledge for a small number of customers on practically every issue.
The management of data in a database system is handled by a database management system, which
is a general-purpose software programme. The database management system is the major software
component of a database system. Therefore, a database management system is a combination of
hardware and software that can be used to set up and monitor a database and can manage the
retrieval and updating of the database that has been stored in it. Most database management systems
have the following capabilities:
• Creating a file, addition to data, modification of data, deletion of data,creation, addition, and
deletion of files.
• Retrieving data selectively or collectively.
• The data can be sorted or indexed at the user's discretion and direction.
• The system can generate several reports. These may be standardised or specifically generated
according to specific user requirements.
• Mathematical functions can be performed on data stored in the database and manipulated
using these functions to perform the desired calculations.
1. Data Integrity and Security: DBMS ensures data accuracy and consistency through constraints
and rules that prevent invalid data entry. It also provides robust security mechanisms, including
authentication and encryption, to protect data from unauthorised access.
2. Efficient Data Management: DBMS organises data in a structured manner, making it easy to
retrieve, update, and manage. Efficient query processing and indexing techniques in DBMS
enable fast and accurate data retrieval.
3. Data Abstraction and Independence: DBMS abstracts the complexities of data storage from
the end users, allowing them to interact with data at a higher level. DBMS supports logical and
physical data independence, allowing changes in the schema at one level without affecting other
levels.
4. Concurrency Control: DBMS allows multiple users to access and manipulate data
simultaneously without interference, ensuring data consistency and isolation. DBMS manages
transactions to ensure that all operations within a transaction are completed successfully,
maintaining database integrity.
5. Data Redundancy Control: DBMS reduces data redundancy by integrating data into a single
database system, eliminating duplicate data entries. The normalisation process in DBMS helps
minimise redundancy and improve data integrity.
6. Backup and Recovery: DBMS provides automated backup mechanisms to safeguard data
against accidental loss or corruption. In case of a system failure, DBMS has recovery procedures
to restore the database to its last consistent state.
7. Data Sharing: DBMS supports data sharing among multiple users and applications, promoting
collaborative data usage. It integrates data from various sources, providing a unified view of the
data for decision-making and analysis.
8. Improved Data Access: DBMS uses query languages like SQL to enable users to interact with
the database easily, making data access more efficient. It supports various reporting and
analysis tools that help generate meaningful insights from the data.
9. Support for Data Integrity Constraints: DBMS enforces integrity constraints (e.g., primary
keys, foreign keys) to maintain the accuracy and reliability of data. It provides mechanisms to
define and enforce data validation rules, ensuring data quality.
5. OVERVIEW OF DBMS
DBMS is a general-purpose software system that facilitates the processes of defining, constructing,
manipulating, and sharing databases among various users and applications.
Each component plays a very important role in the database management system environment. The
significant components of a database management system are:
HARDWARE: The hardware is the computer system used to keep and access the database. It consists
of physical and electronic devices such as computers (together with associated I/O devices like disk
drives), storage devices, I/O channels, electromechanical devices that interface between computers
and real-world systems, and so on. It is impossible to implement the DBMS without the hardware
devices; in a network, a powerful computer with high data processing speed and a storage device with
large storage capacity is required as the database server.
SOFTWARE: The main component of a DBMS is the software. It is the set of programs used to handle
the database and to control and manage the overall computerised database. DBMS software itself is
the most important software component in the overall system.
DATA: Data is the most important component of the DBMS. The main purpose of DBMS is to process
data. In DBMS, databases are defined and constructed, and then data is stored, updated, and retrieved
to and from the databases. The database contains both the actual data and the metadata.
PROCEDURES: Procedures refer to the instructions and rules that help design the database and use
the DBMS. The users who operate and manage the DBMS require documented procedures when they
use it hot or run the database management system.
USERS: Database users are the people who manage the databases and perform various operations on
them in the database system. Three kinds of people play different roles in database systems.
• Application Programmers: These users implement specific application programs to access the
stored data. They must be familiar with the DBMSs to accomplish their task.
• Database Administrators: These may be one person or a group of people in an organisation
responsible for authorising access to the database, monitoring its use, and managing all the
resources necessary to support the use of the entire database system.
• End-Users: End users are the people whose jobs require access to a database for querying,
updating and generating reports.
The database schema is also called visual or logical architecture, as it tells us how the data are
organised in a database.
Data Constraint Sometimes, we restrict or limit the type of data that can be inserted in one or more
columns of a table by specifying one or more constraints on that column(s) while creating the table.
Constraints are used to ensure the accuracy and reliability of data in the database.
Meta-data or Data Dictionary The database schema, along with various constraints on the data, is
stored by DBMS in a database catalogue or dictionary, called meta-data. Meta-data is data about the
data.
Database Instance When we define the database structure or schema, the database's state is empty,
i.e., no data entry is present. After loading data, the state or snapshot of the database at any given time
is the database instance. We may then retrieve data through queries or manipulate data through
updation, modification, or deletion. Thus, the database's state can change; thus, a database schema
can have many instances at different times.
Query A query is a request to a database to obtain information in a desired way. A query can get data
from one table or a combination of tables. For example, “find names of all those students present on
Attendance Date 2000-01-02” is a query to the database. The user must write a query using a query
language to retrieve or manipulate data.
Data Manipulation Modification of the database consists of three operations viz. Insertion, Deletion
or Update. Suppose Rivaan joins as a new student in the class then the student details need to be
added in STUDENT as well as in GUARDIAN files of the Student Attendance database. This is called
Insertion operation on the database. If a student leaves the school, his/her data and guardian details
must be removed from the STUDENT, GUARDIAN and ATTENDANCE files, respectively. This is called
the Deletion operation on the database. Suppose Atharv’s Guardian has changed his mobile number;
his GPhone should be updated in the GUARDIAN file. This is called the Update operation on the
database.
Database Engine Database engine is the underlying component or set of programs a DBMS uses to
create a database and handle various queries for data retrieval and manipulation.
• Sales: Databases store essential information about customers, products, and purchases in the
sales and retail sectors. These systems support sales analytics, inventory management, and
personalised customer experiences by tracking buying patterns and preferences, optimising
marketing strategies, and operational efficiency.
• Telecommunication: Telecommunication companies rely on databases to manage vast
volumes of data related to customer calls, prepaid card balances, monthly billing, and network
infrastructure. These systems ensure seamless communication services, enabling accurate
billing and customer support across global networks.
• Manufacturing: In manufacturing, databases oversee inventory management across
warehouses, track supply chain operations, monitor production processes, and manage orders
for various products. This integration optimises manufacturing workflows, enhances inventory
accuracy, and supports efficient logistics and distribution strategies.
• Human resources: Human resources departments utilise databases to maintain employee
records, manage payroll taxes, administer benefits, and process payroll. These systems
streamline HR operations, ensure compliance with employment regulations, and facilitate
timely and accurate payroll processing, thereby supporting employee satisfaction and
organisational efficiency.
• Airlines: Airlines pioneered geographically distributed databases, enabling terminals
worldwide to access centralised systems for reservations, cancellations, and schedule
management. This approach revolutionised global travel operations by ensuring real-time
updates and seamless passenger service across diverse locations.
So, we can say that databases are essential to almost all enterprises today. During the last four
decades of the twentieth century, the useof databases grew in all enterprises. Few people engaged
directly with database systems in the early days, but they did so indirectly through printed reports
such as credit card statements or through agents such as bank tellers and airline reservation agents
– without recognising it. Then automated teller machines came along and let users interact directly
with databases. Phone interfaces to computers also facilitated users to deal with databases directly.
For example – a caller could dial a number and press phone keys to enter information or select
alternative options to find flight arrival/departure times.
The Internet revolution sharply increased direct user access to databases during the late 1990s.
Organisations converted many phone interfaces to databases into Web interfaces and facilitated
different online services and information. For example, when you access a website, information about
you may be retrieved from a database to selectwhich advertisements should be shown.
When you explore a book or music collection in an online retailer, you access data stored in a database.
The data is pulled from the bank's database system when you visit a bank's website to get your
account balance and transaction history. Moreover, data related to Web access may be stored in a
database. Thus, although user interfaces hide details of access to a database, and most people are
unaware they are dealing with a database, accessing databases forms an essential part of almost
everyone’s life these days. The importance of database systems can be judged in another way – today,
database system vendors like Oracle are among the largest software companies in the world, and
database systems form an essential part of the product line of more diversified companies like IBM
and Microsoft.
6. SUMMARY
A Database Management System (DBMS) is a software system that enables users to define, create,
maintain, and control access to databases. It provides an efficient and systematic way to manage data,
allowing for data storage, retrieval, and manipulation in a structured manner. DBMSs are essential
for managing large volumes of data and ensuring that data is accurate, consistent, and secure. They
form the backbone of various applications across different industries, enhancing data management
and operational efficiency.
The database approach offers several key characteristics that distinguish it from traditional file-based
systems. These include data abstraction, which hides the complexity of data structures; data
independence, allowing changes at one level without affecting others; and reduced data redundancy
through normalisation. The database also enhances data integrity, security, and concurrent access,
enabling multiple users to interact with the data simultaneously without conflicts. These
characteristics collectively improve data management, accessibility, and reliability.
A DBMS is a collection of programs that facilitates database creation, maintenance, and use. It serves
as an intermediary between users and the database, ensuring data is stored efficiently and quickly
retrieved. DBMSs support various functions, including defining data structures, querying data,
updating records, and managing transactions. Popular examples of DBMSs include MySQL, Oracle,
SQL Server, and PostgreSQL, each offering different features tailored to specific requirements.
The importance of DBMSs lies in their ability to provide a systematic and efficient way to handle data.
They ensure data integrity, security, and consistency, which are critical for decision-making and
operational processes. DBMSs also facilitate data sharing among multiple users and applications,
support concurrent data access, and provide robust backup and recovery mechanisms. By minimising
data redundancy and optimising resource utilisation, DBMSs reduce the overall data management
cost and enhance applications' performance and scalability.
DBMSs consist of several key components, including the database engine, schema, data definition
language (DDL), data manipulation language (DML), data control language (DCL), and database
utilities. Key concepts in DBMS include data models, schemas, instances, keys, and normalisation.
DBMS applications span various domains, such as banking, telecommunications, education, and e-
commerce, where efficient data management is crucial. Understanding these components and
concepts is essential for leveraging the full potential of DBMSs in building robust and scalable
database systems.
7. GLOSSARY
A DBMS is software that interacts with end users, applications, and the
DBMS (Database database to capture and analyse data. It provides an interface for users to
Management - create, retrieve, update, and manage data efficiently. DBMSs ensure data
System) integrity, security, and consistency, allowing multiple users to interact
with the database concurrently.
Metadata - as the schema, tables, columns, data types, constraints, and relationships.
Metadata is stored in a data dictionary or catalog and is crucial for the
DBMS to manage and use the data effectively.
Backup - automated and are essential for data recovery and business continuity.
Depending on the backup strategy, they can be full, incremental, or
differential.
Database Engine - transaction management, and data storage, ensuring operations are
performed efficiently and accurately. The engine includes mechanisms for
indexing, caching, and ensuring ACID properties.
8. SELF-ASSESSMENT QUESTIONS
SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 What is a database?
a) A collection of interrelated stored data facilitating the requirements of several users within
one or more organisations.
b) A single table containing all the data for an organisation.
c) A software system designed for creating spreadsheets.
d) A programming language used to manage data.
2 What is the smallest identified unit of data that has value in the real world?
a) Record
b) Data item
c) File
d) Database
3 What is a record?
a) A collection of several records of a single type.
b) The smallest identified unit of data.
c) A group of related data items considered a single unit by an application.
d) A logically organised collection of data with some meaning.
4 Which of the following is an example of a record type?
a) Last name
b) Street address
c) Salesperson
d) ID number
5 What does a database system contain besides the data itself?
a) Only the data
b) Only metadata
c) Data and metadata that defines and describes the data and relationships between tables
d) Only application programs
6 Which characteristic of a database approach allows different users to have customised views
of the data?
a) Self-describing nature
b) Insulation between program and data
c) Support for multiple views of data
d) Control of data redundancy
7 What is the main software component of a database system?
a) Database
b) Data
c) Database Management System (DBMS)
d) Application programs
8 Which of the following is NOT a capability of most DBMSs?
a) Creating a file
b) Deleting data
c) Sorting or indexing data
d) Building computer hardware
9 What is the role of hardware in a DBMS environment?
a) Provides procedures for managing the database
b) Stores and retrieves data from the database
c) Contains physical and electronic devices used to store and access the database
d) Defines the structure of the database schema
10 Which component of DBMS is responsible for handling the overall computerised database?
a) Procedures
b) Software
c) Users
d) Data
11 What is metadata in a DBMS context?
a) Actual data stored in the database
b) Hardware used to store the database
c) Data about the data, including database schema and constraints
d) Users who manage and operate the DBMS
12 Who are the end-users in a database system?
a) Application programmers
b) Database administrators
c) Users responsible for authorising access to the database
d) People whose jobs require access to the database for querying and updating
13 What is a database schema?
a) A set of procedures for managing the database
b) A collection of data stored in the database
c) Data about the data stored in the database
d) The design of a database, including structure and relationships
14 What does a database instance refer to?
a) Data stored in the database
b) The design of the database schema
c) A snapshot of the database state at a specific time
d) Metadata stored in the database
15 What is a query in the context of a DBMS?
a) A request to obtain information from the database
b) A type of database schema
c) Data constraints applied to columns
d) A procedure for manipulating data
16 Which operation is NOT part of data manipulation in a DBMS?
a) Insertion
b) Deletion
c) Update
d) Sorting
17 What is the function of a database engine in a DBMS?
a) It manages user access to the database
b) It stores metadata about the database
c) It handles queries for data retrieval and manipulation
d) It defines the structure of the database schema
18 Which of the following is an example of a database application?
a) Programming languages
b) Operating systems
c) Internet browsers
d) Banking transactions
19 How did the Internet revolution impact database usage?
a) Reduced user access to databases
b) Increased indirect interaction with databases
c) Converted phone interfaces to Web interfaces for database access
d) Decreased the role of database vendor
20 What role do automated teller machines (ATMs) play in database usage?
a) They restrict direct user access to databases
b) They facilitate direct interaction with databases
c) They provide physical storage for database servers
d) They store metadata about the database
9. TERMINAL QUESTIONS
1. Explain the concept of a database with an example.
2. Why is a DBMS important in modern data management?
3. Provide an overview of a DBMS and its main components.
4. What are the representative applications of Databases? List them.
5. Define a Database.
6. List the characteristics of the Database Approach.
7. Differentiate between Database Schema vs. Database Instance.
8. Explain Metadata.
9. What do you mean by Data Manipulation
10. Define Data constraint.
10. ANSWERS
Answer 2: DBMS is a general-purpose software system that facilitates the processes of defining,
constructing, manipulating, and sharing databases among various users and applications (Refer to
section 4 for details)
Answer 6: The database approach's characteristics include data independence, efficient data access,
data integrity, and support for multiple users. (Refer to section 5.3 for details.)
Answer 7: A database schema defines the structure of a database, including tables, fields, and
relationships. In contrast, a database instance is a snapshot of the data stored in the database at a
particular moment. (Refer to section 5.3 for details.)
Answer 8: Metadata refers to data about the data stored in a database, including definitions,
constraints, and relationships. (Refer to section 5.3 for details.)
Answer 9: Data manipulation involves modifying the data stored in a database through operations
like insertion, deletion, and updating records. (Refer to section 5.3 for details.)
Answer 10: A data constraint is a rule defined on a database column that restricts the type or range
of data stored in that column. (Refer to section 5.3 for details.)
11. REFERENCES
• Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
APPLICATION
SEMESTER 2
DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 2 - Database Systems 1
DCA1208: Database Management System
Unit - 2
Database Systems
DCA324
KNOWLEDGE MANAGEMENT
Unit: 2 - Database Systems 2
DCA1208: Database Management System
TABLE OF CONTENTS
1 Introduction - -
4-5
1.1 Objectives - -
4 Summary - - 22
5 Glossary - - 23
7 Terminal Questions - - 27
8 Answers - -
8.1 Self-Assessment Questions - - 28-30
8.2 Terminal Questions - -
9 References - - 31
1. INTRODUCTION
In the previous unit, we learned that a database is a structured data collection organised for efficient
retrieval and management. It is a centralised repository where information is stored, organised, and
accessed according to defined relationships. The database approach emphasises structured data
storage, ensuring data integrity, security, and efficient querying.
Database Management System (DBMS) is software that facilitates database creation, manipulation,
and administration. It ensures data consistency, provides data security, and supports concurrent
access by multiple users. DBMS is crucial in modern information systems because it manages data
efficiently, improves data sharing, and supports decision-making processes.
Key components of DBMS include data definition, manipulation, and control. It comprises tables,
queries, forms, and reports to interact with data. Important concepts in DBMS include ACID
properties for transaction reliability and normalisation for data integrity.
DBMS finds application in various domains, from business operations (ERP, CRM) to scientific
research, enabling efficient data management, analysis, and retrieval for informed decision-making.
In this unit, we will understand the various types of database systems, which is crucial for gaining a
comprehensive knowledge of how data is organised, stored, and managed across different
applications and industries.
Learners can grasp the fundamental differences in data models, query languages, scalability, and use
cases by studying relational database management systems (RDBMS), NoSQL databases, object-
oriented databases, and others.
We will also be able to identify each type's strengths and weaknesses, recognise scenarios where one
type may be more suitable than others, and evaluate the impact of choosing a particular database
system on application performance and data integrity.
We will also explore the advantages and disadvantages of database management systems (DBMS),
equipping learners with a critical understanding of centralised data management's benefits and
potential challenges. Conversely, we will explore the drawbacks, including complexity in database
design, the potential for performance bottlenecks, and scalability limitations in traditional DBMS.
Moreover, we will assess the role of transaction management, concurrency control, and backup and
recovery mechanisms in ensuring data consistency and availability within a DBMS framework.
1.1. Objectives
• Identify the Types of Database Systems
• List the Advantages and Disadvantages of DBMS
• Outline the applications for the different types of
Database systems
Types of
Database
Systems
Centralised
Hierarchical Network Relational Distributed Object-Oriented NoSQL Database
Database
Databases Databases Databases Databases Databases Systems
Systems
In the given diagram, Electronics serves as the root node and has two child nodes, namely Televisions
and Portable Electronics. These two individuals have more children for whom they serve as parents.
For instance, television is represented by several types such as tube, LCD, and plasma, and these three
types can be considered as the parent forms of television. It exhibits a one-to-many relationship.
2.2. Network databases, a database management system, organise data using a network
model that allows each record to have multiple parent and child records, forming complex
relationships. Unlike hierarchical databases with a strict tree-like structure, network databases
permit nodes to have multiple parent nodes, facilitating more flexible and interconnected data
relationships.
However, relational databases have largely superseded them due to their complexity and difficulty
managing data integrity. Despite this, network databases still find niche applications where their
unique structure is advantageous, such as in certain engineering and telecommunications systems.
3. Scientific Research: Research domains such as biology, chemistry, and physics often deal with
highly interconnected data, such as gene interactions, chemical compounds, and experimental
results. Network databases can store and retrieve this data, supporting complex queries and
data analysis.
4. Knowledge Management Systems: Network databases may benefit systems that organise and
manage knowledge bases, such as libraries, archives, and digital repositories. They can handle
relationships between different types of knowledge entities, facilitating semantic queries and
information retrieval.
database management system that stores and provides access to data points related to one another.
RDBMSs use a structure that allows users to identify and access data with another piece of data in the
database. This type of system is based on the relational model, which organises data into tables (also
known as relations) consisting of rows and columns. Each table represents a collection of related data
entries, and each row in a table represents a unique record with a set of attributes defined by the
columns.
Examples
• Small to Medium Enterprises (SMEs): Many small businesses use centralised database
systems to simplify operations and lower operational costs.
• Legacy Systems: Older systems and applications designed before distributed architectures
often use centralised database systems.
2.5. Distributed databases are systems in which data is stored across multiple physical
locations, potentially spanning different geographical areas. This setup improves data availability,
reliability, and performance by decentralising the data storage process. Each node in a distributed
database can independently manage and update its subset of the data, ensuring consistency and
synchronisation through various protocols and algorithms.
• Scalability: Scales horizontally by adding more nodes to the distributed system, allowing it to
handle increasing volumes of data and requests.
6. IoT (Internet of Things): IoT applications generate vast amounts of data from interconnected
devices and sensors. Distributed databases support IoT platforms by collecting, storing, and
analysing real-time data streams from distributed endpoints, enabling insights for predictive
maintenance, smart city applications, and industrial automation.
Manag Employe
er e
management systems benefit from OODBs' ability to model complex network topologies and
relationships among network elements.
2.7. NoSQL (Not only SQL) database systems are a diverse category of databases
that depart from the traditional relational database management systems (RDBMS). They are
designed to address the limitations of RDBMS and cater to modern application requirements,
particularly those involving large-scale data storage and real-time web applications.
One of the main advantages of using a database system is that the organisation can exert centralised
management and control over the data through the DBA.
Any application that necessitates a change in the structure of a data record must make prior
arrangements with the DBA, who will make the appropriate changes.
1. Sharing Data: A database permits data sharing under the control of any number of users or
application programs.
2. Data Redundancy Reduction: Centralised data control by the DBA avoids undesirable data
duplication and effectively reduces the total data storage required.
a. It also reduces the need for further processing to find the relevant data in a huge amount of
data. Another benefit of minimising duplication is eliminating common discrepancies in
redundant data files. Any DBMS redundancies are managed, and the system assures
numerous copies of the same data are consistent.
3. Data Integrity: Centralised control can also ensure the DBMS incorporates sufficient checks to
facilitate data integrity. When discussing data integrity, we're referring to the data's accuracy.
The database's information is both accurate and consistent. As a result, data values entered for
storage could be double-checked to ensure they are within a specific range and in the correct format.
For example, consider the value for the age of an employee may be in the range of 16 and 75.
Another integrity check that should be incorporated into the database is to ensure that if there is a
reference to a certain object, that objectmust exist. In the case of an automatic teller machine, for
example, a user isnot allowed to transfer funds from a non-existent savings account to anexisting one.
4. Data Security: Data is vital for any organisation and may be confidential. Such confidential data
must be protected from unauthorised persons. As the owner of the data in the DBMS, the DBA
can ensure that suitable access protocols are followed, such as proper authentication schemas
for DBMS access and additional checks before granting access to sensitive data.
Multiple levels of security could be established for different types of data and procedures. Security
enforcement could be data value-dependent (e.g., a manager has access to the salary details of
employees in his or her department only) and data-type-dependent (e.g., the manager cannot access
the medical history ofany employees, including those in his/her department).
5. Conflict Resolution: Since the database is under the DBA's control, she/ he should resolve the
conflicting requirements of various users and applications. In other words, the DBA selects the
appropriate file format and access mechanism for response-critical applications while enabling
less important apps to continue using the database with a delayed response time.
4. SUMMARY
• Database systems vary widely based on structure, functionality, and intended use. Relational
Database Management Systems (RDBMS) store data in tables with rows and columns, offering
strong consistency and SQL-based querying. NoSQL databases, including document stores (like
MongoDB) and key-value stores (like Redis), provide unstructured and distributed data
flexibility.
• In-memory databases like Redis optimise performance by storing data in RAM. Distributed
databases such as Google Spanner ensure high availability and scalability across multiple
locations. Object-oriented databases store data as objects, facilitating complex relationships.
Each type caters to specific needs, balancing scalability, flexibility, and performance.
• DBMS offers several advantages, including centralised data management, ensuring data
integrity through ACID properties, and supporting concurrent access and data security. It
facilitates efficient data retrieval and manipulation, improving productivity and decision-
making.
• However, DBMS requires skilled management, may incur high initial costs for setup and
maintenance, and can be vulnerable to security breaches. It may also face scalability limitations,
particularly in distributed environments. Choosing the right DBMS involves balancing these
factors to meet specific application requirements while considering long-term scalability and
operational efficiency.
5. GLOSSARY
ACID - that ensure database transactions are processed reliably, even in the
presence of failures.
sales growth.
Normalisation - improve data integrity, typically by breaking down large tables into
smaller, related tables.
6. SELF-ASSESSMENT QUESTIONS
SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 What type of data storage does a Relational Database Management System (RDBMS) use?
a) Key-value pairs
b) Documents
c) Tables (relations)
d) Objects
2 Which property of RDBMS ensures reliability by guaranteeing transactions are processed
reliably?
a) ACID
b) BASE
c) CAP
d) JSON
3 Which database system is known for its geographical distribution across multiple physical
locations?
a) Object-Oriented Database
b) Relational Database Management System (RDBMS)
c) Distributed Database
d) Network Database
4 Which characteristic ensures fault tolerance in Distributed Databases?
a) ACID properties
b) Replication and Partitioning
c) Foreign Keys
d) Normalisation
5 Object-oriented databases store data as:
a) Key-value pairs
b) Documents
c) Tables (relations)
d) Objects
6 Which feature of Object-Oriented Databases supports complex data relationships like object-
oriented programming?
a) ACID Properties
b) Querying
c) Persistence
d) Inheritance and Polymorphism
7 Network Databases allow nodes (records) to have:
a) Multiple children only
b) Multiple parents only
c) Both multiple parents and children
d) No relationships
8 Centralised Database Systems are managed from:
a) Multiple servers
b) A single location or server
c) A cloud environment
d) Geographically distributed locations
9 NoSQL databases are primarily designed to address the limitations of:
a) Object-Oriented Databases
b) Centralised Databases
c) Relational Database Management Systems (RDBMS)
d) Network Databases
10 Which characteristics of NoSQL databases allow for efficient handling of large volumes of data
and high transaction rates?
a) ACID Properties
b) Horizontal Scalability
c) SQL Support
d) Schema Flexibility
11 Which type of NoSQL database stores data in flexible, JSON-like documents?
a) Key-Value Stores
b) Document Stores
c) Column Family Stores
d) Graph Databases
7. TERMINAL QUESTIONS
1. Define DBMS
2. Classify the types of Database systems
3. List the characteristics of RDMS
4. Identify a few examples of Distributed databases.
5. Write a short note on Object-oriented databases.
6. Differentiate Network databases vs. Centralised database systems.
7. Explain NOSQL database systems
8. List a few advantages of DBMS
9. Outline a few disadvantages of DBMS.
10. List the applications of Distributed databases.
8. ANSWERS
Answer 4: Examples of Distributed databases include Google Spanner, Amazon Aurora, Cassandra,
and MongoDB Atlas. Refer to section 2.2 for more details
Answer 5: Object-oriented databases store data as objects, encapsulating attributes and methods,
supporting complex data relationships and object-oriented programming principles like inheritance
and polymorphism. Refer to section 2.3 for more details
Answer 6: Network databases allow nodes (records) to have multiple parent and child nodes,
facilitating complex relationships. In contrast, centralised database systems store all data in a single
location, offering centralised control but potentially facing scalability challenges. Refer to section 2.4
for more details
Answer 7: NoSQL database systems depart from traditional relational databases by offering schema
flexibility, horizontal scalability, and high availability. They include document stores, key-value
stores, column family stores, and graph databases to handle diverse data types and large-scale
applications. Refer to section 2.6 for more details
Answer 8: Advantages of DBMS include data sharing, reduced data redundancy, data integrity
enforcement, centralised data security, efficient data management, and conflict resolution in multi-
user environments. Refer to section 3 for more details
Answer 9: Disadvantages of DBMS include complexity and cost of implementation, potential security
vulnerabilities, data integrity issues risks, backup and recovery processes complexity, and resource
consumption. Refer to section 3 for more details
regions for local access, disaster recovery, and compliance with data privacy laws. Refer to section
2.2 for more details
9. REFERENCES
• Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
APPLICATION
SEMESTER 2
DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 3 – Data Models 1
DCA1208: Database Management System
Unit - 3
Data Models
DCA324
KNOWLEDGE MANAGEMENT
Unit: 3 – Data Models 2
DCA1208: Database Management System
TABLE OF CONTENTS
1 Introduction - -
4-5
1.1 Objectives - -
5 Summary - - 24-25
6 Glossary - - 26-27
8 Terminal Questions - - 30
9 Answers - -
9.1 Self-Assessment Questions - - 31-33
9.2 Terminal Questions - -
10 References - - 34
1. INTRODUCTION
The previous unit deals with the database systems crucial for managing and organising data. There
are several types of database systems, each serving specific needs. Relational Database Management
Systems (RDBMS) store data in tables with rows and columns, facilitating easy querying and data
integrity.
We also understood how Distributed Databases spread data across multiple locations, enhancing
availability and reliability. Object-oriented databases store data as objects, supporting complex data
types and relationships. Network Databases utilise a graph structure to model relationships, allowing
for more flexible data associations. Centralised Database Systems store all data in a single location,
simplifying management but posing risks of a single point of failure. NoSQL Database Systems cater
to unstructured data and large-scale applications, offering high scalability and flexibility.
We also saw the advantages of a Database Management System, including improved data consistency,
security, and efficient data management. However, DBMS also have disadvantages, such as complexity,
cost, and the need for regular maintenance. Understanding the strengths and limitations of each
database system type helps select the right one for specific organisational needs.
This unit focuses on a data model, which is an integrated collection of concepts for describing data,
identifying relationships between data, and determining constraints on the data. A data model allows
us to treat a database as an abstract machine. In other words, we can concentrate on the principles
of design divorced from an immediate concern with implementation. Data models constitute formal
languages for defining data structures, declaring integrity, and manipulating data. A data model is a
mechanism for specifying the schema of some database.
We will also see how the data models establish the principles underlying DBMS. Every database, and
indeed every DBMS, must adhere to the principles of some data model. However, the term data model
is somewhat ambiguous. In the database literature, the term is used in several different senses, two
of which are the most important: that of architecture for data and that of an integrated set of data
requirements.
1.1. Objectives
After completing Unit 3, you should be able to:
• Define a Data Model.
• List the different types of Data Model.
• Differentiate between Hierarchical Model vs.
Network Model.
• Outline the overview of the Relational model.
This set of principles that define a data model may be divided into three major parts:
• Data definition – a set of principles concerned with data structure.
• Data manipulation – a set of principles concerned with how data isoperated upon.
• Data integrity – a set of principles concerned with determining which states are valid for a
database.
Data definition involves defining an organisation for data, a set of templates for the organisation of
data. Data manipulation concerns how the data is accessed and changed in the database. Data
integrity is very much linked with data manipulation in the sense that integrity concerns valid
changes and invalid changes todata.
A database based on the relational model developed by E.F. Codd allows the definition of data
structures, storage and retrieval operations, and integrity constraints. The data and their relations
are organised in tables in such a database. A table is a collection of records containing the same fields.
G
Primary
key
Tuples
(Rows)
Columns
(Attribute)
Each row in the relation represents a student, indicating that the row in the table describes a real-
world item. The columns of the table represent the attributes associated with the entity. For this
scenario, the information required includes the student's roll number, CGPA, and name.
This database consists of various components based on the relational model. These include:
• Relation: A two-dimensional table is used to store a collection of data elements.
• Tuple: Row of the relation, depicting a real-world entity.
• Attribute/Field: Column of the relation depicting properties that define the relation.
• Attribute Domain: A set of pre-defined atomic values that an attribute can take, i.e., it describes
the legal values that an attribute can take.
• Degree: It is the total number of attributes present in the relation.
In the diagram shown, electronics are the root node, with two subordinate nodes, televisions and
portable electronics. These two individuals have more children for whom they serve as parents. For
instance, television is represented by several types, such as tube, LCD, and plasma, and these three
types serve as the parents of television. It exhibits a one-to-many relationship.
Let's consider creating the network model for the student database. The Subject entity shows a
relationship with both the student and Degree entities. There is a connection between the Subject
entity and both the Student and Degree entities. The Subject entity has two parents, while the
remaining two entities have one child entity.
As an evolutionary technology, the object/relational (OR) approach has inherited its relational
ancestor's robust transaction and performance-management features and the flexibility of its object-
oriented cousin.
Database designers can work with familiar tabular structures and data definition languages (DDLs)
while assimilating new object-management possibilities. Query and procedural languages and call
interfaces in ORDBMSs are familiar: SQL3, vendor procedural languages, ODBC, JDBC, and proprietary
call interfaces are all extensions of RDBMS languages and interfaces. The leading vendors are well-
known: IBM, Informix, and Oracle.
The Figure represents an example of the Object-Relational Model (ORM), which blends the relational
database approach with object-oriented principles. It illustrates three interconnected tables:
"Customers," "Addresses," and "BankAccounts." The Customers table contains primary information
about customers (e.g., ID, surname, and first name). The Addresses table associates’ customers with
multiple locations using a foreign key (CID) that links to the ID in the Customers table, showcasing a
one-to-many relationship. Similarly, the BankAccounts table links multiple accounts to a single
customer using the same foreign key (CID). This structure effectively models complex real-world
entities, demonstrating how ORM encapsulates related data into hierarchical relationships, enabling
efficient data management and retrieval.
In contrast to a relational DBMS, where a complex data structure must be flattened out to fit into
tables or joined together from those tables to form thein-memory structure, object DBMSs have no
performance overhead to store or retrieve a web or hierarchy of interrelated objects. This one-to-one
mapping of object programming language objects to database objects has two benefits over other
storage approaches: it provides higher performance management of objects, and it enables better
management of the complex interrelationships between objects. This makes object DBMSs better
suited to support applications such as financial portfolio risk analysis systems, telecommunications
service applications, WWW (World Wide Web) document structures, design and manufacturing
systems, and hospital patient record systems, which have complex relationships between data.
As you can see, student and Department are two distinct objects. Every individual has unique qualities
and techniques. A relationship between the objects is established via their shared attribute Dept_no.
Components of the Object-Oriented Data Model, namely objects, classes, object attributes, class
hierarchy, etc., are explained as follows-
• Object—It is a physical or real-world entity. The object can also be called the ER model entity.
The object defines a single instance of an entity, which is known as an 'instance of a class. '
• Object attribute- The objects contain specific attributes. These are referred to as the object's
attributes.
• Object method- The object's behaviour is shown using object methods.
• Class- Certain characteristics are present in the objects. These characteristics are referred to as
the object's attributes.
• Inheritance—It is the ability of an object to inherit the attributes and methods of the classes
above it within the class hierarchy. A new class can be derived from an existing class. The new
class has the attributes and methods described in the existing class and its own attributes and
methods. This helps in code reusability.
• Encapsulation refers to the bundling of data (attributes) and the methods (operations or
functions) that manipulate the data into a single unit, known as an object. In this model, data is
not accessed directly; instead, interactions with the data occur through well-defined interfaces,
typically in the form of methods.
This encapsulation principle ensures that the object's internal representation is hidden from the
outside world, providing a layer of abstraction that safeguards the integrity of the data. For instance,
if an object represents a bank account, users can interact with the account balance only through
specific methods like deposit or withdrawal, preventing unauthorised or inappropriate access to the
underlying data.
Attribute Attribute
Relations
hip
Attribute Attribute
An ER diagram in DBMS defines entities, their corresponding properties, and their interconnections.
This helps represent the database's logical structure as shown above.
Key Features:
• Entities represent objects or concepts: Entities are fundamental units in a database that
represent real-world objects or concepts. Each entity is a distinct object or concept that can be
uniquely identified.
For example, In a student database, Students, courses, and professors could each be represented as
entities.
Entities typically correspond to tables in a relational database, where each row represents a specific
instance of the entity, and the columns represent the entity’s attributes.
For example, A relationship might exist between students and courses in a student database,
indicating which students are enrolled in which courses.
In a retail database, A relationship could link customers to orders, showing which customers made
which purchases.
For example, In a student entity, Attributes might include StudentID, Name, DateOfBirth, and Major.
Attributes in a product entity could include ProductID, ProductName, Price, and StockQuantity.
Attributes provide detailed information about each entity, making it possible to store and retrieve
specific data values. They are essential for defining a table’s schema and ensuring each entity instance
is fully described.
Bringing It All Together: In a well-designed database, entities, relationships, and attributes work
together to form a coherent data model. This model represents the structure of the data, its meaning,
and how different pieces of data relate to each other. By clearly defining these components, databases
can efficiently store, manage, and retrieve large amounts of structured information.
A database based on the relational model developed by E.F. Codd allows the definition of data
structures, storage and retrieval operations, and integrity constraints. The data and their relations
are organised in tables in such a database. A table is a collection of records; each record in a table
contains the same fields.
• Each Column has a Unique Name: Each column in a relational table must have a unique name
to distinguish it from other columns within the same table. Unique column names help
accurately identify and reference data elements within queries and operations, ensuring clear
and unambiguous access to the data.
The relational data model represents the database as a group of related tables. The relational data
model was introduced in 1970. It iscurrently the most popular model. The mathematical simplicity
and ease of visualisation of the relational data model have contributed to its success. The relational
data model is based on the mathematics of set theory, whosebasic components are the following.
Relation: A two-dimensional table. A relation is a collection of tuples containing values for a fixed
number of attributes. Relations are sometimes called at-files because they resemble an unstructured
sequence of records. Each tuple in a relation must be unique; there can be no duplicates.
A Relation may be defined in multiple ways. The Relation Schema R, denoted by R (A1, A2, An),
comprises a relation name R and is a list of attributes A1, A2, An.
For Example - CUSTOMER (Cust-id, Cust-name, Address, Phone#). Here, CUSTOMER is a relation
defined over the four attributes Cust-id, Cust-name, Address, and Phone#, each of which has a
domain or a set of valid values.
A tuple is an ordered set of values derived from an appropriate domain. Each row in the CUSTOMER
table may be referred to as a tuple and consists of four values.
<1759, "Rama Krishna", "101 Main 3rd Cross Manipal", "0820-2653487">is a tuple belonging to
the CUSTOMER relation.
A relation may be regarded as a set of tuples (rows). Columns in a table are also called attributes of
the relation.
A domain has a logical definition; for example, “India_Phone_numbers” is a set of 10-digit mobile
numbers. A domain may have a data type or a format defined for it. The India_Phone_numbers may
have a format: (ddd)- ddddddd where each d is a decimal digit. For example, dates can be in various
formats such as a month, name, date, year, or yyyy-mm-dd, or dd, mm,yyyy, etc.
An attribute designates the role played by the domain. E.g., the domainDate may be used to define
attributes “Invoice-date” and “Payment-date”. The relation is formed over the cartesian product of
the sets; each set has values from a domain, and that domain is used in a specific role, which is
conveyed by the attribute name. For example, attribute Cust-name is defined over the domain of
strings of 25 characters. The role these strings play in the CUSTOMER relation is that of the name of
the customers.
R is a relation schema, and r(R) is a specific "value" or population of R. Here, R is also called the
intention of a relation, and r is also called the extension of a relation.
Let S1 and S2 be domains, S1 = {0,1} and S2 = {a,b,c}. The Relation R canbe written as
R Ì S1 X S2
is one possible “state,” “population”, or “extension” r(R), defined over domains S1 and S2. It has three
tuples.
Characteristics of Relations
• Ordering of tuples in a relation r(R): The tuples are not considered to be ordered, even though
they appear in the tabular form.
• Ordering of attributes in a relation schema R (and of values within each tuple): We will
consider the attributes in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn> to be ordered.
• Values in a tuple: All values are considered atomic (indivisible). A special null value represents
unknown or inapplicable values to certain tuples.
Key Constraints
Superkey of R: A set of attributes SK of R such that no two tuples in any valid relation instance r(R)
will have the same value for SK. That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK].
Key of R: A "minimal" superkey; that is, a superkey K such that removal of any attribute from K results
in a set of attributes that is not a superkey.
It has two keys Key1 = {State, Reg#}, Key2 = {SerialNo}, which are also superkeys. {SerialNo,
Make} is a superkey but not a key.
If a relation has several candidate keys, one is chosen arbitrarily as theprimary key. The primary
key attributes are underlined.
Entity Integrity
Relational Database Schema: A set S of relation schemas that belong tothe same database. S is the
name of the database.
Entity Integrity: The primary key attributes PK of each relation schemaR in S cannot have null
values in any tuple of r(R). This is because primary key values are used to identify the individual tuples.
Referential Integrity
A constraint involving two relations (the previous constraints involve a single relation). Used to
specify a relationship among tuples in two relations, the referencing relation and the referenced
relation.
Tuples in the referencing relation R1 have attributes FK (foreign key attributes) that reference the
primary key attributes PK of the referenced relation R2. A tuple t1 in R1 is said to reference a tuple t2
in R2 if t1[FK] = t2[PK].
A referential integrity constraint can be displayed as a directed arc from R1 in a relational database
schema.FK to R2.
Attribute: A table column. Other commonly used terms for attributes are property and field. The set
of permissible values for each attribute is called the domain for that attribute.
Key: A single attribute or combination of attributes whose values uniquely identify the tuples of the
relation. Each row has a different value for the key attribute(s). The relational model requires that
every relation have a key and that:
• No two tuples may have the same key value and
• Every tuple must have a value for the key attribute (the key fields have non-null values).
There are two restrictions on the relational model that are sometimes circumvented in practice:
• Duplicate tuples are not permitted. If two tuples are entered with the same value for every
attribute, they are considered the same tuple.
• No ordering of tuples within a relation is assumed. In practice, however, one method of ordering
tuples is often used.
One of the main advantages of the relational model is that it is conceptually simple and, more
importantly, based on the mathematical theory of relation. It also frees the users from details of the
storage structure and access methods. The relational model, like all other models, consists of three
basic components:
• a set of domains
• a set of relations operation on relations
• integrity rules
In general, we say that a relation defined over n domains has a degree n or is n-ary. The elements of
this set are n-tuples. We shall distinguish between the definition of a relation and the relation itself.
We shall say that the definition of a relation gives a name to the relation and specifies the components
over which it is defined. These components are referred to as relation attributes or attributes for
short. An attribute has a domain associated with it from which it takes on values. The relation itself,
on the other hand, is the set of tuples which constitute it at a given instance of time.
For example, a statement that says that a relation supplier is built over attributes S#, P#, and SCITY
having domains integer and character string, respectively, is the definition of the relation supplier.
The relation itself is shown below. It must be noted at the time the definition of a relation is just given,
a relation with no tuples in it, i.e. a null relation, is just given, a relationwith no tuples in it, i.e. a null
relation, is created.
Supplier
S# P# SCITY
10 1 BANGALORE
10 2 BANGALORE
10 3 BANGALORE
11 1 BOMBAY
11 2 BOMBAY
5. SUMMARY
Let us summarise what we have studied in this unit:
Data models are fundamental frameworks used to structure and organise data within databases. They
provide a systematic approach to defining how data is stored, accessed, and manipulated. These
models serve as blueprints for constructing databases, guiding the database schema's design and
influencing the database system's integrity, performance, and scalability. Data models ensure that
data is logically organised and that relationships among data elements are clearly defined, facilitating
efficient data management and retrieval.
There are several types of data models, each with unique characteristics and applications. The
primary types include the Relational Data Model, Hierarchical Model, Network Data Model,
Object/Relational Model, Object-Oriented Model, and Entity-Relationship Model. Each model offers a
different perspective on how data can be structured and interconnected, catering to various use cases
and requirements. Understanding these models is essential for selecting the appropriate database
design that aligns with specific business needs and data processing demands.
The Relational Data Model, proposed by E.F. Codd, is the most widely used in database management
systems. It organises data into tables (relations) consisting of rows (tuples) and columns (attributes).
Each table represents a specific entity; relationships between tables are established through foreign
keys. This model emphasises data integrity and supports powerful querying capabilities using
Structured Query Language (SQL). Its simplicity, flexibility, and support for normalisation make it a
popular choice for various applications.
The Hierarchical Model arranges data in a tree-like structure with parent-child relationships, where
each child can have only one parent. This model efficiently represents one-to-many relationships but
can be rigid and complex for many-to-many relationships. On the other hand, the Network Data Model
allows more complex relationships by organising data as a graph with nodes and edges. It supports
many-to-many relationships and is more flexible than the hierarchical model but can be more
challenging to design and maintain.
The Object/Relational Model extends the relational model by incorporating object-oriented concepts,
allowing complex data types and object identity. The Object-Oriented Model treats data as objects,
like object-oriented programming, enabling inheritance, encapsulation, and polymorphism. This
model is suitable for applications requiring complex data representations and behaviours. The Entity-
Relationship (ER) Model visually represents data entities and their relationships, using ER diagrams
to illustrate the database structure. It provides a high-level conceptual view, aiding database design
and stakeholder communication.
Last is the relational model, which is a cornerstone of modern database systems, emphasising data
organisation in tables with well-defined relationships. It relies on principles such as normalisation to
reduce data redundancy and ensure data integrity. SQL is the primary language used to interact with
relational databases, enabling efficient data manipulation, querying, and transaction management.
The model's strength lies in its simplicity, robustness, and scalability, making it suitable for a wide
range of applications, from small-scale systems to large enterprise databases.
6. GLOSSARY
A data model defines the structure of a database and determines how data
is stored, organised, and manipulated. It includes the data types,
Data Model - relationships, and constraints that apply to the data. Examples include
The relational model organises data into tables (or relations) consisting
of rows and columns. Each table represents a different entity;
Relational Model - relationships between tables are established through foreign keys. This
model emphasises data integrity and allows for powerful query
capabilities using SQL.
Integrity constraints are rules that ensure the accuracy and consistency of
Integrity data within a database. They include primary keys, foreign keys, unique
- constraints, and check constraints. These constraints help maintain data
Constraints
quality and enforce business rules.
7. SELF-ASSESSMENT QUESTIONS
SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 ________ is a set of principles concerned with determining whichstates are valid for a
database.
2 In the Primitive data models approach, objects are represented by ___________structures
grouped in file structures.
3 In the __________data model, the database is represented as a group of related tables.
4 Each tuple in a relation must be__________________; that is, there can be no duplicates.
5 A tuple is an ________________set of values.
6 In relational integrity constraints, there are main types of constraints.
7 If a relation has several candidate keys, one is chosen arbitrarily to be the ________________key.
8 The set of permissible values for each attribute is called the ______________for that attribute.
9 A single attribute or combination of attributes whose values uniquely identify the ____________of
the relation.
10 The hierarchical data model organises data in a ________________ structure.
11 In a hierarchical database, the parent-child relationship is _______________.
12 The popularity of the network data model coincided with the popularity of the data model.
13 Object DBMSs add database functionality to ___________________programming languages.
14 A major benefit of the Object-Oriented data model approach is the _____________of the
application.
15 Which of the following is a property of relational tables?
a) Values are non-atomic
b) Each row is unique
c) The sequence of rows is significant
d) Columns can have duplicate names
16 Who developed the relational model for databases?
a) Charles Bachman
b) Edgar F. Codd
c) Peter Chen
d) Larry Ellison
17 In a relational database, what term describes a table?
a) Schema
b) Attribute
c) Relation
d) Domain
18 What is the primary key used in a relational table?
a) To store null values
b) To enforce referential integrity
c) To uniquely identify each tuple
d) To define the data type of each column
19 Which of the following is a characteristic of a superkey in a relational database?
a) It allows duplicate values for key attributes.
b) It ensures no two tuples have the same value for the key attributes.
c) It permits null values for primary key attributes.
d) It does not guarantee the unique identification of tuples.
8. TERMINAL QUESTIONS
1. Define a Data Model.
2. Classify the principles of the Data model.
3. List the types of data Model.
4. Explain the generations of architectural data models.
5. What is a Relational Model?
6. Compare & contrast the Hierarchical Model vs. the Network Model.
7. Outline the properties of the Object/Relational database management system.
8. Explain the Entity-relationship model.
9. Describe a relational model.
10. Explain relational schema
11. List the characteristics of relations.
12. What are Relational Integrity Constraints?
13. Differentiate between Entity Integrity vs. Referential Integrity.
14. Define a Tuple.
15. Distinguish between an attribute and a Key.
9. ANSWERS
Answer 2: The principles of data models include data abstraction, data independence, and data
integrity. Data abstraction simplifies database design, data independence ensures changes do not
affect applications, and data integrity maintains accuracy and consistency. Refer to section 2 for
more details.
Answer 3: Data models include hierarchical, network, relational, object-oriented, and entity-
relationship (ER) models. Each type has unique characteristics and is suited to different database
design requirements. Refer to section 3 for more details.
Answer 4: The generations of architectural data models evolved from file-based systems to
hierarchical and network models, followed by the relational model and, more recently, object-
oriented and NoSQL databases. Each generation addresses the limitations of the previous ones. Refer
to section 3 for more details.
Answer 5: The relational model organises data into tables, or relations, with rows and columns. It
uses keys to establish relationships between tables and allows for powerful data querying and SQL
manipulation. It emphasises data integrity and simplicity. Refer to section 3.1 for more details.
Answer 6: The hierarchical model organises data in a tree-like structure with parent-child
relationships, while the network model uses a graph structure with nodes and interconnected
relationships. Unlike the hierarchical model, the network model supports many-to-many
relationships. Refer to section 3.2 for more details.
Answer 7: An ORDBMS supports complex data types, inheritance, and encapsulation, blending
features of relational and object-oriented databases. It allows for storing and manipulating objects
and provides robust querying capabilities through SQL. Refer to section 3.4 for more details.
Answer 8: The ER model represents data using entities, attributes, and relationships. Entities are
objects, attributes are properties of entities, and relationships connect entities. It is used to design
database schemas conceptually before physical implementation. Refer to section 3.6 for more
details.
Answer 9: The relational model structures data into tables with rows and columns, using keys to
establish relationships. It ensures data integrity through constraints and allows for flexible querying
using SQL. It is widely used for its simplicity and robustness. Refer to section 4 for more details.
Answer 10: A relational schema defines the structure of a relational database, including tables,
columns, and the relationships between them. It specifies each column's data types and constraints
and is a blueprint for organising data. Refer to section 3.6 for more details.
Answer 11: Relations have unique table names, distinct rows (tuples), distinct column names
(attributes), atomic attribute values, and unordered rows and columns. These characteristics ensure
data consistency and integrity in a relational database. Refer to section 4 for more details.
Answer 112: Relational integrity constraints ensure data accuracy and consistency in a relational
database. They include primary keys, foreign keys, unique constraints, and check constraints, helping
to enforce business rules and maintain data quality. Refer to section 4 for more details.
Answer 13: Entity integrity ensures that each table has a unique primary key and that no part of the
primary key is null. Referential integrity ensures that foreign keys correctly reference primary keys
in related tables, maintaining consistent relationships between tables. Refer to section 3.6 for more
details.
Answer 14: A tuple is a row or record in a relational database table representing an instance of an
entity. It contains specific values for each attribute defined in the table schema, forming a complete
set of related data. Refer to section 3.6 for more details.
Answer 15: An attribute is a column in a database table representing a data field, such as name or
age. A key is a specific attribute or set of attributes used to uniquely identify rows in a table, such as
primary and foreign keys. Refer to section 3.6 for more details.
10. REFERENCES
1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
APPLICATION
SEMESTER 2
DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 4 – Introduction to the ER Model 1
DCA1208: Database Management System
Unit - 4
Introduction to the ER Model
DCA324
KNOWLEDGE MANAGEMENT
Unit: 4 – Introduction to the ER Model 2
DCA1208: Database Management System
TABLE OF CONTENTS
1 Introduction - -
4-5
1.1 Objectives - -
Concept of Entity-Relationship Model 1, 2, 3, 4, 5, 6,
2 7, 8, 9, 10, 11, -
12, 13, 14
2.1 Symbols Used in ER Model - -
2.2 Entities and Attributes 6-23
- -
2.3 Relationships - -
2.4 Keys - -
3 Summary - - 24-25
4 Glossary - - 26
6 Terminal Questions - - 32
7 Answers - -
7.1 Self-Assessment Questions - - 33-37
7.2 Terminal Questions - -
8 References - - 38
1. INTRODUCTION
The previous unit focused on Data models, which are frameworks for structuring and organising data.
They define how data is stored, organised, and manipulated. There are several types of data models,
each with unique characteristics. First, we saw the Relational Data Model, which Organises data into
tables (relations) with rows and columns. Each row represents a record, and each column represents
an attribute. It is widely used due to its simplicity and flexibility. Second is the Hierarchical Model:
This model structured the data in a tree-like format, where each child node has a single parent. It is
efficient for hierarchical data but lacks flexibility for more complex relationships. Third is the
Network Data Model, which is similar to the hierarchical model but allows multiple parent nodes to
form a graph structure. It is more flexible than the hierarchical model but can be complex to manage.
Fourth, we studied the Object/Relational Model, which extends the relational model by integrating
object-oriented concepts, allowing the storage of complex data types and structures. Fifth is the
object-oriented model, which represents data as objects, like object-oriented programming. This
model is beneficial for applications requiring complex data representations. Sixth was the Entity-
Relationship Model, which uses entities, attributes, and relationships to represent data. It is primarily
used for database design and conceptual data modelling. An overview of the relational model
emphasises using tables to represent data, focusing on data integrity and using keys to establish
relationships between tables.
In this unit, we will learn that a relational database is a collection of relations with distinct relation
names. The relational database schema is the collection of schemas for the relations in the database.
For example, a university database contains schemas for relations called Students, Faculty, Courses,
Rooms, Enrolled Numbers, Teachers, etc.
We will learn that an instance of a relational database is a collection of relation instances, one per
relation schema in the database schema; of course, each relation instance must satisfy the domain
constraints in its schema.
The entity-relationship model is a tool for analysing an application's semantic features independent
of events. This approach includes a graphical notation depicting entity classes as rectangles,
relationships as diamonds, and attributes as circles or ovals. For a complex situation, a partial entity-
relationship diagram may be used to present a summary of the entities and relationships that do not
include the details of the attributes.
Last, we will outline how the entity-relationship diagram provides a convenient method for
visualisingthe interrelationships among entities in each application and how this tool has proven
helpful in transitioning from an information application description to a formal database schema.
1.1. Objectives
After completing this Unit, you should be able to:
• Outline the Entity-Relationship Model
• Define an Entity.
• Explain an Attribute.
• Describe a Relationship.
• Classify the different types of Keys
The Entity-Relationship (ER) Model is a conceptual framework for data modelling and database
design. It provides a high-level view of the data structure by representing entities, objects or concepts
that can have data stored about them and the relationships between them. Each entity is
characterised by attributes that describe its properties, while relationships illustrate how entities
interact.
The basic terms that are used for describing important entity-relationship concepts are as follows:
For example, A student, an employee, a bank a/c, etc., are all entities.
ENAME
DEPT
In a Database Management System (DBMS), an entity has several key characteristics that define its
role and attributes within the database. These characteristics include:
1. Uniqueness: Each entity must be distinguishable from others within the database. This
uniqueness is typically achieved using a primary key, uniquely identifying each entity instance.
2. Attributes: Entities are described by attributes, which are properties or characteristics that
provide more details about the entity. For example, a "Person" entity might have attributes such
as Name, Age, and Address.
3. Entity Type: An entity is classified into an entity type or entity class, which groups similar
entities with shared attributes. For example, "Employee" or "Customer" are entities that group
entities with common attributes relevant to their respective domains.
4. Relationships: Entities can participate in relationships with other entities, which define how
entities interact with one another. Relationships establish the associations and dependencies
between different entities within the database.
5. Instance: An entity type includes multiple instances. Each instance represents a specific
occurrence of the entity type, such as a particular employee or customer, distinguished by
unique values for the entity's attributes.
6. Integrity Constraints: Entities are subject to integrity constraints to ensure data accuracy and
consistency. Constraints like primary keys, foreign keys, and unique constraints enforce rules
on how data is stored and related.
7. Identification: Each entity must be identifiable through a unique identifier or key. This key
ensures that each entity instance can be referenced and retrieved accurately from the database.
Entity Types
Strong Entity Types: Also known as regular entities, these have a unique identifier or primary key
that distinguishes each instance of the entity type. They do not rely on other entities for their identity.
Weak Entity Types: These are entities that do not have a unique identifier of their own and rely on
a strong entity (also known as an owner entity) for identification. Weak entities are identified by a
combination of their attributes and the primary key of the strong entity they depend on.
For example, OrderItem May rely on the Order entity to uniquely identify individual items within an
order.
Derived Entity Types: These are entities whose attributes can be derived from other entities. For
example, Age Could be derived from a Person entity's Date of Birth attribute.
A simple example of a Strong and weak entity: As seen below, we have a customer as the strong entity
and address as the weak entity, with respective attributes such as Name, Customer_id, Age, Gender,
Town, Locality, and State.
Customer id Locality
Strongly Entity
Type
Gend Name Week
er Entity Type
Entity Set:
An entity set is a collection of entities of the same type. It is essentially a grouping of all instances of a
particular entity type that share common attributes. Each entity in the set has its own unique values
for these attributes, but all entities within the set have the same attribute structure.
name
roll_no class
Student
b) Composite attribute: A composite attribute can be divided into smaller sub-parts, each
representing a more basic attribute with independent meanings.
Example: The address can be further split into house number, street number, city, state,
country, and pin code; the name can also be split into first name, middle name, and last name.
name
roll_no address
Student
c) Single attributes: Single attributes have a single value for a particular entity.
The attribute which takes up only a single value for each entity instance is a single-valued attribute.
name
roll_no DOB
Student
Student
e) Derived Attribute: A derived attribute's value can be derived from other attributes or entities.
Derived attributes are attributes that can be derived from other attributes. They are
represented by a dotted oval shape.
Example: Total and average marks of a student and employee age derived from date of birth.
Student
f) Domain: The conceptual definition of attributes is a named set of scalar values, all of the same
type, and a pool of possible values.
Example: The gender attribute might have a domain of {Male, Female, Other}.
Relationships help define how entities interact with each other, enabling the database to model real-
world scenarios effectively. For instance, consider the relationship between PERSONS and
AUTOMOBILES. This relationship could be described as "OWNS," indicating that automobiles are
owned by people.
Examples of Relationships
In this case, each person (an instance of the PERSONS entity) can own multiple automobiles
(instances of the AUTOMOBILES entity).
One person owns each automobile. This can be visualised in the database schema as:
Person ID Name
1 Raj
2 Vincent
Table.1 Relationship
Here, OwnerID in the AUTOMOBILES table is a foreign key referencing PersonID in the PERSONS table,
establishing the "OWNS" relationship.
In DBMS, relationships can be classified based on the number of entities involved. Binary and ternary
relationships are two such types:
Relationships
Binary Ternary
Binary Relationships: Binary relationships involve two entities. They are the most common type of
relationship in relational databases and describe how two entities are related to each other.
Ternary Relationships: A ternary relationship involves three entities. It is used when a binary
relationship cannot capture the complexity of the relationship between entities. A ternary
relationship connects three entities simultaneously and is typically represented using an associative
entity or relationship in an Entity-Relationship (ER) diagram.
Example: Consider an order system where the entities are Customer, Product, and Store. A ternary
relationship is needed if we want to track which customer bought which product from which store.
In this case, the relationship involves three entities, which cannot be fully expressed using binary
relationships.
Is-a Hierarchies
Is-a hierarchies are a special type of relationship that allows for attribute inheritance. This is
fundamental in object-oriented modelling and can also be applied in database design. In an Is-a
hierarchy, a subclass inherits attributes and relationships from its superclass.
In this scenario, An Automobile entity might have attributes such as Model and serial number. A Truck
is a specific type of Automobile, which means it inherits the Model and serial number attributes from
the Automobile entity.
2.4. Keys:
In a Database Management System (DBMS), keys are fundamental to ensuring data integrity and
enabling efficient data retrieval. Keys uniquely identify records in a table, establish relationships
between tables, and enforce various constraints. Here are the main types of keys used in DBMS. A key
is an identifier that uniquely differentiates one entity instance from all others in the entity
Types of Keys
Primary Key Candidate Key Super Key Foreign Keys Composite Key
1. Primary Key
A primary key is a unique identifier for a record within a table. Every table is limited to having a single
primary key, which is not allowed to have null values. It guarantees that every record is distinct and
can be identified independently.
1 Raj 21
2 Vincent 22
• Example: In an Employee table, both EmployeeID and Email can be candidate keys.
3. Super Key
A super key is a set of one or more attributes that can uniquely identify a record in a table. It includes
candidate keys and any other combination of attributes that uniquely identifies records.
• Example: In a Customer table, both {CustomerID} and {CustomerID, Email} can be super keys.
1 [email protected] Raj
2 [email protected] Vincent
• Example: In an Order table, CustomerID can be a foreign key referencing the CustomerID in
the Customer table.
5. Composite Key
A composite key is a primary key composed of two or more attributes. It is used when a single
attribute is not sufficient to uniquely identify a record.
STUDENT
Customer Table
If you have another table, like orders, the customer_id from the customer table could serve as a
foreign key in the orders table to establish a relationship between customers and their orders.
Orders Table
101 1 TV 1 2024-01-02
Foreign Key: customer_id, which Links to the customer_id in the customer table, establishing a
relationship between customers and their orders.
An Entity-Relationship (ER) schema is a conceptual framework for modelling the data structure of a
database. It uses ER diagrams to visually represent entities, their attributes, and the relationships
between them. An ER schema may identify certain constraints to which the data content must
conform.
Cardinality Specifies the number of instances of one entity that can or must be associated with each
instance of another entity.
a) One-One Relationship. For each entity instance in one entity, there is at most one
associated entity instance in the other entity.
For example, each person has one unique passport, and each passport is assigned to one person.
Tables:
• Person: PersonID, Name
• Passport: PassportID, PersonID, PassportNumber
1 1
Person Has Passport
b) Many-One Relationships. One entity instance in entity E2 is associated with zero or more
entity instances in entity E1, but each entity instance in E1 is associated with at most one entity
instance inE2.
• The Characteristics are as follows:
a. Each row in Table B can be linked to multiple rows in Table A.
b. Each row in Table A is linked to one and only one row in Table B.
For example, many employees work in one department, but each employee works in only
one department.
Tables:
• Employee: EmployeeID, Name, DepartmentID
• Department: DepartmentID, DepartmentName
m 1
Employees Hav Department
e
c) Many-Many Relationships There are no restrictions on how many entity instances in either
entity are associated with a single entity instance in the other.
An example of a many-to-many relationship would be students taking classes. Each student takes
many classes, and each class has many students.
m n
Class Has Students
Mapping cardinality is derived from cardinality constraints. The cardinality constraint between two
entities E1 and E2, denoted by (m,n), specifies that an instance in E1 appears in E2 at least m andat
most n times. Mapping cardinality takes the maximum number of cardinality constraints for each
entity in a relationship.
2. Existence dependence:
Existence dependence in a Database Management System (DBMS) refers to a situation where one
entity's existence is dependent on another entity's existence. In other words, an entity cannot exist
without being associated with another entity. This concept is often represented through weak
entities in an Entity-Relationship (ER) model.
1. Weak Entities:
o A weak entity is one that does not have a sufficient unique identifier on its own. Instead, it
relies on a "strong" or "owner" entity to provide a unique context.
o The primary key of a weak entity is typically a combination of its own partial key and the
primary key of the strong entity it depends on.
2. Owner Entity:
o The strong entity that provides the primary key for the weak entity is known as the owner
entity. The weak entity cannot exist without its owner entity.
3. Identifying Relationship:
o The relationship between the weak entity and the strong entity is called an "identifying
relationship." This relationship helps uniquely identify the weak entity.
o In ER diagrams, ◦ The identifying relationship is often represented by a double line
connecting the weak entity to the owner entity.
4. Partial Key:
o A weak entity typically has a partial key (also called a discriminator) that uniquely identifies
its instances in conjunction with the primary key of the owner entity.
3. SUMMARY
Let us summarise what we have studied in this unit:
• The Entity-Relationship (ER) model is a fundamental framework in database design that
helps conceptualise and structure data by representing entities, their attributes, and their
relationships. It provides a visual and conceptual way to model the data and its interactions
within a database system, facilitating a clear and organised approach to database design. The ER
model serves as a blueprint for creating a physical database schema, ensuring the data is
represented accurately and efficiently.
• Entities are the core components of the ER model, representing real-world objects or concepts
with distinct existence within the database. Each entity is characterised by its attributes, which
define its properties or characteristics. For example, in a university database, Student and
Course can be entities, with attributes like StudentID, Name, and CourseName describing their
respective properties. Entities can be simple or composite, and their attributes can include
primary keys uniquely identifying each entity instance.
• Relationships describe how entities interact with each other and establish connections
between them. The ER model depicts relationships as diamonds or lines connecting entities.
These relationships can be classified into various types based on their cardinality: one-to-one,
one-to-many, many-to-one, and many-to-many. For instance, a Student might be associated with
multiple Courses through an Enrolls relationship, while each course can have multiple students.
Relationships help understand how data is interrelated and facilitate efficient data retrieval and
manipulation.
• Keys are crucial elements in the ER model that ensure the uniqueness and integrity of data
within the database. The primary key uniquely identifies each record within an entity and
ensures that no two records are identical. In addition to primary keys, other types of keys
include candidate keys, which are potential primary keys, and foreign keys, which establish links
between related entities. Composite keys, consisting of multiple attributes, are used when a
single attribute is insufficient to identify a record uniquely. Keys are vital in maintaining data
accuracy and enabling effective data management.
• The Entity-Relationship model is an essential tool in database design that provides a structured
approach to representing data. Entities and attributes define the core components of the
database, relationships describe the interactions between entities, and keys ensure data
integrity and uniqueness. Together, these elements form a comprehensive framework for
designing, implementing, and managing databases, ensuring that the data is well-organised,
accurate, and easily accessible.
4. GLOSSARY
ER Model (Entity- A conceptual framework for data modelling that visually represents
Relationship - entities, their attributes, and their relationships. It is used to design and
Model) structure databases by defining the data and its interrelationships.
Attribute - it. For instance, an "Employee" entity might have attributes like
"EmployeeID," "Name," and "HireDate."
Management - manipulation. It provides tools for storing, retrieving, and managing data
System) efficiently.
Key - uniquely. Keys ensure that each record is distinct and can be accurately
retrieved or referenced.
Defines the number of instances of one entity that can be associated with
Cardinality - each instance of another entity. It specifies the maximum and minimum
number of occurrences in a relationship.
5. SELF-ASSESSMENT QUESTIONS
SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 Which of the following is a primary purpose of the Entity-Relationship (ER) model?
A. To design the physical database schema
B. To provide a high-level conceptual framework for data modelling
C. To ensure data encryption
D. To manage database performance
2 What distinguishes a strong entity from a weak entity in the ER model?
A. Strong entities rely on other entities for identification, while weak entities do not.
B. Strong entities have a unique primary key, while weak entities do not have a unique
identifier on their own.
C. Strong entities can be abstract concepts, while weak entities are always physical objects.
D. Strong entities cannot participate in relationships, while weak entities can.
3 In an ER diagram, which term describes a collection of entities of the same type?
A. Entity Type
B. Entity Instance
C. Entity Set
D. Entity Attribute
4 What is an example of a derived attribute in the ER model?
A. Employee ID
B. Date of Birth
C. Age
D. Name
5 Which type of key uniquely identifies each instance of an entity within an entity type?
A. Foreign Key
B. Candidate Key
C. Primary Key
D. Composite Key
6 What does the term ‘relationship’ refer to in the ER model?
C. Single Attribute
D. Multi-Valued Attribute
13 How is a composite attribute different from a simple attribute?
A. A composite attribute cannot be further divided, while a simple attribute can be.
B. A composite attribute can be divided into smaller sub-parts, while a simple attribute cannot
be subdivided.
C. A composite attribute has only one value, while a simple attribute can have multiple values.
D. A composite attribute is always derived from other attributes.
14 Which of the following is an example of a derived attribute?
A. Employee ID
B. Phone Number
C. Total Marks
D. Address
15 What does a foreign key do in a database?
A. Uniquely identifies a record within its own table
B. Establishes a relationship between two tables
C. Enforces unique constraints on a table
D. Defines a composite key
16 Which key type is used to uniquely identify a record in a table and cannot be null?
A. Foreign Key
B. Composite Key
C. Primary Key
D. Candidate Key
17 What is a composite key in the context of database keys?
A. A key that is formed by a single attribute
B. A key that is used to identify records within its own table
C. A primary key composed of two or more attributes
D. A key used to reference another table's primary key
18 In an Is-a hierarchy, what does a subclass inherit from its superclass?
A. Only the attributes
B. Only the relationships
C. Both attributes and relationships
B. Weak Entity
C. Derived Entity
D. Composite Entity
24 How is the identifying relationship between a weak entity and its owner entity represented in
an ER diagram?
A. By a single line
B. By a dashed line
C. By a double line
D. By a double oval
25 What role does the partial key play in a weak entity?
A. It uniquely identifies instances of the weak entity on its own.
B. It helps in linking the weak entity to the owner entity.
C. It defines the attributes of the weak entity.
D. It provides a unique identifier for the owner entity.
6. TERMINAL QUESTIONS
1. Define an Entity.
2. What is an attribute?
3. List the characteristics of an Entity.
4. Classify the entity types.
5. Differentiate between a Strong entity vs. a Weak Entity.
6. Explain an Entity set.
7. How is an entity type different from an entity instance?
8. List the Characteristics of an attribute.
9. Classify the types of attributes.
10. What is a derived attribute?
11. Define a domain in DBMS.
12. Explain Is a hierarchies with an example.
13. Identify the importance of keys in DBMS
14. Classify the types of keys.
15. How is a foreign key different from a composite key?
16. Define cardinality.
17. Write a short note on types of cardinalities.
18. What is Existence Dependence?
19. List the Characteristics of Existence Dependence.
20. Why Existence Dependence Matters. Justify
7. ANSWERS
Answer 5: A strong entity has a primary key that uniquely identifies each instance. In contrast, a weak
entity cannot be uniquely identified by its attributes alone and relies on a strong entity's primary key
for identification. For example, an "Order" (strong entity) can have multiple "OrderItems" (weak
entity) that depend on the "Order" for unique identification. Refer to section 2.1 for more details
Answer 6: An entity set is a collection of similar types of entities that share the same attributes. For
example, a set of all employees in a company forms an "Employee" entity set, where each employee
is an instance within the set. Refer to section 2.1 for more details
Answer 7: An entity type is a collection of entities that share the same attributes and define the
structure and properties of the entities. An entity instance is a specific, individual occurrence of an
entity type. For instance, "Employee" is an entity type, while "John Doe" is an instance of the
"Employee" entity type. Refer to section 2.1 for more details
Answer 10: A derived attribute is an attribute whose value is calculated or derived from other
attributes. For example, "Age" can be derived from the "Date of Birth" attribute. Refer to section 2.1
for more details
Answer 11: A domain is the set of allowable values that an attribute can take. It defines the
permissible values for a given attribute, such as {Male, Female, Other} for a "Gender" attribute. Refer
to section 2.1 for more details
Answer 12: Is-a hierarchies represent inheritance relationships where a subclass inherits attributes
and relationships from a superclass. For example, in a database, a "Truck" is a subclass of
"Automobile" and inherits attributes like "Model" and "Serial Number" from the "Automobile"
superclass. Refer to section 2.2 for more details
Answer 13: Keys are crucial in DBMS for ensuring data integrity, establishing relationships between
tables, and enabling efficient data retrieval. They uniquely identify records, enforce uniqueness, and
maintain referential integrity across tables. Refer to section 2.3 for more details
Answer 15: A foreign key is an attribute or a set of attributes in one table referencing the primary
key in another table, establishing a relationship between the tables. A composite key is a primary key
composed of two or more attributes used to identify a record within the same table uniquely. Refer
to section 2.3 for more details
Answer 16: Cardinality specifies the number of instances of one entity that can or must be associated
with each instance of another entity in a relationship. It defines the minimum and maximum
occurrences of entities involved in a relationship. Refer to section 2.3 for more details
Answer 18: Existence dependence refers to a situation where one entity's existence relies on the
existence of another entity. This is typically seen in weak entities that cannot be uniquely identified
without a strong or owner entity. Refer to section 2.3 for more details
Answer 20: Existence dependence is important for maintaining data integrity, normalisation, and
referential integrity. It ensures that weak entities are properly linked to strong entities, reducing
redundancy and ensuring that dependent data is accurately represented and managed within the
database. Refer to section 2.3 for more details
8. REFERENCES
1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
APPLICATION
SEMESTER 2
DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 5 – ER Diagrams 1
DCA1208: Database Management System
Unit - 5
ER Diagrams
DCA324
KNOWLEDGE MANAGEMENT
Unit: 5 – ER Diagrams 2
DCA1208: Database Management System
TABLE OF CONTENTS
1 Introduction - -
4-5
2.1 Objectives - -
5 Glossary - - 28-29
7 Terminal Questions - - 33
8 Answers - -
8.1 Self-Assessment Question - - 34-35
8.2 Terminal Questions - -
9 References - - 36
Unit: 5 – ER Diagrams 3
DCA1208: Database Management System
1. INTRODUCTION
In the previous unit, we studied the Entity-Relationship (ER) Model, exploring the fundamental
framework used in database design to represent data structures visually. We learnt how the ER model
helps define the data elements (entities) and their interrelationships, serving as a blueprint for
constructing databases. It simplifies the process of conceptualising and structuring data by using
diagrams to illustrate how different entities relate to one another. We also saw how this model is
especially useful during the initial stages of database design, as it provides a clear and organised way
to map out the data requirements before moving into more complex relational database schemas.
The ER model taught us that the primary components are crucial. Entities and Attributes refer to the
objects of interest in the database and their properties. Entities are usually nouns, such as "Student"
or "Course," while attributes provide additional details about these entities, like a student's name or
course credits. Relationships describe how entities interact with each other, indicating connections
like "Enrollment" between a student and a course. Keys are critical in the ER Model, with primary
keys uniquely identifying each entity instance, ensuring data integrity. Foreign keys are used to link
entities together, representing the relationships between them. Therefore, The ER Model provides a
clear, organised method for designing databases, ensuring the structure accurately reflects the real-
world scenario it intends to model.
In this unit, we will explore the foundational elements that make up an Entity-Relationship (ER)
diagram, which is a critical tool in database modelling. ER diagrams visually represent the structure
of a database by outlining entities, their attributes, and the relationships between them. We'll learn
the specifics of entities, objects or concepts that can be distinctly identified and their attributes, which
are properties or characteristics of these entities.
Moving on to The Entity-Relationship Model, this topic will cover the theoretical foundation of ER
diagrams, which includes the symbols used to represent entities, relationships, and attributes. We'll
learn about the standard notation for these components, such as rectangles for entities, diamonds for
relationships, and ovals for attributes. In the
section on Symbols Used in the E-R Model, we'll go deeper into the specific symbols and their
meanings, which are essential for correctly interpreting and creating ER diagrams. Following this, we
Unit: 5 – ER Diagrams 4
DCA1208: Database Management System
will study Some Examples of ER Diagrams to see how these concepts are applied in real-world
scenarios, providing practical insights into database design. Finally, the topic of Converting ER
Diagrams into Tables will teach us how to translate these visual representations into actual
database tables, a critical step in implementing a database design. This process involves mapping
entities to tables, attributes to columns, and relationships to foreign keys, ensuring that the database
structure accurately reflects the original ER diagram.
1.1. Objectives
By the end of the Unit 5, the learners should be able to
understand:
• Define the Entity-Relationship Model.
• Identify the Components of the ER Diagram.
• List the Symbols Used in the E-R Model.
• Explain how to Convert ER Diagrams into Tables
Unit: 5 – ER Diagrams 5
DCA1208: Database Management System
The entity-relationship model is a tool for analysing an application's semantic features independent
of events. This approach includes a graphical notation depicting entity classes as rectangles,
relationships as diamonds, and attributes as circles or ovals. For a complex situation, a partial entity-
relationship diagram may present a summary of the entities and relationships that do not include the
details of the attributes.
attention to the efficiency of the physical database design. The entity-relationship diagrams are then
turned into a logical schema in which the database is implemented.
2. Attributes: Attributes define a database's schema and are essential for ensuring data integrity,
retrieval, and manipulation within a DBMS. They describe the properties of entities and
relationships.
Types of Attributes in DBMS:
o Simple (Atomic) Attribute: An attribute that cannot be divided into smaller components.
Example: FirstName, Age, EmployeeID.
o Composite Attribute: An attribute that can be subdivided into smaller, meaningful
components, which themselves are attributes.
Example: Address might comprise Street, City, State, and ZipCode.
Unit: 5 – ER Diagrams 6
DCA1208: Database Management System
4. Keys: A key is an identifier that uniquely differentiates one entity instance from all others in the
entity.
a) Primary Key: Identifier used to uniquely identify one particular instance of an entity.
A primary key
• can be one or more attributes (e.g., consider substituting asingle concatenated key
attribute for multiple attribute keys)
• must be unique within the domain (not just the current data set),
• its value should not change over time,
• must always have a value, and
• is created when no prominent attribute exists. Each instance hasa value.
b) Candidate Key: When multiple possible identifiers exist, each is a candidate key.
c) Concatenated Key: A key comprises parts that, when combined, become a unique
identifier. Multiple attribute keys are concatenated keys.
d) Borrowed Key Attributes: If an is-a relationship exists, the key of the more general entity
is also a key of the sub-entities. For example, if the serial number is a keyfor automobiles, it
would also be a key for trucks.
Unit: 5 – ER Diagrams 7
DCA1208: Database Management System
e) Foreign Keys. Foreign keys reference a related table through the primary key of that
related table.
An ER schema may identify certain constraints to which the data content must conform.
Unit: 5 – ER Diagrams 8
DCA1208: Database Management System
• Weak Entity: Weak entities, which depend on another entity for their existence, are depicted
with double rectangles, and their identifying relationships are shown with double diamonds.
• Relationship: Relationships between entities are shown using diamonds. The diamond is
connected to the involved entities and describes the nature of the relationship.
• Identifying Relationship: This type of relationship, crucial for connecting weak entities to their
strong entities, is depicted using a double diamond. An identifying relationship indicates that
the weak entity's existence depends on the strong entity and that the weak entity's key is
partially or fully derived from the strong entity.
• Attributes: Attributes are properties or characteristics of entities and are depicted as ovals
connected to their respective entities. Variations include key attributes, which are underlined,
and multivalued attributes, which are depicted as double ovals.
• Key Attribute: A key attribute uniquely identifies each instance of an entity. It is underlined in
the diagram to distinguish it from other attributes.
Unit: 5 – ER Diagrams 9
DCA1208: Database Management System
• Derived Attribute: Represented by a dashed oval, a derived attribute is one whose value is
calculated or derived from other attributes.
• Cardinality: Cardinality is not always a separate symbol but is often represented alongside
relationships to indicate the nature of the relationship (e.g., one-to-one, one-to-many, or many-
to-many).
• Total Participation: Total participation indicates that every instance of an entity must
participate in a relationship. It is represented by a double line connecting the entity to the
relationship diamond.
• Structural Constraint: Structural constraints define the rules and limitations on the
relationships between entities, focusing on two main aspects: cardinality (which we’ve
discussed) and participation (total or partial). Cardinality constraints specify the maximum
Unit: 5 – ER Diagrams 10
DCA1208: Database Management System
• User Registration: Allows users to create accounts and manage their profiles, and User
Authentication implements secure login mechanisms to authenticate users.
• Event Creation: This feature enables event organisers to create and manage events. It also
allows users to search for events based on criteria such as location, date, category, and keywords.
Unit: 5 – ER Diagrams 11
DCA1208: Database Management System
• Ticket Booking: This allows users to book tickets for events. Ticket Printing and Delivery also
provide users with options to print or receive tickets electronically.
• Payment Processing: Payments are processed securely for ticket bookings. Payment
Confirmation also confirms users' ticket purchases and payment status.
• Event Management: Provide tools for event organisers to manage event details, such as dates,
venues, and ticket prices.
Unit: 5 – ER Diagrams 12
DCA1208: Database Management System
• event_id: The event this ticket is associated with (foreign key referencing the Event table).
• order_id: The order this ticket belongs to (foreign key referencing the Order table).
• price: The price paid for the ticket.
• type: It shows the type of ticket, for example, “General Admission” or “VIP.”
• seat_number: The seat number associated with this ticket (may be null for general
admission).
5. Order: This table stores information about user ticket orders.
• order_id: This is a unique identifier for each order, often an auto-incrementing integer value.
• user_id: The user who placed the order (foreign key referencing the User table).
• date: The date the order was placed.
• total_price: The total price of all tickets in the order.
• payment_id: The payment method used for the order (foreign key referencing the Payment
table).
6. Payment: This table stores information about payments made for orders.
• payment_id: This is a unique identifier for each payment, often an auto-incrementing integer
value.
• order_id: The order associated with the payment (foreign key referencing the Order table).
• payment_method: The method used for payment (e.g., “Credit Card”, “PayPal”).
• transaction_id: The unique identifier provided by the payment gateway for the transaction.
7. Speaker: This table stores information about speakers participating in events.
• speaker_id: This is a unique identifier for each speaker, often an auto-incrementing integer
value.
• name: The speaker’s full name.
• bio: A brief biography or description of the speaker.
• event_id: The event the speaker participates in (foreign key referencing the Event table).
Unit: 5 – ER Diagrams 13
DCA1208: Database Management System
This ER (Entity-Relationship) diagram represents a flight reservation system, showing how key
components such as airports, airplanes, flights, and reservations are interconnected. Let’s go step by
step to explain the entities, attributes, and relationships:
2. Airplane_Type:
o Attributes include Type_name, Max_seats, Company.
o Represents a classification of aeroplanes based on capacity and manufacturer.
3. Airplane:
o Attributesinclude Airplane_id, Total_no_of_seats.
o Represents specific airplanes, each belonging to a particular Airplane_Type.
Unit: 5 – ER Diagrams 14
DCA1208: Database Management System
4. Flight:
o Attributes: Number, Airline, Weekdays.
o Represents an overarching flight, which may consist of multiple legs.
5. Flight_Leg:
o Attributes: Leg_no, Scheduled_dep_time, Scheduled_arr_time.
o Represents a segment of a flight, such as one portion of a journey from one airport to another.
6. Fare:
o Attributes: Code, Amount, Restrictions.
o Represents the cost and restrictions of tickets for flights.
7. Leg_Instance:
o Attributes: Date, No_of_avail_seats.
o Represents a specific instance of a flight leg on a given date.
8. Reservation:
o Attributes: Customer_name, Cphone.
o Represents the details of a booking made by a customer.
9. Seat:
o Attributes: Seat_no.
o Represents a specific seat assigned to a customer as part of their reservation.
Unit: 5 – ER Diagrams 15
DCA1208: Database Management System
This ER Diagram represents a Hospital Management System and explains the relationships between
different entities in managing hospital operations, such as patients, doctors, nurses, rooms, bills, and
records. Let’s go step by step:
2. Doctor:
o Attributes: Dept (Department), Qualification.
o Represents doctors who consult patients.
3. Nurse:
o Inherits from Employee (has attributes E-ID, Name, Salary, Sex, Mob-No, Address, State, City,
Pin-no).
o Represents nurses who govern rooms and assist doctors.
4. Receptionist:
Unit: 5 – ER Diagrams 16
DCA1208: Database Management System
5. Rooms:
o Attributes: R-ID (Room ID), Type, Capacity, Availability.
o Represents rooms available in the hospital, assigned to patients.
6. Bills:
o Attributes: B-ID (Bill ID), Amount.
o Represents bills generated for patients.
7. Test Report:
o Attributes: Result, Test-Type, P-ID (Patient ID), R-ID (Room ID).
o Represents diagnostic or medical test results for patients.
8. Records:
o Attributes: Record-no, App-no.
o Represents records maintained by receptionists for patients’ history, appointments, and
other hospital-related activities.
Unit: 5 – ER Diagrams 17
DCA1208: Database Management System
Each diagram component is systematically mapped to database tables and columns when converting
an ER diagram into a relational database schema.
Mapping of N-ary
Relationship
Types
Unit: 5 – ER Diagrams 18
DCA1208: Database Management System
• A table is created for the weak entity, including its attributes and the primary key of the related
strong entity.
• The primary key of the weak entity table is a combination of its own attributes (if any) and the
primary key of the strong entity.
• Example: If you have a weak entity Dependent related to Employee, you create a Dependent
table with attributes like DependentName and EmployeeID, where EmployeeID comes from the
Employee table. The primary key would be a combination of DependentName and EmployeeID.
relationship (a department has many employees), add DeptID as a foreign key in the Employee table.
Unit: 5 – ER Diagrams 19
DCA1208: Database Management System
By following these steps, you can systematically map an ER diagram into a set of relational tables that
accurately represent the structure and constraints of the original model.
Consider the ER conceptual schema diagram for the COMPANY database as shown.
Unit: 5 – ER Diagrams 20
DCA1208: Database Management System
Step 1: Mapping of Regular Entity Types: we create the relations EMPLOYEE, DEPARTMENT, and
PROJECT in Figure 9.2 to correspond to the regular entity types EMPLOYEE, DEPARTMENT, and
PROJECT in Figure. The foreign key and relationship attributes, if any, are not included yet; they will
be added during subsequent steps. These include the attributes Super_ssn and Dno of EMPLOYEE,
Mgr_ssn and Mgr_start_date of DEPARTMENT, and Dnum of PROJECT. In our example, we choose Ssn,
Dnumber, and
Pnumber as primary keys for the relations EMPLOYEE, DEPARTMENT, and PROJECT, respectively.
The knowledge that the Dname of the DEPARTMENT and the Pname of the PROJECT are secondary
keys are kept for possible use later in the design.
The relations created from mapping entity types are sometimes called entity relations because each
tuple represents an entity instance. The result after this mapping step is shown below.
Unit: 5 – ER Diagrams 21
DCA1208: Database Management System
EMPLOYEE
DEPARTMENT
PROJECT
In this step, we create the relation DEPENDENT to correspond to the weak entity type DEPENDENT.
We include the primary key Ssn of the EMPLOYEE relation—which corresponds to the owner entity
type—as a foreign key attribute of DEPENDENT; we rename it Essn, although this is not necessary.
The primary key of the DEPENDENT relation is the combination {Essn, Dependent_name} because
Dependent_name is the partial key of DEPENDENT.
DEPARTMENT
Unit: 5 – ER Diagrams 22
DCA1208: Database Management System
Fig 8: Result of mapping the COMPANY ER schema into a relational database schema.
Unit: 5 – ER Diagrams 23
DCA1208: Database Management System
In our example, we create a relation DEPT_LOCATIONS. The attribute Dlocation represents the
multivalued attribute LOCATIONS of DEPARTMENT, while Dnumber—as foreign key—represents
the primary key of the DEPARTMENT relation. The primary key of DEPT_LOCATIONS is the
combination of {Dnumber, Dlocation}. A separate tuple will exist in DEPT_LOCATIONS for each
location that a department has.
Unit: 5 – ER Diagrams 24
DCA1208: Database Management System
Unit: 5 – ER Diagrams 25
DCA1208: Database Management System
4. SUMMARY
The Entity-Relationship (ER) Model is a fundamental conceptual framework used in database design
to visually represent data and their relationships within a system. At its core, the ER model helps
structure and define entities (real-world objects or concepts) and their relationships.
This model is widely used in the early stages of database design, providing a clear and concise way to
map out the data requirements and relationships in a system before implementation. The ER model
typically includes entities, attributes (characteristics of entities), and relationships, which define how
entities are connected. This model serves as a blueprint for creating a relational database schema.
ER diagrams are visual representations of the ER model with several vital components. The primary
components include entities, depicted as rectangles, and attributes, represented as ovals connected
to their respective entities.
Relationships between entities are shown as diamonds, with lines connecting them to the involved
entities. Relationships can have their own attributes, known as descriptive attributes. They may also
have cardinality constraints, which define the nature of the relationship (e.g., one-to-one, one-to-
many, or many-to-many). Other components include
primary keys, uniquely identifying each entity instance, and foreign keys, which establish connections
between different entities.
The ER model uses a set of standard symbols to represent various components and their interactions.
Rectangles usually symbolise entities, while attributes are shown as ovals attached to their respective
entities.
Diamonds represent relationships between entities, with lines connecting them to those involved.
Crow’s foot notation is often used to denote cardinality constraints, where a single line represents a
one-to-one relationship, a line with a crow’s foot denotes a one-to-many relationship, and a crow’s
foot at both ends indicates a many-to-many relationship. Double rectangles and double ovals
represent weak entities and multivalued attributes, respectively.
Once an ER diagram is completed, the next step in database design is converting the ER model into a
relational schema, where entities, attributes, and relationships are transformed into tables, columns,
and foreign keys. Entities in the ER diagram typically become tables, with their attributes as columns.
Unit: 5 – ER Diagrams 26
DCA1208: Database Management System
Relationships are converted into foreign keys, linking related tables or, in some cases, into separate
tables to handle many-to-many relationships. This process involves ensuring that all entities and
relationships are appropriately normalised to avoid redundancy and ensure data integrity.
Converting ER diagrams into tables is crucial in transitioning from conceptual design to actual
database implementation.
Unit: 5 – ER Diagrams 27
DCA1208: Database Management System
5. GLOSSARY
Constraints - operations allowed on the data, such as primary key constraints ensuring
unique identification or foreign key constraints maintaining referential
integrity.
Multivalued multiple values for a single entity instance. For example, a Person entity
- might have a multivalued attribute like Phone Numbers, where one
Attributes
person can have several phone numbers.
Regular Entity independently of other entities in a database. They have a primary key to
- identify each instance uniquely and are typically represented by a single
Types
rectangle in ER diagrams.
Unit: 5 – ER Diagrams 28
DCA1208: Database Management System
Primary Key - ensuring that no two records have the same key value. It is essential for
establishing relationships between tables and enforcing entity integrity.
A foreign key is an attribute in one table that links to the primary key of
another table, creating a relationship between the two tables. It ensures
Foreign Key - referential integrity by ensuring that the value in the foreign key column
Unit: 5 – ER Diagrams 29
DCA1208: Database Management System
6. SELF-ASSESSMENT QUESTIONS
SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 What is the primary function of an Entity-Relationship (ER) model in database design?
A) To define the physical storage structure of the database
B) To analyse an application's semantic features independently of events
C) To optimise query performance
D) To manage user access control
2 In an ER diagram, what symbol represents relationships between entities?
A) Rectangle
B) Oval
C) Diamond
D) Circle
3 Which of the following best describes a composite attribute?
A) An attribute that has multiple values for a single entity instance
B) An attribute that cannot be further subdivided
C) An attribute that is atomic and has no internal structure
D) An attribute that can be subdivided into smaller parts, such as an address
4 Which of the following is true about a primary key in an ER model?
A) It can have duplicate values within the same domain
B) It must always have a value and be unique within its domain
C) It can be composed of multiple values from different entities
D) It is only used to link to other tables in the database
5 What is a "borrowed key" attribute in the context of is-a hierarchies?
A) A key borrowed from another table to create a foreign key relationship
B) A key used to reference external databases
C) A key that is inherited from a more general entity in an is-a relationship
D) A key that is used temporarily during database transactions
6 What does mapping cardinality in an ER model indicate?
A) The types of attributes in an entity
Unit: 5 – ER Diagrams 30
DCA1208: Database Management System
B) The number of instances in one entity associated with instances in another entity
C) The total number of entities in the database
D) The hierarchical relationship between entities
7 Which of the following best describes a one-to-one relationship?
A) One entity instance in E1 is associated with many instances in E2
B) Many instances in E1 are associated with one instance in E2
C) Each instance in one entity is associated with at most one instance in another entity
D) Many instances in E1 are associated with many instances in E2
8 How is a weak entity represented in an ER diagram?
A) By a single rectangle
B) By a double rectangle
C) By a diamond
D) By a double diamond
9 What is the purpose of identifying relationships in an ER model?
A) To indicate a derived attribute
B) To connect multivalued attributes to an entity
C) To link weak entities to their corresponding strong entities
D) To establish a one-to-many relationship between two entities
10 Which symbol in an ER diagram represents a multivalued attribute?
A) Oval
B) Double oval
C) Rectangle
D) Dashed oval
11 What is the first step in converting an ER diagram into a relational database schema?
A) Mapping of Binary 1:1 Relationship Types
B) Mapping of Weak Entity Types
C) Mapping of Regular Entity Types
D) Mapping of Multivalued Attributes
12 How are weak entities typically mapped in a relational database?
A) By creating a table with its own primary key
B) By combining it with a strong entity into a single table
C) By creating a table without any foreign key references
Unit: 5 – ER Diagrams 31
DCA1208: Database Management System
D) By creating a table that includes the primary key of the related strong entity as part of its
primary key
13 What approach maps a binary many-to-many (M) relationship in a relational database?
A) Creating two tables, one for each entity involved in the relationship
B) Adding a foreign key in one of the existing tables
C) Creating a new table that includes the primary keys of the related entities as foreign keys
D) Using a single table with composite primary keys
14 When mapping an ER diagram to a relational schema, which step involves creating a new table
for multivalued attributes?
A) Step 3: Mapping of Binary 1:1 Relationship Types
B) Step 5: Mapping of Binary M Relationship Types
C) Step 6: Mapping of Multivalued Attributes
D) Step 7: Mapping of N-ary Relationship Types
15 When mapping a binary one-to-many (1) relationship, where is the primary key of the "one"
side entity added?
A) In the table corresponding to the "many" sides entity as a foreign key
B) In the table corresponding to the "one" side entity as a foreign key
C) In a separate junction table
D) As a composite key in both related tables
Unit: 5 – ER Diagrams 32
DCA1208: Database Management System
7. TERMINAL QUESTIONS
Unit: 5 – ER Diagrams 33
DCA1208: Database Management System
8. ANSWERS
Unit: 5 – ER Diagrams 34
DCA1208: Database Management System
Answer 3: Simple attributes are indivisible, atomic values like a person's first name, while composite
attributes can be broken down into smaller components, such as an address, subdivided into street,
city, and zip code. Refer to section 2.1 for more details.
Answer 4: Is-a hierarchies represent inheritance relationships in databases, where a subtype entity
inherits attributes and relationships from a supertype entity. This hierarchy allows for shared
attributes among related entities, modelling a generalised-to-specific relationship. Refer to section
2.1 for more details.
Answer 5: Primary keys are unique identifiers for each record in a database table. They ensure that
each record can be uniquely distinguished from all others, are mandatory, and should never change
over time. Refer to section 2.1 for more details.
Answer 6: Borrowed key attributes refer to primary keys from a strong entity inherited by a weak
entity in an "is-a" relationship. These borrowed keys help uniquely identify instances of the weak
entity by combining them with its own attributes. Refer to section 2.1 for more details.
Answer 7: A candidate key is any attribute or combination of attributes that can uniquely identify a
record in a table, while a concatenated key specifically refers to a key formed by combining multiple
attributes to create a unique identifier. Refer to section 2.1 for more details.
Answer 8: Key symbols in an ER model include rectangles for entities, ovals for attributes, diamonds
for relationships, double rectangles for weak entities, double ovals for multivalued attributes, and
underlined ovals for key attributes. Refer to section 2.2 for more details.
Answer 9: Steps include mapping regular entities to tables, mapping weak entities by including
foreign keys from strong entities, mapping binary 1:1 and 1 relationships by adding foreign keys,
mapping M relationships using junction tables, and handling multivalued and N-ary relationships
with separate tables. Refer to section 3 for more details.
Answer 10: To convert an ER diagram for the COMPANY database into tables, map each entity to a
table with attributes as columns. Include primary keys, map weak entities by adding foreign keys,
handle 1:1 and 1 relationships by adding foreign keys, create junction tables for M relationships, and
map multivalued attributes into separate tables. Refer to section 3 for more details.
Unit: 5 – ER Diagrams 35
DCA1208: Database Management System
9. REFERENCES
1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
2. Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,
McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631
3. Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,
McGraw-Hill Education, 2019, ISBN-10: 0073523321
Unit: 5 – ER Diagrams 36
BACHELOR OF COMPUTER APPLICATION
DCA1208: Database Management System
SEMESTER 2
DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 6 – Introduction to the Relational Model 1
DCA1208: Database Management System
Unit - 6
Introduction to the Relational Model
DCA324
KNOWLEDGE MANAGEMENT
Unit: 6 – Introduction to the Relational Model 2
DCA1208: Database Management System
TABLE OF CONTENTS
1 Introduction - -
4-5
1.1 Objectives - -
3 Terminologies - - 16
6 Summary - - 25
7 Glossary - - 26
9 Self-Assessment Questions - 1 30
10 Answers - -
10.1 Self-Assessment Questions - - 31-33
10.2 Terminal Questions - -
11 References - - 34
1. INTRODUCTION
In the previous unit, we explored the Entity-Relationship (ER) Model, a conceptual framework used
for database design to represent data and its relationships. We studied the key components of ER
diagrams, including entities (objects), attributes (properties of entities), and relationships
(associations between entities). The symbols used in the ER model were discussed, such as
rectangles for entities, diamonds for relationships, ovals for attributes, and lines to connect them.
Practical examples of ER diagrams demonstrated their application in real-world scenarios, like
designing a student database or inventory system. Finally, we learned how to convert ER diagrams
into relational tables, a critical step for implementing a database by mapping entities to tables,
attributes to columns, and relationships to foreign keys, ensuring the logical structure aligns with the
relational model. This unit provides a foundation for effective database design and implementation.
In the current unit, we see the relational model concepts foundational to database management
systems. We begin by exploring the components of a relation, including attributes, tuples, domains,
and their roles in structuring data in a table format. Using examples, we illustrate how relations
represent real-world entities and examine a relation's mathematical and logical representation. Key
properties of a relation, such as uniqueness, atomicity, and order irrelevance, are discussed to
understand their role in maintaining data consistency. Additionally, the unit covers schema and
instance, emphasising the distinction between the structural definition of a relation and its dynamic
content. A detailed exploration of integrity constraints, including entity and referential integrity,
highlights how databases enforce data accuracy. Finally, the advantages of the relational model, such
as simplicity, data independence, and scalability, are outlined to emphasise its effectiveness in
modern database systems.
We also study essential terminologies like attributes, cardinality, and degree, which are critical for
understanding relational databases. The unit introduces Codd's 12 Rules, a set of guidelines for
evaluating relational database systems, focusing on data accessibility, integrity, and independence.
The discussion on keys in the relational model highlights their importance in uniquely identifying
data and maintaining relationships between tables. Types of keys, including a primary, foreign,
candidate, and composite keys, are explained with examples to show their role in ensuring data
integrity. Through these topics, the unit provides a comprehensive understanding of the relational
model's structure, rules, and benefits, forming the basis for effectively managing data in a database.
1.1. Objectives:
• Discuss the Concepts of a relational model
• List the advantages of a relational model
• Explain the Codd's Rules
• Classify the different keys in a relational model
A relation is a table-like structure that organises data into rows and columns:
1. Rows (Tuples): Represent individual records or instances of data.
2. Columns (Attributes): Represent the data's properties, features, or characteristics.
3. Domain: Specifies the permissible set of values for each attribute in the table.
o Example:
▪ In the table Student, the attributes are StudentID, Name, Age, and Major.
▪ These represent the properties of the student entity.
• Tuples (Rows): Each row in a table is called a tuple. Tuples represent individual data entries or
instances.
o Example:
▪ In the Student table, the first tuple represents one student's record (101, Alice, 20,
Computer Sci).
• Domain: The domain of an attribute specifies the range of permissible values for that attribute.
o Example:
▪ StudentID: Must be an integer.
▪ Name: A string with a maximum of 50 characters.
▪ Age: An integer between 18 and 25.
1. Attributes:
o The table contains four attributes:
▪ StudentID: A unique identifier for each student.
▪ Name: The student's name.
▪ Age: The student's age.
▪ Major: The student's area of study.
2. Tuples:
o Each row represents a tuple:
▪ (101, Alice, 20, Computer Sci)
▪ (102, Bob, 21, Mathematics)
▪ (103, Carol, 22, Physics)
3. Domain:
o StudentID: Integer values.
o Name: Strings of up to 50 characters.
o Age: Integers in the range of 18 to 25.
o Major: Strings from a predefined set of values, e.g., {"Computer Sci", "Mathematics",
"Physics"}.
The key properties include Uniqueness, Atomicity, and Order Irrelevance, explained as follows:
Properties of a
Relation
1. Uniqueness: This property ensures that no two tuples (rows) in a relation can have identical
values across all attributes. Each tuple must be distinguishable from others, typically enforced
through a primary key or unique constraints.
o A primary key uniquely identifies each tuple in a relation.
o Duplicate rows in a table violate the relational model's rules, leading to redundancy and
inconsistency.
This example repeats the tuple (101, Alice, 20, Computer Science), which violates the uniqueness
property because StudentID, the primary key, must uniquely identify each student. To resolve this,
duplicate rows must be removed, or the data must be corrected.
2. Atomicity: The atomicity property dictates that each relationship value must be indivisible. In
other words, each attribute should hold a single atomic value rather than a composite or nested
set of values. This ensures data consistency and simplifies query processing.
o Atomic values mean data cannot be broken down further for meaningful interpretation.
o The use of non-atomic values (e.g., storing multiple names in a single cell) violates this
property and complicates database operations like searching and filtering.
In this case, the second tuple violates the atomicity property because the Name attribute contains a
composite value (Bob and Carol). To adhere to atomicity, the data should be split into separate rows
or managed using relationships across multiple tables:
Correct Representation:
3. Order Irrelevance: The relational model ignores the order of tuples (rows) and attributes
(columns) in a relation. The data's meaning remains unchanged regardless of how rows or
columns are arranged.
o Tuples in a relation represent unordered sets of data.
o Columns (attributes) are logically defined, and their order in the table does not affect the
relation's semantics.
o This property aligns with the relational model's mathematical foundation, where sets are
inherently unordered.
Although the order of rows and columns differs between Table A and Table B, the relations are
equivalent. Query results and the underlying data remain unchanged.
A schema is fixed for a database unless explicitly modified. It provides the metadata describing the
table's format and constraints, serving as a guide for storing and managing data.
o Basis for Operations: Queries and transactions rely on the schema to understand the data
structure.
In this instance, StudentID values are unique (per the primary key constraint). Age values are valid
integers within the specified range. Major values are strings chosen from the predefined domain.
Example:
The two most fundamental types of integrity constraints are Entity Integrity and Referential
Integrity.
Types of integrity
constraints
Referential
Entity Integrity
Integrity
1. Entity Integrity: The Entity Integrity Constraint ensures that every table (relation) in a
database has a primary key, and the values of this primary key must be:
1. Unique: No two rows can have the same value for the primary key.
2. Not NULL: Every row must have a value for the primary key.
This constraint guarantees that each row is uniquely identifiable, ensuring the uniqueness of records
in the table.
Another invalid row would be (101, Eve, 24, Chemistry) because the primary key value 101 already
exists in the table, violating the Unique rule.
2. Referential Integrity: The Referential Integrity Constraint ensures that a foreign key in one
table corresponds to a primary key in another table, thereby enforcing the validity of
relationships between them.
An Example:
• Student Table:
• Enrollment Table:
StudentID in the Enrollment table is the foreign key referencing the primary key in the
Student table.
1. Every StudentID in the Enrollment table must exist in the Student table.
o For example:
▪ StudentID = 101 in the Enrollment table exists in the Student table, so the relationship
is valid.
2. If an invalid StudentID (e.g., 104) is inserted into the Enrollment table, it would violate the
referential integrity constraint because StudentID = 104 does not exist in the Student table.
3. TERMINOLOGIES
The relational model uses precise terminologies to describe its structure and components.
Understanding these terms is essential for effectively designing, managing, and querying relational
databases. This section expands on basic and advanced terms with examples and explanations.
• Attribute: A property or characteristic of an entity represented as a column in a table. Each
attribute describes specific data for the entity.
• Domain: The set of valid values that an attribute can hold. For example, the domain of a Gender
attribute could be {Male, Female, Other}.
• Cardinality: A table's total number of rows (tuples). It represents the count of data records in a
relation.
• Degree: A table's total number of columns (attributes). It indicates the structure or schema of
the relation.
• NULL Value: A placeholder indicating missing, unknown, or inapplicable data in a table. It is not
equivalent to zero or an empty string.
4. CODD'S RULES
Edgar F. Codd, the founder of the relational database model, proposed 12 rules in 1985 to evaluate a
Database Management System's (DBMS) adherence to the principles of the relational model. These
rules serve as benchmarks to assess whether a DBMS can be considered "relational" and provide a
framework for maintaining data consistency, integrity, and usability.
High-Level
View Updating
Insert, Update,
Comprehensive Rule
and Delete
Data Physical Data
Sublanguage Independence
Rule
Systematic
Integrity
Treatment of
Independence
NULL Values
Guaranteed Distribution
Access Rule Independence
1. Information Rule: All data in a database must be stored in tables (relations), with rows
representing tuples and columns representing attributes. This ensures a clear and consistent
structure for data representation.
An Example: A student table with rows as student records and columns such as StudentID, Name,
Age, and Major. Non-tabular data like XML or JSON would violate this rule in its raw form.
2. Guaranteed Access Rule: All data in a relational database must be accessible using a table name,
primary key, and attribute name. There should be no ambiguous or "hidden" data.
An Example: To retrieve the student's name with StudentID = 101, use a query like.
3. Systematic Treatment of NULL Values: The relational model must support NULL values
representing missing or unknown information and treat them consistently. NULL values must
not affect data integrity or produce ambiguities during operations.
An Example: A NULL in the Phone column of a Contact table indicates the phone number is
unavailable or unknown.
4. Dynamic Online Catalog: Metadata (data about the database structure) must be stored in
tables, allowing it to be queried and manipulated using SQL.
5. Comprehensive Data Sublanguage Rule: The DBMS must support a comprehensive and
universal language for data operations, including data definition (DDL), manipulation (DML),
and control (DCL). SQL is the most widely used sublanguage.
7. High-Level Insert, Update, and Delete: The DBMS must support high-level, set-based
operations for inserting, updating, and deleting data rather than operating on individual records.
8. Physical Data Independence: Changes to the physical storage structure or mechanisms (e.g.,
moving data files) must not affect the logical schema or applications accessing the database.
For example, moving data files from one storage device to another should not impact queries or
schema definitions.
9. Logical Data Independence: Changes to the logical schema (e.g., adding new columns) must
not disrupt existing applications or queries.
For example, Adding a GPA column to the Student table should not affect queries retrieving
StudentID or Name.
10. Integrity Independence: The DBMS must define and enforce all integrity constraints, such as
primary key, foreign key, and domain constraints, not at the application level.
11. Distribution Independence: A DBMS must provide a uniform interface for accessing data,
regardless of whether the data is stored locally or distributed across multiple locations.
It should return consistent results whether the data is stored on a single server or distributed across
multiple nodes.
12. Non-Subversion Rule: There must be no way to bypass the relational integrity constraints
defined within the DBMS, even for low-level operations.
For example, Even administrators should be unable to insert duplicate primary key values using
backend tools or system commands.
Types of Keys
Primary Key Candidate Key Super Key Foreign Key Composite Key
1. Primary Key: A Primary Key is a unique, non-null attribute (or set of attributes) uniquely
identifying each tuple in a relation. Every table must have exactly one primary key.
Rules to be followed:
o Must be unique for all tuples in the table.
o Cannot contain null values.
o There can be only one primary key per table.
Syntax:
EmployeeID is the primary key, ensuring each employee has a unique identifier.
2. Candidate Key: A Candidate Key is an attribute or a set of attributes that can serve as a primary
key. It satisfies the uniqueness and non-null constraints. A table can have multiple candidate
keys, but only one is chosen as the primary key.
Any of these attributes can uniquely identify a student, but only one is selected as the primary key.
3. Super Key: A Super Key is any attribute that uniquely identifies a tuple. It may include
additional attributes beyond what is necessary.
All candidate keys and the primary key are subsets of the super key.
EmployeeID, {EmployeeID, Name}, and {EmployeeID, DepartmentID} are all super keys, as they
uniquely identify tuples. However, only EmployeeID is minimal and qualifies as a candidate key.
4. Foreign Key: A Foreign Key is an attribute in one table that refers to the primary key in another
table, establishing a relationship between the two. It enforces referential integrity, ensuring the
foreign key value exists in the parent table.
Syntax:
DepartmentID in the Employee table is a foreign key referencing the DepartmentID primary key in
the Department table.
5. Composite Key: A Composite Key is a primary key of two or more attributes uniquely
identifying a tuple. It is used when a single attribute is insufficient to ensure uniqueness.
Syntax:
An Example:
The combination of StudentID and CourseID uniquely identifies each enrollment record.
6. SUMMARY
The Relational Model is a cornerstone of modern database systems, organising data into structured
relations (tables) with rows and columns. The components of a relation include attributes
(columns), tuples (rows), and domains (permissible values for attributes), which form the building
blocks of a relation. Using a practical example of a relation, we examined how entities like students
or employees can be represented in tables, while the representation of a relation involves its
mathematical notation, defining its structure and content. We also discussed the properties of a
relation, such as uniqueness, atomicity, and order irrelevance, which ensure consistent and logical
data representation. Lastly, the distinction between schema (the blueprint of a table) and instance
(the data at a specific point) was introduced to clarify relational databases' dynamic and static aspects.
Integrity constraints play a vital role in maintaining data consistency and correctness. These include
entity integrity, ensuring primary key uniqueness and non-null values, referential integrity, and
enforcing valid relationships between tables via foreign keys. The advantages of the relational
model were highlighted, such as its simplicity, scalability, data independence, and support for
powerful query languages like SQL. These features collectively make the relational model a widely
adopted standard for designing, implementing, and managing databases.
Critical terminologies in the relational model, such as attributes, tuples, domains, cardinality, and
degree, were explained, along with advanced concepts like primary, candidate, composite, and foreign
keys. These terminologies provide a foundation for understanding how relations are structured and
how relationships between tables are established. NULL values were also discussed to handle
missing or unknown information in a database.
Finally, Codd's 12 rules were reviewed as benchmarks for evaluating relational database
management systems, emphasising data independence, integrity enforcement, and comprehensive
support for SQL operations. Keys in the relational model, including primary, candidate, super,
composite, and foreign keys, were detailed to explain how they uniquely identify tuples and establish
inter-table relationships. These topics provide a comprehensive understanding of the relational
model, its structure, advantages, and principles, forming the foundation for effective database design
and management.
7. GLOSSARY
Relation - (attributes), where each row represents a record, and each column
represents a property.
Relational Model - predefined schemas, supporting operations using relational algebra and
ensuring consistency.
Integrity Rules that maintain the accuracy and consistency of data, such as primary
- keys, foreign keys, and unique or null constraints.
Constraint
8. SELF-ASSESSMENT QUESTIONS
SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 Which of the following is NOT a component of a relation?
a) Attribute
b) Tuple
c) Domain
d) File
2 What does a tuple in a relation represent?
a) A column in the table
b) A property of the entity
c) A single row or record in the table
d) A set of permissible values for an attribute
3 Which of the following best defines a domain in a relational model?
a) The number of rows in a table
b) The number of attributes in a table
c) The set of permissible values for an attribute
d) The primary key of the table
4 What is the primary difference between schema and instance in a relational model?
a) Schema refers to the structure, and instance refers to the actual data.
b) Schema changes frequently, and instance is fixed.
c) Schema refers to rows, and instance refers to columns.
d) Schema is dynamic, and instance is static.
5 Which property of a relation ensures that all attribute values are indivisible?
a) Uniqueness
b) Cardinality
c) Atomicity
d) Integrity
6 What is the purpose of a primary key in a relation?
a) Integer
b) String
c) Float
d) Date
14 Which of the following is one of Codd's 12 Rules for a relational database?
a) Data must be stored in a hierarchical format.
b) A foreign key must always reference a composite key.
c) Data must be stored in tables, and metadata should be accessible as tables.
d) Each table must contain at least one NULL value.
15 Which of the following keys uniquely identifies each tuple in a relation?
a) Foreign Key
b) Primary Key
c) Composite Key
d) Candidate Key
9. TERMINAL QUESTIONS
1. Define a relation
2. List the key components of a relation in the relational model.
3. How is a relation mathematically represented?
4. Outline the properties of a relation.
5. Differentiate between schema and instance in the relational model.
6. Define integrity constraints in the relational model.
7. Identify the advantages of the relational model.
8. Define an attribute in the relational model.
9. Discuss a domain in the relational model.
10. List the Codd’s 12 rules with examples
11. Elaborate on the keys used in a relational model
12. Design a table for an Employee database with attributes and specify the primary key.
13. Using SQL, create a table with a composite key.
14. Write a query to find all students studying 'Mathematics'.
15. Write an SQL query to retrieve all tuples from a table where the Age attribute is greater than 20.
10. ANSWERS
Answer 3: A relation R is represented as R (A1, A2,..., An), where A1, A2,..., An are the attributes. The
schema defines the structure, while the instance is the set of tuples representing the data. For example,
Student(StudentID, Name, Age). Refer to section 2.2 for more details.
Answer 5: The schema defines a table's structure, including its attributes and data types. The
instance is the actual data stored in the table at a specific moment. While the schema is static, the
instance is dynamic and changes with data operations. Refer to section 2.5 for more details.
Answer 6: Integrity constraints are rules that ensure the accuracy and consistency of data in a
database. Examples include:
• Entity Integrity: Primary keys must be unique and not null.
• Referential Integrity: Foreign keys must match primary keys in related tables. Refer to
section 2.6 for more details.
Answer 8: An attribute is a column in a table representing a data property. For example, student ID,
Name, and Age are attributes in a Student table. Each attribute has a specific domain defining
permissible values. Refer to section 3 for more details.
Answer 9: A domain is a set of valid values an attribute can hold. For example, the domain of an Age
attribute could be integers between 18 and 25. It ensures data consistency and validity within the
relation. Refer to section 3 for more details.
Answer 10: Information Rule: Data must be stored in tables. An example is a Student table, which
organises data. Guaranteed Access: Access data using the table name and primary key. Systematic
Null Handling: NULL values must be handled consistently. Refer to section 4 for more details.
Answer 11: Primary Key: Uniquely identifies each tuple. Example: StudentID. Foreign Key: Links
two tables. Example: DepartmentID in Employee table. Composite Key: Combines multiple
attributes for unique identification. Example: {StudentID, CourseID} in Enrollment. Refer to section
5 for more details.
Answer 12:
Answer 13:
Answer 14:
Answer 15:
11. REFERENCES
1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
2. Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,
McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631
3. Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,
McGraw-Hill Education, 2019, ISBN-10: 0073523321.
SEMESTER 2
DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 7 – Relational Algebra 1
DCA1208: Database Management System
Unit - 7
Relational Algebra
DCA324
KNOWLEDGE MANAGEMENT
Unit: 7 – Relational Algebra 2
DCA1208: Database Management System
TABLE OF CONTENTS
1 Introduction - -
4-5
1.1 Objectives - -
2 Overview - - 6
Basic Operations of Relational Algebra 1, 2, 3, 4, 5, 6,
3 -
7, 8
3.1 Select - -
3.2 Project - -
7 – 20
3.3 Union - -
3.4 Set Difference - -
3.5 Cartesian Product - -
5 Summary - - 30
6 Glossary - - 31
7 Self-Assessment Questions - 1 32
8 Terminal Questions - - 33 - 34
9 Answers - -
9.1 Self-Assessment Questions - - 35 – 38
9.2 Terminal Questions - -
10 References - - 39
1. INTRODUCTION
In the previous unit, we saw the foundational concepts of the relational model, a framework for
managing and organising data using relations, commonly known as tables. The relational model was
first introduced by Edgar F. Codd in 1970 and is based on the mathematical concept of a relation,
where data is represented in tuples (rows) within a schema (table structure). Each table consists of
columns that define the attributes of the relation, and each tuple within a table represents a unique
record. The simplicity and mathematical rigour of the relational model make it a powerful tool for
database management, ensuring data integrity and reducing redundancy.
We also explored the terminologies and key concepts integral to understanding the relational
model, including Codd's 12 rules defining a relational database. These rules emphasise the
importance of data independence, integrity, and a systematic approach to database management. The
concept of keys in the relational model was also discussed, including primary keys, which uniquely
identify tuples in a table, and foreign keys, which establish relationships between tables. Keys play a
crucial role in maintaining the relational integrity of the database by ensuring that each record can
be accurately retrieved and associated with other relevant data.
In this unit, we will learn the foundational operations of relational algebra, a critical component of
relational database management systems (RDBMS). The basic operations we'll cover include the
Select operation, which filters rows from a relation based on a specified predicate, and the Project
operation, which creates a new relation by selecting certain columns from an existing relation. We
will also explore the Union operation, which combines the results of two relations into a single
relation, and the Set Difference operation, which yields tuples present in one relation but not in
another. Finally, the Cartesian Product will be examined, where every tuple in one relation is paired
with every tuple in another, forming a new relation.
Moving beyond the basic operations, we will study the derived operations of relational algebra, built
upon the foundational concepts. These include the division operation, which is helpful for queries
that involve "all" conditions, and the intersection operation, which returns the common tuples
between two relations. Also, we will cover the Join operation, one of the most powerful and frequently
used operations in relational databases, which combines tuples from two relations based on a related
attribute. Together, these derived operations expand the expressive power of relational algebra,
allowing for more complex and nuanced queries to be formulated and executed within an RDBMS
1.1. Objectives
By the end of this Unit, the learners should be able to
understand:
• Understand Relational algebra and its history
• Discuss the basic operations of relational algebra
• Classify the derived Operations in relational
algebra
2. OVERVIEW
Relational algebra received little attention until the publication of E.F. Codd's relational model of data
in 1970. Codd proposed such an algebra as a basis for database query languages. The first query
language to be basedon Codd's algebra was ISBL, and this pioneering work has been acclaimed by
many authorities as having shown how to make Codd's idea into a functional language. Even the query
language of SQL is loosely based on relational algebra. However, the operands in SQL (tables) are not
exactly relations, and several useful theorems about relational algebra do not hold in the SQL
counterpart (arguably to the detriment of optimisers and/or users).
3. BASIC OPERATIONS
Relational algebra is a formal language used to query and manipulate the data stored in relational
databases. It provides a set of operations that take one or more relations (tables) as input and produce
a new relation as output. These operations are fundamental to database management and form the
theoretical foundation for most database query languages, including SQL.
Basic Operations
CARTESIAN
SELECT PROJECT UNION SET DIFFERENCE
PRODUCT
SELECT, PROJECT, UNION, SET DIFFERENCE, and CARTESIAN PRODUCT serve a distinct purpose in
data retrieval and manipulation.
Basic operations are the traditional set operations: union, difference, intersection, and Cartesian
product. Three basic operations- union, intersection, and difference – require that operand relations
be union-compatible. Two relations are union-compatible if they have the same parity and one-to-
one correspondence of the attributes with the corresponding attributes defined over the same
domain. The Cartesian product can be defined on any two relations. P and Q are union compatible if
both P and Q are of the same degree n and the domain of the corresponding n attributes are identical,
i.e., if P= { P1,..,Pn } and Q={ Q1,...Qn } then Dom(Pi) = Dom(Qi) for i = { 1,2...n } where Dom(Pi)
Id Name represents the domain of the attribute Pi.
101 Jones
Example 1
103 Smith
In the following examples, we utilise two relations, P and
104 Lalonde
Q, given in Figure 2. R is a computed result relation. We
107 Evan
assume that the relations P and Q in Figure 2 represent
110 Drew
112 Smith
employees working on developing software application packages J1 (say) and J2 (say), respectively.
Id Name
103 Smith
104 Lalonde
106 Byron
110 Drew
The SELECT operation can also be visualised as a horizontal partition of the relation into two sets of
tuples—those tuples that satisfy the condition and are selected and those tuples that do not satisfy
the condition and are discarded.
Where the symbol σ (sigma) is used to denote the SELECT operator and the selection condition is a
Boolean expression (condition) specified on the attributes of relation R. Notice that R is generally a
relational algebra expression whose result is a relation—the simplest expression is just the name of a
database relation. The relation resulting from the SELECT operation has the same attributes as R.
The Boolean expression specified in <selection condition> is made up of a number of clauses of the
form:
where <attribute name> is the name of an attribute of R, <comparison op> is normally one of the
operators {=, <, ≤, >, ≥, ≠}, and <constant value> is a constant value from the attribute domain. Clauses
can be connected by the standard Boolean operators and, or, and not to form a general selection
condition.
For example, to select the tuples for all employees who either work in department 4 and make over
$25,000 per year or work in department 5 and make over $30,000, we can specify the following
SELECT operation:
Notice that all the comparison operators in the set {=, <, ≤, >, ≥, ≠} can apply to attributes whose
domains are ordered values, such as numeric or date domains. Domains of strings of characters are
also considered to be ordered based on the collating sequence of the characters. If the domain of an
attribute is a set of unordered values, then only the comparison operators in the set {=, ≠} can be used.
The SELECT operator is unary; it is applied to a single relation. Moreover, the selection operation is
applied to each tuple individually; hence, selection conditions cannot involve more than one tuple. The
degree of the relation resulting from a SELECT operation—its number of attributes—is the same as
the degree of R. The number of tuples in the resulting relation is always less than or equal to the
number of tuples in R. That is, |σc (R)| ≤ |R| for any condition C. The fraction of tuples selected by a
selection condition is referred to as the selectivity of the condition.
Hence, a sequence of SELECTs can be applied in any order. In addition, we can always combine a
cascade (or sequence) of SELECT operations into a single SELECT operation with a conjunctive (AND)
condition; that is,
In SQL, the SELECT condition is typically specified in the WHERE clause of a query.
SELECT *
FROM EMPLOYEE
WHERE Dno=4 AND
Salary>25000
Suppose we want to find those employees in the relation PERSONNEL of Figure 3 with an ID less than
105. This operation selects only some of the tuples in the relation. Such an operation is known asa
selection operation. The projection operation yields a vertical subset of a relation. The action is
defined over a subset of the attribute names but the tuples in the relation overall. The action is defined
over a subset of the attribute names but the tuples in the relation overall.
The selection operation, however, yields a horizontal subset of a given relation, i.e., the action is
defined over a complete set of attribute names. Still, only a subset of the tuples are included in the
result. To have a tuple included in the result relation, it must satisfy the specified selection conditions
or predicates. The selection operation is sometimes known as the restriction operation.
Results of Selection
PERSONNEL
The selection operation may specify any finite number of predicates connected by Boolean operators.
The predicates may define acomparison between two domain-compatible attributes or between an
attribute and a constant value; if the comparison is between attribute A1 andconstant c1, then c1
belongs to Dom (A1).
Given a relation P and a predicate expression B, the selections of those tuples of relation P that satisfy
the predicate B is a relation R written as:
R= B (P)
The above expression could be read as "select those tuples t from P in which the predicate B(t) is true".
The set of tuples with R is, in this case, defined as follows:
R= { t | t to P B (t) }
Name, Department, and Salary, applying PROJECT to select Name and Department will yield
a new relation with just these two columns.
3. Removes Duplicates: PROJECT automatically eliminates duplicate tuples in the resulting
relation. This is because, in relational algebra, the result of any operation is a set, and sets do not
contain duplicate elements.
4. Changes Relation Structure: Depending on the projected number of columns, the resulting
relation may have fewer attributes than the original. However, the number of rows may remain
the same or decrease if duplicates are removed.
5. Attribute Renaming: In some cases, the PROJECT operation can be combined with RENAME to
give new names to the attributes in the resulting relation, improving clarity or avoiding name
conflicts in complex queries.
where π (pi) is the symbol used to represent the PROJECT operation, and <attribute list> is the desired
sublist of attributes from the attributes of relation R. Again, notice that R is, in general, a relational
algebra expression whose result is a relation, which in the simplest case is just the name of a database
relation. The result of the PROJECT operation has only the attributes specified in <attribute list> in
the same order as they appear in the list. Hence, its degree equals the number of attributes in
<attribute list>.
The projection of a relation is defined as a projection of all its tuples over some set of attributes, i.e.,
it yields a vertical subset of the relation. The projection operation is used to either reduce the
number of attributes in the resultant relation or to reorder attributes. The relation's arity (or
degree) is reduced in the first case. The projection operation is shown graphically in Figure 4. The
figure below shows the projection of the relation PERSONNEL to the attribute Name. The cardinality
of the result relation is also reduced due to the deletion of duplicate tuples.
Id Name Name
101 Jones Jones
103 Smith Smith
104 Lalonde Lalonde
106 Byron Byron
107 Evan Evan
110 Drew Drew
112 Smith Smith
Fig 4: Projection of relation PERSONNEL over attribute Name
We defined the projection of a tuple ti over the attribute A, denoted ti[A] or A(ti), as (a), where a
is the value of tuple ti over the attribute A. Similarly, we define the projection of a relation T, denoted
by T[A] or A(ti), on the attribute A. This is defined in terms of the projection for each tuple in ti
belonging to T on the attribute A as:
T[A] = { ai | ti [A] = ai ti T}
where T[A] is a single attribute relation and | T[A] | £ T. The cardinality T[A] may be less than the
cardinality |T| because of the deletion of any duplicates in the result. A case in point is illustrated in
Figure 4.
Similarly, we can define the projection of a relation on a set of attribute names, X, as a concatenation
of the projections for each attribute A in X for every tuple in the relation.
T[X] = { ti[A] | ti T }
Here, A belongs to X where ti[A] represents the concatenation of all ti[A] for all A X (A belongs to X).
Simply stated, the projection of a relation P on the set of attribute names Y belongs to P is the
projection of each tuple of the relation P on the set of attribute names Y.
Note that the projection operation reduces the arity if the number of attributes in X is less than the
arity of the relation. The projection operation may also reduce the cardinality of the result relation
since duplicate tuples are removed. (Note that the projection operation produces a relation as the
result. By definition, a relation cannot have duplicate tuples. In most commercial implementations of
the relational model, however, the duplicates would still be present in the result).
Example:
Imagine a relation called Employees that has the following
attributes:
Suppose you are only interested in the Name and Department of each employee. You would use the
Project operation to select these columns:
Projection Query:
Result:
Here, the Project operation π reduces the original table by selecting only the Name and Department
columns. Note that even though Alice and Dana both work in the IT department, their entries remain
distinct in this projected view since other attributes (like EmployeeID and Salary) that might
differentiate them are not considered in this operation. If you needed a unique set of names and
departments, you could incorporate a distinct operation if the relational algebra system supports it.
3.3. Union
The UNION operation in relational algebra combines the tuples (rows) of two relations into a single
relation. It is a set-based operation that merges two sets of tuples and automatically removes
duplicates, ensuring that the result remains a set of unique tuples.
If we assume that P and Q are two union-compatible relations, then P and Q are the set-theoretic
union of P and Q shown in Fig 5.
P: Q:
The resultant relation, R = P U Q, has tuples drawn from P and Q such that
R = { t | P v t Q }
The result relation R contains tuples in either P or Q or both. The duplicate tuples are eliminated.
Remember that from our definition of union compatibility, the degree of the relations P and R is the
same. The cardinality of the resultant relation depends on the duplication of tuples in P and Q. From
the above expression, we can see that if all the tuples in Q were contained in P, then |R| = |P| and R=
P, while if the tuples in P and Q were disjoint, then |R| = |P|+|Q|.
Example 2
R, the union of P and Q, is given in the figure below. R represents employees working on the packages
J1 or J2, or both. Since a relation does not have duplicate tuples, an employee working on both J1 and
J2 will appear in the relation R only once.
R:
Id Name
101 Jones
103 Smith
104 Lalonde
106 Byron
107 Evan
110 Drew
112 Smith
Fig 6.: P Q
2. Set-Based: Like other set operations in relational algebra, SET DIFFERENCE operates on sets of
tuples. The result is a set of tuples in the first relation but not the second.
3. Asymmetric: The SET DIFFERENCE operation is not commutative, meaning that the order of
the relations matters.
For example, A − B will not yield the same result as B − A.
4. Finds Unique Tuples: The resulting relation contains only those unique tuples to the first
relation. It effectively filters out any tuples that are also present in the second relation.
5. Compatibility Requirement: Both relations involved in the SET DIFFERENCE operation must
be union-compatible, meaning they must have the same structure (same number of attributes
and corresponding data types).
Example 3
R, the result of P-Q, gives employees working only on package J1. Employees working on both
packages J1 and J2 have been removed.
R:
Id Name
101 Jones
107 Evan
112 Smith
Fig7. P-Q
The difference operation removes common tuples from the first relation.
The extended Cartesian, or simply the Cartesian product of two relations, is the concatenation of
tuples belonging to the two relations. A new resultant relation scheme is created consisting of all
possible combinations of the tuples.
R=PxQ
The degree of the result relation is given by: |R| = |P| + |Q|
Example 5
The Cartesian product of the PERSONNEL relation and SOFTWARE_PACKAGE relations of Figure 8(a)
is shown in Figure 8(b). Note that the relations P and Q from Figure 2 of Example 1 are a subset of the
PERSONNEL relation.
PERSONNEL:
Id Name
101 Jones
103 Smith
104 Lalonde
107 Evan
110 Drew
112 Smith
Fig 8. Software Packages:
S
J1
J2
Fig.8(a)
PERSONNEL:
Id P. Name S
101 Jones J1
101 Jones J2
103 Smith J1
103 Smith J2
104 Lalonde J1
104 Lalonde J2
106 Byron J1
106 Byron J2
107 Evan J1
107 Evan J2
110 Drew J1
110 Drew J2
112 Smith J1
112 Smith J2
Fig.8(b)
Fig 8: (a) PERSONNEL (Emp#, Name) and SOFTWARE_PACKAGE(S) represent employees and
software packages, respectively; (b) the Cartesianproduct of PERSONNEL and
SOFTWARE_PACKAGES.
The union and intersection operations are associative and commutative;therefore, given relations
R, S, T:
Derived operations enhance the expressiveness and efficiency of relational queries by allowing more
intuitive and straightforward formulations of complex data retrieval tasks. They are built on the
fundamental operations but provide additional functionality and flexibility, making it easier to handle
various database operations without repeatedly combining the basic operations.
Derived
Operations
4.1. Division()
The DIVISION operation is one of the derived operations in relational algebra. It retrieves tuples from
one relation that are associated with all tuples in another relation. It is particularly useful when
dealing with queries that require identifying entities related to a comprehensive set of values, such
as finding all students who have taken all the courses offered in a department.
The DIVISION operation, denoted by ÷, is helpful for a special query that sometimes occurs in
database applications.
Simply stated, the cartesian product of Q and R is a subset of P. In Figure 10(b), the result relation R
has four tuples; the cartesian product of R and Q gives a resulting relation, again a subset of P. In
Figure 10(c), since there are no tuples in P with a value b3 for the attribute B (i.e., selecting B=b3(P) =
0), we have an empty relation R, which has a cardinality of zero.
P: Q: R (result): Q: then R is:
Q: then R is:
Fig 10: Examples of the division operation. (a) R = P / Q; (b) R = P / Q(P is same in part i); (c)
R=P/Q (P is same as in part i); (d) R= P/Q (P is same as in part i)
In Figure 10(d), the relation Q is empty. The result relation can be definedas the projection of P on
the attributes in P-Q. However, it is usual to disallow division by an empty relation.
Finally, if relation P is an empty relation, then relation R is also an empty relation. Let us treat the Q
as representing one set of properties (the properties are defined on the Q, each tuple in Q
representing an instance of these properties) and the relation r as representing entities with these
properties (entities are defined on P-Q, and the properties are, as before, defined on Q); note that P
Q must be equal to P. Each tuple in P represents an object with some given property.
The resultant relation R is the set of entities that possesses all the properties specified in Q. The two
entities a1 and a5 possess all the properties, i.e., b1 and b2. The other entities in P, a2, a3, and a4, only
possess one, not both, of the properties. The divisionoperation is useful when a query involves the
phrase "for all objects having all the specified properties." Note that both P-Q and Q generally
representa set of attributes. It should be clear that Q is not a subset of P.
The intersection operation selects the common tuples from the two relations.
R = P Q where R = {t | t P t Q}
Example 4
The resultant relation P Q is the set of all employees working on both packages. (figure 8.2(c) of
example 2).
P Q= P-(P-Q)
It is, however, more convenient to write an expression with a single intersection operation than one
involving a pair of difference operations.
Note that in these examples, the operand and the result relation schemes, including the attribute
names, are identical, i.e. P=Q=R. If the attribute names of compatible relations are not identical, the
naming of the attributes of the result relation will have to be resolved.
As the name suggests, the join operator allows the combining of two relations to form a new one. The
tuples from the operand relations that participate in the operation and contribute to the result are
related. The join operation allows the processing of relationships existing between the operand
relations.
Example 6
In Figure 11, we encounter the following relations:
ASSIGNMENT (Emp#, Prod#, Job#)
EMPLOYEE PRODUCT:
Fig.11a)
JOB-FUNCTION ASSIGNMENT
Fig.11b)
Fig 11: (a) Relation schemes for employee roles in development teams, b) Sample relation
Suppose we want to respond to the query “Get product number of assignments whose development
teams have a chief programmer”. This requires first computing the Cartesian product of the
ASSIGNMENT and JOB_FUNCTION relations. Let us name this product relation TEMP. This is followed
by selecting those tuples of TEMP where the attribute Title has the value chief programmer and the
value of the attribute Job# in ASSIGNMENT and JOB_FUNCTION are the same. The required result,
shown below, is obtained by projecting these tuples to the attribute Prod#. The operations are
specified below.
Notice that in the selection operation that follows the Cartesian product, we take only those tuples
where the value of the attributes is ASSIGNMENT.Job#and JOB_FUNCTION.Job# are the same. These
combined operations of the cartesian product followed by selection are the join operations. Note that
we have qualified the identically named attributes by the name of the corresponding relation to
distinguish them.
If a relation is joined with itself, we would need to rename either the attributes of one of the copies of
the relation or the relation’s name itself. We illustrate this in example 7.
The join condition may have more than one term, necessitating using the subscript in the comparison
operator. Now, we shall define thedifferent types of join operations.
In these discussions, we use P, Q, R, and so on to represent both the relation scheme and the collection
or bag of underlying domains of the attributes. We call it a bag of domains because multiple attributes
may be defined on the same domain.
Typically, P С Q may be null, guaranteeing the uniqueness of attribute names in the result relation.
When the same attribute name occurs in the two schemes, we use qualified names.
Two common and useful variants of the join are the equi-join and the natural join. In the equi-join,
the comparison operator theta (i=1,2,...,n) is always the equality operator (=). Similarly, the
comparison operator is always the equality operator in the natural join. However, only one of the two
sets of domain-compatible attributes involved in the natural join is Ai from P and Bi from Q, for i =
1,...n, the natural join predicate is a conjunction of terms of the following form:
Domain compatibility requires that the domains of Ai and Bi be compatible, and for this reason,
relation schemes P and Q have attributes defined on common domains, i.e., P С Q¹. Therefore, join
attributes have common domains in the relation schemes P and Q. Consequently, only one set of the
join attributes on these domains must be preserved in theresult relation. This is achieved by taking
a projection after the join operation, eliminating duplicate attributes. If the relations P and Q have
attributes with the same domain but different attribute names, then renaming or projection may be
specified.
Example 7
Given the EMPLOYEE and SALARY relations of Figure 12(i), if we must find the salary of employees
by name, we join the tuples in the relation EMPLOYEE with those in SALARY such that the value of
the attribute ID in EMPLOYEE is the same as that in SALARY. The natural join takes the predicate
expression to be EMPLOYEE.Id = SALARY.Id. The result of the natural join is shown in Figure 12(ii).
We do not need to specify this predicate when using the natural join. The expression to specify the
operation of finding employees' salary by name is as follows. Here, we project the result of the natural
join operation on the attributes Name and Salary:
Id Name Salary
101 Jones 67
103 Smith 55
104 Lalonde 75
107 Evan 80
ASSIGNMENT.Emp# COASSIGN.Emp#
107 107
107 101
107 103
101 107
101 101
101 103
110 101
103 107
103 101
103 103
101 110
Fig 12: i) The natural join of EMPLOYEE and SALARY relations; ii) The joint of ASSIGNMENT
with the renamed copy
In relational algebra, joins are crucial operations that combine tuples from two relations based on
a specified condition. Here’s a rundown of the main types of joins:
1. Natural Join (⨝): This join pairs tuples from two relations that have the same value in all
attributes with common names. It automatically eliminates duplicate columns.
2. Theta Join (⨝_θ): This join combines tuples from two relations based on a specific condition
(θ) that involves attributes of the relations. It’s more flexible than a natural join because it
allows for conditions other than equality.
3. Equi Join: A special case of the theta join where the condition is strictly equality between
attributes. It's often used interchangeably with natural join when the context is clear.
4. Semi Join (⋉ or ⋊): This join returns tuples from the first relation for which there is at least
one matching tuple in the second relation, but the result includes only the attributes of the first
relation. It’s useful for queries where you only need to verify the existence of related tuples in
another relation without returning their values.
5. SUMMARY
Relational Algebra is a fundamental concept in database theory, providing the mathematical
framework for manipulating and querying data stored in relational databases. It consists of a set of
basic and derived operations that can be combined to perform complex queries. The basic operations
in Relational Algebra include Select, Project, Union, Set Difference, and Cartesian Product. These
operations are crucial for filtering data, selecting specific columns, combining datasets, and
generating all possible combinations of tuples from two relations.
The Select operation filters rows from a relation based on a given predicate, retrieving only those
tuples that satisfy the condition. This operation is crucial for narrowing down large datasets to more
relevant subsets of data. The Project operation, on the other hand, selects specific columns from a
relation, effectively reducing the number of attributes in the result set. Combining the Select and
Project operations allows users to create powerful queries that retrieve only the necessary data in a
specific format.
The Union operation in Relational Algebra combines the tuples of two relations, eliminating
duplicates and generating a relation that contains all tuples from both input relations. The Set
Difference operation finds tuples present in one relation but not in the other, making it useful for
identifying distinct elements. The Cartesian Product operation generates all possible pairs of tuples
from two relations, often used as an intermediate step in more complex queries, such as Joins.
In addition to the basic operations, Relational Algebra includes several derived operations that
provide more advanced data manipulation capabilities. The division operation finds tuples in one
relation that are associated with all tuples in another. The Intersection operation returns only the
tuples common to both input relations.
Finally, the Join operation is a powerful tool for combining related tuples from different relations
based on a specified condition, enabling complex queries across multiple tables. Together, these
operations form the backbone of query processing in relational databases, allowing users to extract
meaningful information from structured data.
6. GLOSSARY
An operation that returns all possible pairs of tuples from two relations,
Cartesian Product - forming a relation that combines each tuple from the first relation with
every tuple from the second.
Boolean A logical statement that can be evaluated as either true or false, often used
- in filtering conditions in queries and algorithms.
Expression
Arity of the Another term for the degree of a relation, indicating the number of
- attributes or columns in a relational table.
Relation
A type of join operation that combines tuples from two relations based on
Equi-Join - an equality condition between specified attributes.
7. SELF-ASSESSMENT QUESTIONS
SELF-ASSESSMENT QUESTIONS – 1
Fill in the blanks:
1 In Relational Algebra, the Basic operations are the traditional ________________ orations.
2 The Cartesian product can be defined on any __________________relations.
3 If we assume that P and Q are two union-compatible relations, then the union of P and Q is set-
theoretic __________________of P and Q.
4 The difference operation removes _________________tuples from the first relation.
5 The operation selects the common tuples from the two relations.
6 The Cartesian product of two relations is the ___________________of tuples belonging to the two
relations.
7 The union and intersection operations are associative and __________________.
8 The _________________operation, in general, is non-commutative and non- associative.
9 The projection of a relation yields a ______________________subset of the relation.
10 The projection operation may _________the cardinality of the result relation since duplicate
tuples are removed.
11 The selection operation yields a _____________________subset of a given relation.
12 The ______________operator allows the combining of two relations to form a single new relation.
13 The join operation allows the processing of ____________________existing between the operand
relations.
14 Two common and useful variants of the join are the equi-join and the _______________.
15 In the _______________ the comparison operator theta (i=1,2,...,n) is always the equality operator
(=).
16 In the natural join, the _______________ operator is always the equality operator.
8. TERMINAL QUESTIONS
1. Classify the types of operations in relational Algebra.
2. Explain the select operation.
3. Identify the characteristics of select operations in relational algebra.
4. List the Key Characteristics of the PROJECT Operation.
5. Explain the UNION operation with an example.
6. Define the SET DIFFERENCE operation.
7. What are the Key Characteristics of the SET DIFFERENCE Operation?
8. Discuss the CARTESIAN PRODUCT with an example.
9. Classify the derived operations in relational algebra.
10. Compare and contrast the Intersection vs. Join operations in relational algebra.
11. Given two relations R1 and R2, where R1 contains N1 tuples, R2 contains N2 tuples, and N2 > N1
> 0, give the minimum and maximum possible sizes (in tuples) for the result relation produced
by each of the following relational algebra expressions. In each case, state any assumptions
about the schemas for R1 and R2 that are needed to makethe expression meaningful:
R1 υ R2
R1 ∩ R2
R1− R2
R1 x R2
a=5 (R1)
a (R1)
R1/R2
12. Consider the following schema:
Suppliers(sid: integer, sname: string, address: string)
Parts(pid: integer, pname: string, color: string)
Catalog(sid: integer, pid: integer, cost: real)
The key fields are underlined, and the domain of each field is listed after the field name. Thus, sid is
the key for Suppliers, pid is the key for Parts, and sidand pid together form the key for the Catalog. The
Catalog relation lists the prices charged for parts by Suppliers. Write the following queries in
relational algebra.
9. ANSWERS
Derived Operations: These include operations like Division, Intersection, and Join. Derived
operations are combinations or extensions of basic operations used to perform more complex queries.
Refer to section 3 for details.
Answer 2: The Select operation in Relational Algebra is used to filter rows from a relation based on
a specific condition or predicate. The operation is denoted by the Greek letter sigma (σ) followed by
the predicate in parentheses. It retrieves all tuples from a relation that satisfies the given condition,
effectively narrowing down the dataset to only the relevant data.
Example: Suppose we have a relationship between students with attributes such as name, age, and
grade. If we want to select students who are older than 18, we will use the Select operation as follows:
σAge>18(Students)\sigma_{Age > 18}(\text{Students})σAge>18(Students) This operation would
return all tuples (rows) from the Students relation where the Age attribute is greater than 18. Refer
to section 3.1 for details.
Answer 5: The Union operation in Relational Algebra combines the tuples of two relations into a
single relation, eliminating any duplicate tuples. The two relations must be union-compatible,
meaning they must have the same number of attributes, and the corresponding attributes must have
the same domain.
Example: Suppose we have two relations, A and B, with the same attributes:
The Union operation would be: A∪BA \cup BA∪B Result: { (1, 'John'), (2, 'Alice'), (3, 'Bob') } Refer to
section 3.3 for details.
Answer 6: The Set Difference operation in Relational Algebra returns the tuples that are present in
one relation but not in another. The operation is denoted by the minus sign (-). Like the Union
operation, the two relations involved must be union-compatible.
The Set Difference operation A−BA - BA−B would result in Result: { (1, 'John') } Refer to section 3.4
for details.
Answer 7: The Set Difference operation has the following key characteristics:
• Binary Operation: Requires two relations as input.
• Union Compatibility: Both input relations must be union-compatible.
• Result Subtraction: Returns tuples from the first relation that are not present in the second
relation.
• Non-Destructive: Does not modify the original relations but produces a new relation.
• Asymmetric: The operation A−BA - BA−B is not the same as B−AB - AB−A. Refer to section 3.4
for details.
Answer 8: The Cartesian Product operation, denoted by a cross (×), combines two relations by
pairing every tuple from the first relation with every tuple from the second relation. The result is a
relation that contains all possible combinations of tuples from the two input relations.
Example: Suppose we have two relations, R and S:
The Cartesian Product R×SR \times SR×S would result in: Result: { (1, 'John', 'X', 'New York'), (1, 'John',
'Y', 'Los Angeles'), (2, 'Alice', 'X', 'New York'), (2, 'Alice', 'Y', 'Los Angeles') } Refer to section 3.5 for
detail.
• Intersection: Returns only the tuples that are common to both input relations.
• Join: Combines related tuples from different relations based on a specified condition. Refer to
section 4 for details.
Join: This operation combines tuples from two relations based on a specific condition. It is more
versatile than Intersection, as it can link tuples from different relations using a key or a condition.
There are various types of Joins, such as Inner Join, Outer Join, and Natural Join, each serving different
needs in query processing. Refer to section 4.2 for details.
10. REFERENCES
1. Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
2. Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,
McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631
3. Database System Concepts, Abraham Silberschatz, Henry F. Korth, S. Sudarshan,7th Edition,
McGraw-Hill Education, 2019, ISBN-10: 0073523321
DCA1208
DATABASE MANAGEMENT SYSTEM
Unit: 8 - Introduction to SQL 1
DCA1208 : Database Management System
Unit - 8
Introduction to SQL
DCA324
KNOWLEDGE MANAGEMENT
Unit: 8 - Introduction to SQL 2
DCA1208 : Database Management System
TABLE OF CONTENTS
1 Introduction - -
4-5
1.1 Objectives - -
2 SQL Overview - -
2.1 Definition - -
6-8
2.2 Process 1 -
2.3 Features - -
7 Summary - - 24-25
8 Glossary - - 26
10 Terminal Questions - - 31
11 Answers - - 32-35
12 References - - 36
1. INTRODUCTION
In the previous unit, we studied the Basic Operations of Relational Algebra, which forms the
foundation for manipulating relational database data. We saw how the Select operation works for
filtering rows based on conditions, the Project operation for retrieving specific columns, the Union
operation for combining results from two relations, the Set Difference operation for identifying rows
present in one relation but not the other, and the Cartesian Product for pairing rows from two
relations.
Additionally, we looked into the Derived Operations of Relational Algebra, including Division for
queries involving "all" conditions, Intersection for finding standard rows between relations, and Join
for combining related tuples from two relations based on a condition, thereby providing a robust
toolkit for database query formulation.
In the current unit, we will begin by exploring an SQL Overview and its definition as the standard
language for managing and interacting with relational databases. We'll understand the SQL process,
which involves writing, executing, and retrieving results from queries on database servers.
Additionally, we'll examine the key features of SQL, such as its simplicity, flexibility, and compatibility
across different database systems. Moving on to SQL Syntax, we'll study the rules and structure of
writing SQL queries, including using keywords, clauses, operators, and statements to perform various
database operations effectively.
We will also classify and explore the Categories of SQL Commands, which include Data Definition
Language (DDL) for defining database structures, Data Manipulation Language (DML) for modifying
data, Data Control Language (DCL) for managing permissions, and Transaction Control Language
(TCL) for handling transactions. Furthermore, we will study Data Types to understand how SQL
defines the kind of data stored in database tables, such as integers, strings, and dates. Finally, we will
analyse the Basic SQL Commands within these categories, examining their syntax and practical usage.
Topics include DDL commands like CREATE and ALTER, DML commands like INSERT and UPDATE,
DCL commands like GRANT and REVOKE, and TCL commands like COMMIT and ROLLBACK,
providing a comprehensive understanding of database management.
1.1 Objectives
After studying this unit, you should be able to:
• Define SQL
• Describe the SQL Process
• List the Features of SQL
• Explain the SQL syntaxes
• Classify the Categories of SQL Commands
2.1 Definition
SQL (Structured Query Language) is a standardised programming language for managing and
manipulating relational databases. In a Database Management System (DBMS), SQL serves as the
primary language for Data Retrieval, Manipulation, Definition, and Control.
SQL is widely used in DBMSs due to its high level of abstraction, making it easy to use and understand.
It allows users to perform complex queries, manage data relationships, and enforce data integrity
with a simple syntax.
SQL Query: The process starts with an SQL query written by the user, which could be for data
retrieval (e.g., SELECT), updating, or modifying the database.
Query Language Processor: The SQL query is sent to the Query Language Processor. This
component translates the high-level SQL query into an intermediate representation that the DBMS
can understand.
Parser + Optimiser: Within the Query Language Processor, the Parser checks the SQL query syntax
to ensure it is correct. After parsing, the Optimiser takes the query and determines the most efficient
way to execute it, considering factors like available indexes, data distribution, and join methods.
DBMS Engine: The DBMS Engine is the core component executing the query. It coordinates with the
File Manager and Transaction Manager to ensure that the query is executed efficiently and that any
changes are appropriately managed.
File Manager + Transaction Manager: These components work alongside the DBMS Engine:
• File Manager handles the actual storage of data on disk and manages data retrieval.
• Transaction Manager ensures data integrity, handling concurrency, and maintaining
transactions' ACID (Atomicity, Consistency, Isolation, Durability) properties.
Physical Database: This is where the data is physically stored. With the help of the File and
Transaction Managers, the DBMS Engine interacts with the Physical Database to retrieve, update, or
delete data as per the SQL query.
• Data Integrity: SQL supports integrity constraints (PRIMARY KEY, FOREIGN KEY, and UNIQUE)
to maintain data accuracy and consistency.
3. SQL SYNTAX
SQL (Structured Query Language) syntax is the rules and guidelines for writing commands that
interact with relational databases. SQL syntax follows a specific format to define and manipulate data
and the database structure.
1. Keywords: SQL commands comprise keywords and reserved words in SQL. Examples include
SELECT, FROM, WHERE, INSERT, UPDATE, DELETE, CREATE, and DROP. These keywords are
case-insensitive, meaning SELECT is the same as select.
Example:
2. Clauses: SQL statements are often divided into clauses, such as SELECT, FROM, WHERE, GROUP
BY, ORDER BY, and HAVING. Each clause has a specific role in a query.
Example:
3. Operators: SQL operators include comparison operators (e.g., =, >, <, >=, <=, !=), logical
operators (e.g., AND, OR, NOT), and special operators (e.g., IN, BETWEEN, LIKE). They are used
in WHERE clauses to filter results.
Example:
4. Statements: SQL statements are complete instructions that operate on the database. They end
with a semicolon (;). Common statements include SELECT (retrieves data), INSERT (adds data),
UPDATE (modifies data), DELETE (removes data), and DDL statements like CREATE TABLE or
DROP TABLE.
Example:
5. Functions: SQL includes various functions for performing calculations on data, such as
aggregate functions (e.g., SUM(), AVG(), MAX(), MIN(), COUNT()) and string functions (e.g.,
UPPER(), LOWER(), CONCAT()).
• Comments: Comments can be added to SQL code to increase readability. They are ignored
during execution. In SQL, comments can be single-line (--) or multi-line (/* ... */).
SQL commands are organised into several categories based on their specific functions in managing
and manipulating data within a database.
Categories of SQL
Commands
a. Data Definition Language (DDL) is used to define and modify the structure of a database. It
includes commands such as CREATE to create new database objects like tables, views, or
indexes; ALTER to change the structure of existing objects; and DROP to remove objects from
the database. These commands work on the schema level, meaning they affect the organisation
and metadata of the database. DDL commands are auto-committed, meaning their effects are
permanent and cannot be rolled back. This makes them critical for setting up and managing the
backbone of the database system.
b. Data Manipulation Language (DML) manages and manipulates the data stored within
database tables. Key commands include INSERT to add new records, UPDATE to modify existing
data and DELETE to remove records. DML commands work on the data level and directly impact
the rows in a table without changing the table's structure. Unlike DDL, DML changes are not
auto-committed, allowing users to roll back changes and ensuring flexibility and control over
data handling.
c. Data Control Language (DCL) focuses on managing access to the database and ensuring data
security. The primary commands are GRANT, which provides specific privileges to users, and
REVOKE, which removes those privileges. DCL is essential for controlling who can perform
Unit: 8 - Introduction to SQL 11
DCA1208 : Database Management System
actions like querying, inserting, or modifying data. These commands are vital in implementing
role-based access control, ensuring only authorised users can interact with sensitive data and
maintaining database integrity and security.
d. Transaction Control Language (TCL) ensures the consistency and integrity of data by
managing database transactions. The COMMIT command saves all the changes made during a
transaction, making them permanent. ROLLBACK undoes the changes in the current transaction,
reverting the database to its previous state. SAVEPOINT allows the creation of checkpoints
within a transaction, enabling partial rollbacks to specific points. TCL commands are crucial for
maintaining data reliability, particularly in environments where multiple transactions occur
simultaneously.
Each category uniquely contributes to a well-rounded approach to managing databases and ensuring
data security, integrity, and accessibility within SQL-based systems.
To construct and administer the database, there are two significant DDL statements – CREATE and
DROP, which form the backbone of many commands:
CREATE DATABASE to create a database DROP DATABASE to remove a database CREATE TABLE to
create a table DROP TABLE to drop a table CREATE INDEX to create an index on a column DROP
INDEX to drop an index CREATE VIEW to create a view DROP VIEW to drop a view.
Some additional ones, such as ALTER TABLE or MODIFY DATABASE, may be vendor-specific.
To manipulate data in tables directly or through views, we use the four standard DML statements:
o SELECT
o DELETE
o INSERT
o UPDATE
These statements and their functionality are now universally accepted, although the degree to which
these commands support this functionality varies somewhat between products. Compare the
functionality of different implementations of UPDATE, for example.
• Data Control
Concurrency is concerned with how multiple users operate upon the database.
Each user can reflect a transaction's updates using COMMIT or cancel all of them using ROLLBACK.
b) Security
Security has two aspects. The first is the VIEW mechanism. A view of a relation can be created, which
hides sensitive information and defines only the part of the relation that should be visible. A user can
then access this view.
The second is by using the GRANT operation. This grants one or moreaccess rights to perform the
data manipulative operations on the relations.
c) Integrity Constraints
While DCL does not define integrity constraints, it plays a crucial role in enforcing security policies
that indirectly support data integrity. For instance, by granting or revoking permissions, a database
administrator can control which users can insert, update, or delete data, thereby helping maintain the
data's overall integrity.
For example, one can specify that an attribute of a relation will not take onnull values.
In SQL, data types define the data that can be stored in each table column. They help ensure that data
is stored efficiently and organised. These are the basic data types used in SQL. Choosing the
appropriate data type for each column ensures efficient storage and data integrity in the database.
1. Numeric Data Types: These are used for storing numbers, both integers and decimals.
o INT: Stores whole numbers (integers) without decimal points. Commonly used for age,
quantity, etc.
o DECIMAL (p, s): Used for fixed-point numbers, where p is the precision (total number
of digits) and s is the scale (number of digits after the decimal).
2. Character Data Types: These store strings, including text and alphanumeric characters.
o CHAR(n): Fixed-length character type, where n specifies the length. If the entered
string is shorter than n, it’s padded with spaces.
o TEXT: Used for large strings, typically for paragraphs or blocks of text.
3. Date and Time Data Types: These store dates, times, and timestamps.
o DATETIME: Stores both date and time in the format YYYY-MM-DD HH:MM: SS.
o TIMESTAMP: Like DATETIME but also records the time zone or server’s time.
DDL commands are crucial for creating, organising, and maintaining a database's structure. They help
set up the foundational layout, define relationships between tables, and ensure data storage
alignment with the application's requirements.
DDL
Commands
1. CREATE: This command creates new database objects like tables, indexes, or views.
• Example (Creating a Table):
This command creates a table named Employees with columns for EmployeeID, FirstName, LastName,
Department, and Salary.
2. ALTER: The ALTER command modifies the structure of an existing database object, such as
adding, deleting, or modifying columns in a table.
• Example (Adding a Column):
This modifies the Salary column in the Employees table to increase the precision to 12 digits.
The ALTER command in SQL is used to modify the structure of an existing database table. If
you want to delete a column from a table using SQL, use the ALTER TABLE command followed
by the DROP COLUMN statement.
Here is an example of how to use the ALTER TABLE command to delete a column:
For instance, if you have a table named Employees and you want to delete a column named
Age, you would write:
3. DROP: This command deletes database objects like tables, indexes, or views. Note that
dropping a table permanently removes all the associated data and structure.
• Example (Dropping a Table):
This command deletes the Employees table along with all its data.
Data Definition Language (DDL) is crucial in database management systems (DBMS) as it defines,
modifies, and manages the structure of database objects, including tables, indexes, views, and
schemas. Here’s why DDL plays an essential role in DBMS:
o Enforces Data Integrity: DDL commands allow the application of constraints (e.g., primary
key, foreign key, unique) on tables to enforce data integrity and consistency. This ensures that
only valid and accurate data enters the database, reducing the likelihood of data anomalies and
errors, particularly in large or critical systems.
o Facilitates Database Maintenance: With DDL, database administrators can modify the
schema over time to adapt to evolving requirements. ALTER and DROP commands make it
possible to manage changes without disturbing the entire database, enabling smooth upgrades
and maintenance of the database structure.
o Enhances Query Performance: Creating indexes through DDL improves data retrieval speed,
which can significantly enhance the performance of database queries. Well-designed indexes
allow faster data access, making DDL essential for optimising database performance, especially
when dealing with large datasets.
o Supports Schema Evolution: DDL enables database schema evolution, allowing structures to
change as business needs grow. Adding new tables, columns, or constraints and adjusting data
relationships over time ensures that the database remains relevant and adaptable without
requiring a complete rebuild.
o Auto-Commit Nature for Security: DDL commands are auto-committed, meaning changes
are immediately saved to the database. This is particularly important for ensuring the database
structure is secure and stable, as schema modifications cannot be rolled back, providing a
controlled environment for structural changes.
DML
Commands
DCL
Commands
GRANT REVOKE
1. GRANT Command:
o The GRANT command provides specific privileges to a user or role. These privileges
include SELECT, INSERT, UPDATE, DELETE, or other database actions.
Syntax:
Example:
This example grants the user "John" the ability to SELECT and INSERT records in the "Employees"
table.
2. REVOKE Command:
• The REVOKE command is used to withdraw previously granted privileges from a user or role.
Syntax:
Example:
This example revokes the INSERT privilege on the "Employees" table from the user "John."
Importance of DCL:
• Access Control: Ensures that only authorised users can perform specific operations, such as
reading or modifying sensitive data.
• Data Security: Protects sensitive information from unauthorised access and potential
breaches.
• Role-Based Permissions: Allows the creation of roles with specific privileges, simplifying
user management in large systems.
• Auditing and Compliance: This helps track who has access to what, making it easier to
comply with data protection regulations.
• Allow managers to update salary details only while enabling employees to view their records.
TCL
Commands
1. COMMIT:
• The COMMIT command permanently saves all the changes made during the current
transaction in the database. Once a COMMIT is executed, the changes cannot be undone,
making it essential to ensure data correctness before committing.
Syntax:
Example:
2. ROLLBACK:
• The ROLLBACK command undoes all changes made during the current transaction, reverting
the database to its previous state. It is used in cases where errors occur or the user decides not
to save the changes made during the transaction.
Syntax:
Example:
3. SAVEPOINT:
• The SAVEPOINT command is used to create intermediate points within a transaction. This
allows rolling back to a specific savepoint instead of undoing the entire transaction.
Syntax:
Example:
In this example, changes made to the "Marketing" department are preserved, while changes to the
"IT" department are undone.
Transaction Control Language (TCL) commands are critical in managing database transactions and
ensuring data integrity, consistency, and reliability in a Database Management System (DBMS).
• Ensures Data Consistency: TCL commands like COMMIT and ROLLBACK help ensure that
the database remains consistent by saving valid changes or reverting to a previous stable state
in case of errors. This is crucial for maintaining accurate and reliable data.
• Supports Atomicity in Transactions: TCL enforces the "all-or-nothing" principle of
transactions. All operations within a transaction are completed successfully (COMMIT), or
none are applied (ROLLBACK), thereby supporting the atomicity property of ACID compliance.
• Facilitates Error Recovery: TCL commands enable users to recover from errors by rolling
back incomplete or incorrect changes made during a transaction. This ensures the database is
not left unstable or partially updated after an error.
• Enables Fine-Grained Control with SAVEPOINT: The SAVEPOINT command allows the
creation of intermediate checkpoints within a transaction. Users can roll back to specific
savepoints rather than undoing the entire transaction, providing greater flexibility in
managing complex database operations.
• Improves Data Reliability in Concurrent Environments: In multi-user environments, TCL
commands help manage simultaneous transactions, preventing issues like data corruption or
conflicts. They ensure that operations are executed in a controlled manner, maintaining
database reliability.
7. SUMMARY
• SQL (Structured Query Language) is a standard programming language used to interact with
relational databases. It enables users to define, manipulate, query, and control database data.
SQL typically involves writing queries that retrieve or modify data from database tables, with
the database engine interpreting and executing these queries. SQL's versatility and ease of use
make it essential for database management and data-driven applications. Key features of SQL
include its standardisation, ability to handle large volumes of data, support for complex
queries, and compatibility with various database systems.
• SQL syntax provides the structural rules for writing SQL queries. Commands are typically
written in clauses such as SELECT, FROM, WHERE, and ORDER BY, with a semicolon (;) used
to terminate each query. Keywords are case-insensitive, and operators like =, <, and > help
filter or compare data. Functions and aggregate operators (e.g., SUM(), AVG()) are integral to
SQL syntax for performing calculations and analysing data. Comments can also be added to
increase query readability. Properly understanding SQL syntax is crucial for writing accurate
and efficient database queries.
• SQL commands are divided into four main categories: Data Definition Language (DDL), Data
Manipulation Language (DML), Data Control Language (DCL), and Transaction Control
Language (TCL). DDL defines and manages database structure using commands like CREATE,
ALTER, and DROP. DML handles data within tables using commands like INSERT, UPDATE,
and DELETE. DCL controls permissions and user access with GRANT and REVOKE, while TCL
ensures transaction integrity and data consistency using commands like COMMIT, ROLLBACK,
and SAVEPOINT.
• Data types in SQL define the nature of data that can be stored in a column, ensuring consistency
and accuracy. Common data types include INT for integers, VARCHAR for variable-length
strings, DATE for date values, and FLOAT for decimals. Each database management system
may have variations or additional data types. Choosing the correct data type is essential for
optimising storage and ensuring efficient data processing, especially in large databases.
• SQL commands are the building blocks of database operations, categorised based on their
functionality. DDL commands define the structure and schema, such as creating tables. DML
commands focus on data manipulation, enabling users to add, update, or delete records. DCL
commands manage user permissions to safeguard database security. TCL commands ensure
transaction consistency and data reliability, allowing users to save or revert changes during a
transaction. Understanding and applying these commands is critical for efficiently managing
databases and ensuring data integrity.
8. GLOSSARY
Query Language A component of DBMS that interprets and translates SQL queries into
-
Processor low-level instructions for the database engine to execute.
Data Storage Organising data in physical storage to optimise space utilisation and
-
Alignment improve retrieval efficiency.
9. SELF-ASSESSMENT QUESTIONS
11. ANSWERS
Self-Assessment Questions
1. (b) Managing and manipulating relational databases
2. (c) Parser
3. (c) Concurrency and maintaining ACID properties
4. (a) Managing query optimisation
5. (b) Transaction Control
6. (b) GRANT and REVOKE
7. (b) To filter results based on specific conditions
8. (c) CONCAT
9. (b) Data Definition Language (DDL)
10. (b) Revert the database to its previous state within a transaction
11. (c) CREATE TABLE
12. (c) Reflects the updates of a transaction permanently
13. (a) CHAR(8)
14. (c) FLOAT stores numbers with floating-point precision, whereas DECIMAL stores numbers
with fixed-point precision.
15. (c) DDL is used to define, modify, and manage the structure of database objects.
16. (c) DROP
17. (b) GRANT
18. (c) To withdraw previously granted privileges
19. (c) The COMMIT command permanently saves all changes made during the current
transaction.
20. (c) To create intermediate checkpoints within a transaction for partial rollbacks.
Answer 2: The SQL process involves writing SQL queries that are sent to a database server. The
server interprets and executes the queries and returns the results to the client application. Refer to
section 2.2 for more details.
Answer 3: SQL is simple and user-friendly, supports data manipulation and definition, provides data
integrity and security, and is platform-independent, making it versatile for database management.
Refer to section 2.3 for more details.
Answer 4:
• Keywords: Reserved words used to perform specific tasks (e.g., SELECT, INSERT).
• Clauses: Components of statements (e.g., WHERE, GROUP BY).
• Operators: Symbols for comparisons (e.g., =, <, >).
• Statements: Full commands (e.g., SELECT column_name FROM table_name). Refer to section
2.4 for more details.
Answer 5:
• Transaction Control Language (TCL): COMMIT, ROLLBACK. Refer to section 4 for more
details.
Answer 6: Common data types include INT, VARCHAR, CHAR, DATE, FLOAT, and BOOLEAN. They
define the kind of data that can be stored in a column. Refer to section 5 for more details.
Answer 7:
Example: ALTER TABLE Employee ADD Salary FLOAT; Refer to section 6 for more details.
Answer 8: DDL commands define and modify the structure of database objects, ensure data integrity
through constraints, create relationships between tables, and manage schema effectively. Refer to
section 6 for more details.
• DELETE: Removes data from tables. Refer to section 6 for more details.
Answer 10:
Answer 11: DCL commands manage user access and privileges, ensure data security, restrict
unauthorised data manipulation, and define roles for multi-user environments. Refer to section 6
for more details.
Answer 12: TCL commands manage transactions in a database, ensuring atomicity, consistency, and
reliability. They include COMMIT, ROLLBACK, and SAVEPOINT. Refer to section 6 for more details.
Answer 13: COMMIT: Saves all changes permanently. Example: UPDATE Employee SET Salary = 5000
WHERE ID = 1; COMMIT;
Answer 14: TCL ensures data integrity and consistency, supports error recovery, enforces atomicity,
and manages concurrent transactions in multi-user environments. Refer to section 6 for more
details.
Answer 15:
12. To Calculate Relative Frequency, count occurrences of each colour (Red, Blue, Green, Yellow)
12. REFERENCES
and calculate the percentage of each colour relative to the total number of responses. You can derive
• Fundamentals of Database Systems, Ramez Elmasri and Shamkant B. Navathe, 7th edition,
Pearson, 2016, ISBN-10: 0133970779.
• Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke,3rd Edition,
McGraw-Hill Education,2002 (for the 3rd Edition), ISBN-13: 978-0072465631
• Database System Concepts, Abraham Silberschatz, Henry F. Korth, S.Sudarshan,7th Edition,
McGraw-Hill Education, 2019, ISBN-10: 0073523321