0% found this document useful (0 votes)
2K views50 pages

DBMS Material 2023

The document outlines the course material for a Database Management System course at SRI Vidya Mandir Arts & Science College, detailing the syllabus and key concepts covered in various units. Topics include database concepts, design concepts, normalization, SQL, PL/SQL, and the advantages of using a Database Management System (DBMS). It emphasizes the importance of databases in data organization, integrity, security, and decision-making for businesses and organizations.

Uploaded by

K. Pycho
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2K views50 pages

DBMS Material 2023

The document outlines the course material for a Database Management System course at SRI Vidya Mandir Arts & Science College, detailing the syllabus and key concepts covered in various units. Topics include database concepts, design concepts, normalization, SQL, PL/SQL, and the advantages of using a Database Management System (DBMS). It emphasizes the importance of databases in data organization, integrity, security, and decision-making for businesses and organizations.

Uploaded by

K. Pycho
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 50

SRI VIDYA MANDIR ARTS & SCIENCE COLLEGE

(AUTONOMOUS)
Katteri – 636 902, Uthangarai, Krishnagiri District
(An Autonomous College Affiliated to Periyar University, Salem)
(Recognized under Status 2(f) & 12(B) of the UGC Act 1956)
Accredited by NAAC with ‘A’ Grade [3.27]

PG and Research Department of Computer Science and Applications

III & II BSc CS, BCA


Course Material on

Course Name: DATABASE MANAGEMENT SYSTEM


Course Code: 23UCS5C06/23UCA5C06
Syllabus

UNIT – I
Database Concepts: Database Systems - Data vs Information - Introducing the database -File
system - Problems with file system – Database systems. Data models - Importance - Basic
Building Blocks - Business rules - Evolution of Data models - Degrees of Data Abstraction.
UNIT – II
Design Concepts: Relational database model - logical view of data-keys -Integrity rules -
relational set operators - data dictionary and the system catalog - relationships -data redundancy
revisited -indexes - codd's rules. Entity relationship model - ER diagram.
UNIT – III
Normalization of Database Tables: Database tables and Normalization – The Need for
Normalization –The Normalization Process – Higher level Normal Form. Introduction to SQL:
Data Definition Commands – Data Manipulation Commands – SELECT Queries – Additional
Data Definition Commands – Additional SELECT Query Keywords – Joining Database
Tables.
UNIT – IV
Advanced SQL:Relational SET Operators: UNION – UNION ALL – INTERSECT -
MINUS.SQL Join Operators: Cross Join – Natural Join – Join USING Clause – JOIN ON
Clause – Outer Join.Sub Queries and Correlated Queries: WHERE – IN – HAVING – ANY
and ALL – FROM. SQL Functions: Date and Time Function – Numeric Function – String
Function – Conversion Function
UNIT – V
PL/SQL:A Programming Language: History – Fundamentals – Block Structure – Comments –
Data Types – Other Data Types – Variable Declaration – Assignment operation –Arithmetic
operators.Control Structures and Embedded SQL: Control Structures – Nested Blocks – SQL
in PL/SQL – Data Manipulation – Transaction Control statements. PL/SQL Cursors and
Exceptions: Cursors – Implicit Cursors, Explicit Cursors and Attributes – Cursor FOR loops –
SELECT…FOR UPDATE – WHERE CURRENT OF clause – Cursor with Parameters –
Cursor Variables – Exceptions – Types of Exceptions.
Text Books
1.Coronel, Morris, Rob, "Database Systems, Design, Implementation and Management", Ninth
Edition.
2. Nilesh Shah, "Database Systems Using Oracle", 2nd edition, Pearson Education India, 2016
UNIT – I

2
DATABASE CONCEPTS
1. Database Systems
A database system is a collection of interrelated files and a set of programs that allow
users to access and modify these files.
The need for efficiency has led designers to use complex data structures to represent
data in the database. The low-level physical view of data is too detailed for most users.
A major purpose of a database system is to provide users with an abstract or logical
view of the data. That is, the system hides certain details of how the data are stored and
maintained.
Underlying the structure of a database is the data model: a collection of conceptual tools
for describing data, data relationships, data semantics, and consistency constraints.
Key characteristics of Database Systems
 Ease of Use
 Less Duplication
 Multiple Layouts and Presentations
 Limited Redundancy
 Data Recovery and Backup Plan
 Data Security
 Reduces Storage Space
 Maintaining Integrity
1.1 Why Databases?
1.1.1 Database
A database is an organized collection of structured information, or data, typically stored
electronically in a computer system.
It's designed to hold and manage large amounts of data efficiently, allowing multiple
users to access and query it simultaneously.
Databases are crucial for efficient data storage, organization, and retrieval, especially
for large and complex datasets.
They provide a structured way to manage data, making it easier to access, modify, and
analyze information.
This is vital for businesses and organizations that rely on data for operations, decision-
making, and analytics.
In essence, databases are the backbone of modern data management, providing the
infrastructure needed to store, organize, protect, and utilize data effectively.
1.1.2 Why databases are important

3
 Data Organization and Retrieval
Databases offer a structured way to organize and manage data, making it easier to find
and retrieve specific information.
This structure allows for efficient queries and analysis, even with vast amounts of data.
 Data Integrity and Consistency
Databases often have built-in rules and validation mechanisms to ensure data accuracy
and consistency.
This helps prevent errors and inconsistencies, maintaining the reliability of stored
information.
 Scalability and Efficiency
Databases are designed to handle large volumes of data, scaling to millions or even
billions of records.
They provide efficient ways to manage data growth and user demands without
sacrificing performance.
 Data Security and Privacy
Databases offer robust security features like user authentication, access control, and
encryption to protect sensitive data.
This is crucial for complying with privacy regulations and protecting information from
unauthorized access.
 Data Analytics and Decision-Making
Modern databases support analytics tools to identify patterns, trends, and insights from
data.
This data-driven approach enables organizations to make informed decisions and
improve business outcomes.
 Centralized Data Management
Databases provide a centralized location for storing and managing data, reducing
redundancy and improving consistency.
This simplifies data sharing and collaboration among different departments and
applications.
 Improved Efficiency and Productivity
Databases streamline data management processes, freeing up resources for more
strategic tasks.
They also enable faster data retrieval and analysis, leading to more efficient decision-
making.
 Compliance and Reporting

4
Databases help organizations meet regulatory requirements and generate reports based
on stored data.
They provide a reliable source of information for auditing and compliance purposes.
1.2 Data vs Information
Data and Information are important concepts in the world of computing and decision-
making.
1.2.1 Data
Data is the raw material that can be processed for any computing machine.
Data is a raw and unorganized fact that is required to be processed to make it
meaningful. It can be considered as facts and statistics collected together for reference or
analysis.
Data or raw input is necessary to recognize trends and patterns and facilitate decision
making.
Data is defined as unstructured information such as text, observations, images, symbols,
and descriptions.
For example − Employee name, Product name, Name of the student, Marks of the
student, Mobile number, Image etc.
1.2.2 Information
` Information is the data that has been converted into more useful or intelligent form.
Information is defined as structured, organized, and processed data. It gives context to
the facts and facilitates decision-making.
For example: Report card sheet.
The information is needed for the following reasons
 To gain knowledge about the surroundings.
 To keep the system up to date.
 To know about the rules and regulations of the society.
1.2.3 Knowledge
The body of information and facts about a specific subject.
Knowledge implies familiarity, awareness, and understanding of information as it
applies to an environment.
A key characteristic is that new knowledge can be derived from old knowledge
Knowledge results from combining expertise and insights to produce a more profound
understanding.
Data Example
 Temperature Readings such as “72°F”, “68°F”, “75°F”.
 Student Grades like “85”, “92”, “78”.

5
 Stock Prices such as “$50.25”, “$48.90”, “$52.10”.
Information Example
 Weather Report: Based on the temperature reading a weather report can be generated.
 Grade Average: Based on the student grades, the average grade of class can be derived.
 Market Analysis: The stock market showed a slight increase today derived from stock
prices.
1.2.4 Differences
The major differences between Data and Information are as follows
Data Information
Data is the raw fact. It is a processed form of data.
It is not significant to a business. It is significant to a business.
It is unorganized It is organized data.
This is the primary level of intelligence. It is a secondary level of intelligence.
May or may not be meaningful. Always meaningful.
Understanding is difficult. Understanding is easy.
Data does not depend on information. Information depends on data.
Data is not sufficient for decision Information is sufficient for decision
making. making.

Some key points


 Data constitutes the building blocks of information.
 Information is produced by processing data.
 Information is used to reveal the meaning of data.
 Accurate, relevant, and timely information is the key to good decision making.
 Good decision making is the key to organizational survival in a global environment.
Data management is a discipline that focuses on the proper generation, storage, and
retrieval of data.
1.3 Introducing the database
A database is a shared, integrated computer structure that stores a collection of related
data.
A database contains two types of data
 End-user data—that is, raw facts of interest to the end user
 Metadata, or data about data, through which the end-user data is integrated and
managed.
The metadata describes the data characteristics and the set of relationships that links the
data found within the database.

6
For example, the metadata component stores information such as the name of each data
element, the type of values (numeric, dates, or text) stored on each data element, and whether
the data element can be left empty.
In short, metadata presents a more complete picture of the data in the database.
A database management system (DBMS) is a collection of programs that manages
the database structure and controls access to the data stored in the database.
1.3.1 Role and Advantages of the DBMS
The DBMS serves as the intermediary between the user and the database. The database
structure itself is stored as a collection of files, and the only way to access the data in those
files is through the DBMS.
The DBMS manages the interaction between the End User and the Database.

The DBMS receives all application requests and translates them into the complex
operations required to fulfill those requests.
The DBMS hides much of the database’s internal complexity from the application
programs and users.
The application program might be written by a programmer using a programming
language, such as Visual Basic. NET, Java, or C#, or it might be created through a DBMS
utility program.
Having a DBMS between the end user’s applications and the database offers some
important advantages.
 The DBMS enables the data in the database to be shared among multiple
applications or users.
 The DBMS integrates the many different users’ views of the data into a single
all-encompassing data repository.

7
 The DBMS helps make data management more efficient and effective.
1.3.2 Advantages of DBMS
 Improved data sharing: The DBMS helps create an environment in which end users
have better access to more and better-managed data.
 Improved data security: A DBMS provides a framework for better enforcement of
data privacy and security policies.
 Better data integration: Wider access to well-managed data promotes an integrated
view of the organization’s operations.
 Minimized data inconsistency: Data inconsistency exists when different versions of
the same data appear in different places. The probability of data inconsistency is greatly
reduced in a properly designed database.
 Improved data access: The DBMS makes it possible to produce quick answers to ad
hoc queries.
From a database perspective, a query is a specific request issued to the DBMS for data
manipulation.
 Improved decision making: Better-managed data and improved data access make
it possible to generate better-quality information, on which better decisions are made.
 Increased end-user productivity: The availability of data empowers end users to make
quick, informed decisions that can make the difference between success and failure in
the global economy.
1.3.3 Types of Databases
A DBMS can be used to build many different types of databases. Each database stores
a particular collection of data and is used for a specific purpose.
Databases can be classified by the number of users supported, where the data is located,
the type of data stored, the intended data usage, and the degree to which the data is structured.
 Classification of Database based on the number of users
Single-user database
A single-user database supports only one user at a time. A single-user database that runs
on a personal computer is called a desktop database.
Multiuser database
A multiuser database supports multiple users at the same time. When the multiuser
database supports a relatively small number of users or a specific department within an
organization, it is called a workgroup database.
Enterprise database
When the data base is used by the entire organization and supports many users across
many departments, the database is known as an enterprise database.

8
 Classification of Database based on the Location
Centralized database
A database that supports data located at a single site is called a centralized database.
Distributed database
A database that supports data distributed across several different sites is called a
distributed database.
Both centralized and decentralized (distributed) databases require a well-defined
infrastructure (hardware, operating systems, network technologies, etc.) to implement and
operate the database.
Cloud database
A cloud database is a database that is created and maintained using cloud data services,
such as Microsoft Azure or Amazon AWS.
These services, provided by third-party vendors, provide defined performance
measures (data storage capacity, throughput, and availability) for the database, but do not
necessarily specify the underlying infrastructure to implement it.
Classifying data bases is according to the type of data stored
General-purpose database
A database that contains a wide variety of data used in multiple disciplines.
For example, a census database that contains general demographic data.
Discipline-specific databases
Discipline-specific databases contain data focused on specific subject areas.
The data in this type of database is used mainly for academic or research purposes
within a small set of disciplines.
Examples
Geographic information system (GIS) data bases that store geospatial.
Medical databases that store confidential medical history data.
Classifying databases is based on the intended data usage
Operational database
A database that is designed primarily to support a company’s day-to-day operations is
classified as an operational database, also known as an online transaction processing (OLTP)
database, transactional database, or production database.

Analytical database
A database focuses primarily on storing historical data and business metrics used
exclusively for tactical or strategic decision making.

9
Analytical databases comprise two main components: a data warehouse and an online
analytical processing.
Data Warehouse
A specialized database that stores historical and aggregated data in a format optimized
for decision support.
The data warehouse contains historical data obtained from the operational databases as
well as data from other external sources.
OLAP
Online analytical processing (OLAP) is a set of tools that work together to provide an
advanced data analysis environment for retrieving, processing, and modeling data from the data
warehouse.
Business intelligence
A set of tools and processes used to capture, collect, integrate, store, and analyze data
to support business decision making.
Classification of Database based on reflecting the degree to which the data is structured
Unstructured data
Unstructured data is data that exists in its original, raw state; that is, in the format in
which it was collected.
Structured data
Structured data is data that has been formatted to facilitate storage, use, and information
generation.
Semistructured data
Data that has already been processed to some extent.
An XML database supports the storage and management of semistructured XML data.
Social media
Social media refers to web and mobile technologies that enable “anywhere, anytime,
always on” human interactions.
Websites such as Google, Facebook, Twitter, and LinkedIn capture vast amounts of
data about end users and consumers.
NoSQL (Not only SQL)
NoSQL databases are non-relational database management systems that store and
retrieve data differently than traditional relational databases.
They are designed to handle large volumes of unstructured or semi-structured data.
1.4 Evolution of File System Data Processing
The evolution of file system data processing can be helpful in understanding the data
access limitations that databases attempt to overcome.

10
Reasons for studying file systems:
 Complexity of database design is easier to understand
 Understanding file system problems helps to avoid problems with DBMS
Systems.
 Knowledge of file system is useful for converting file system to database
system.
File systems typically composed of collection of file folders, each tagged and kept in
cabinet.
File System are organized for expected use.
Contents of each file folder are logically related.
1.4.1 Types of File Systems
Manual File Systems
Served as a data repository for small data collections.
Keeping track of data in a manual file system became more difficult as organizations
grew and as reporting requirements became more complex.
Computerized File Systems
Data processing (DP) specialist is the person responsible for developing and managing
a computerized file processing system.
Data processing (DP) specialist converted computer file structure from manual system.
Initially, Computer file systems resembled manual systems.
A number of files increased, file system evolved
Each file used its own application program to store, retrieve and modify data.
Each file was owned by individual or department that commissioned its creation.
Simple File System

1.5 Problems with file system


The file system method of organizing and managing data was a definite improvement
over the manual system, and the file system served a useful purpose in data management
The file system method serves two major purposes

11
• Understanding the shortcomings of the file system enables to understand the
development of modern databases.
• Many of the problems are not unique to file systems. Failure to understand such
problems is likely to lead to their duplication in a database environment, even though database
technology makes it easy to avoid them.
File system data processing in DBMS faces problems like
 Data redundancy,
 Inconsistency,
 Difficulty in accessing data,
 Limited sharing
These issues arise because files are not designed for complex data relationships and
efficient management of large datasets.
 Data Redundancy
The same data can be stored in multiple files, leading to wasted storage space and
potential inconsistencies when updates are made.
 Data Inconsistency
Redundant data can lead to situations where different copies of the same data are
conflicting, causing inaccuracies and errors.
 Difficulty in Accessing Data
File systems often lack sophisticated querying capabilities, making it hard to retrieve
specific information or perform complex data analysis.
 Limited Data Sharing
Data is often stored in separate files and may not be easily shared or accessed by
different applications or users without significant effort.
 Data Integrity Issues
File systems may not have robust mechanisms to enforce data integrity rules, such as
data types or constraints, potentially leading to errors and corruption.
 Security Problems
Protecting sensitive data in a file system can be challenging, especially when multiple
users or applications need access to the same data.

1.6 Database systems


The database system consists of logically related data stored in a single logical data
repository.

12
The database represents a major change in the way end-user data is stored, accessed,
and managed.
1.6.1 Contrasting Database and File Systems

The database’s DBMS provides numerous advantages over file system management,
by making it possible to eliminate most of the file system’s data inconsistency, data anomaly,
data dependence, and structural dependence problems.
The current generation of DBMS software stores not only the data structures, but also
the relationships between those structures and the access paths to those structures all in a central
location.
The current generation of DBMS software also takes care of defining, storing, and
managing all required access paths to those components.
The DBMS is one of several crucial components of a database system. The DBMS is
referred to as the database system’s heart.
1.6.2 The Database System Environment
The term database system refers to an organization of components that define and
regulate the collection, storage, management, and use of data within a database environment.
The database system is composed of the five major parts are,
 Hardware
 Software
 People
 Procedures
 Data

13
Hardware
Hardware refers to all of the system’s physical devices, including computers (PCs,
tablets, workstations, servers, and supercomputers), storage devices, printers, network devices
(hubs, switches, routers, fiber optics), and other devices (automated teller machines, ID readers,
and so on).
Software
Although the most readily identified software is the DBMS itself, three types of
software are needed to make the database system function fully: operating system software,
DBMS software, and application programs and utilities.
 Operating system software manages all hardware components and makes it possible
for all other software to run on the computers.
Examples of operating system software include Microsoft Windows, Linux, Mac OS,
UNIX, and MVS.
 DBMS software manages the database within the database system.
Some examples of DBMS software include Microsoft’s SQL Server, Oracle
Corporation’s Oracle, Oracle’s MySQL, and IBM’s DB2.
 Application programs and utility software are used to access and manipulate data in
the DBMS and to manage the computer environment in which data access and
manipulation take place.
Application programs are most commonly used to access data within the database to
generate reports, tabulations, and other information to facilitate decision making.
Utilities are the software tools used to manage the database system’s computer
components.

14
People
This component includes all users of the database system.
On the basis of primary job functions, five types of users can be identified in a database
system:
 System administrators
 Database administrators
 Database designers
 System analysts and programmers,
 End users
 System administrators oversee the database system’s general operations.
 Database administrators, also known as DBAs, manage the DBMS and ensure that
the database is functioning properly.
 Database designers design the database structure. They are, in effect, the database
architects. If the database design is poor, even the best application programmers and
the most dedicated DBAs cannot produce a useful database environment.
 System analysts and programmers design and implement the application programs.
They design and create the data-entry screens, reports, and procedures through which
end users access and manipulate the database’s data.
 End users are the people who use the application programs to run the organization’s
daily operations.
For example, sales clerks, supervisors, managers, and directors are all classified as end
users.
High-level end users employ the information obtained from the database to make
tactical and strategic business decisions.
Procedures
Procedures are the instructions and rules that govern the design and use of the database
system.
Procedures play an important role in a company because they enforce the standards by
which business is conducted within the organization and with customers.
Procedures also help to ensure that companies have an organized way to monitor and
audit the data that enter the database and the information generated from those data.
Data
The word data covers the collection of facts stored in the database. Because data is the
raw material from which information is generated.

15
1.6.3 DBMS Functions
A DBMS performs several important functions that guarantee the integrity and
consistency of the data in the database.
Most of those functions are transparent to end users, and most can be achieved only
through the use of a DBMS. They include,
 Data dictionary management
 Data storage management
 Data transformation and presentation
 Security management
 Multiuser access control
 Backup and recovery management
 Data integrity management
 Database access languages
 Application programming interfaces
 Database communication interfaces.
Data dictionary management
The DBMS stores definitions of the data elements and their relationships (metadata) in
a data dictionary.
In turn, all programs that access the data in the database work through the DBMS. The
DBMS uses the data dictionary to look up the required data component structures and
relationships.
Any changes made in a database structure are automatically recorded in the data
dictionary, thereby freeing us from having to modify all of the programs that access the changed
structure.
Data storage management
The DBMS creates and manages the complex structures required for data storage, thus
relieving us from the difficult task of defining and programming the physical data
characteristics.
A modern DBMS provides storage not only for the data but for related data-entry forms
or screen definitions, report definitions, data validation rules, procedural code, structures to
handle video and picture formats, and so on.
Data storage management is also important for database performance tuning.
Performance tuning are an activities that make a database perform more efficiently in
terms of storage and access speed.

16
Data transformation and presentation
The DBMS transforms entered data to conform to required data structures.
The DBMS relieves from distinguishing between the logical data format and the
physical data format. That is, the DBMS formats the physically retrieved data to make it
conform to the user’s logical expectations.
Security management
The DBMS creates a security system that enforces user security and data privacy.
Security rules determine which users can access the database, which data items each user can
access, and which data operations (read, add, delete, or modify) the user can perform.
This is especially important in multiuser database systems.
All database users may be authenticated to the DBMS through a username and
password or through biometric authentication such as a fingerprint scan.
The DBMS uses authenticated information to assign access privileges to various
database components such as queries and reports.
Multiuser access control
The DBMS uses sophisticated algorithms to ensure that multiple users can access the
database concurrently without compromising its integrity to provide data integrity and data
consistency.
Backup and recovery management
The DBMS provides backup and data recovery to ensure data safety and integrity.
Current DBMS systems provide special utilities that allow the DBA to perform routine
and special backup and restore procedures.
Recovery management deals with the recovery of the database after a failure, such as a
bad sector in the disk or a power failure.
Data integrity management
The DBMS promotes and enforces integrity rules, thus minimizing data redundancy
and maximizing data consistency.
The data relationships stored in the data dictionary are used to enforce data integrity.
Ensuring data integrity is especially important in transaction-oriented database systems.
Database access languages
The DBMS provides data access through a query language.
A query language is a nonprocedural language that lets the user specify what must be
done without having to specify how.
Structured Query Language (SQL) is the de facto query language and data access
standard supported by the majority of DBMS vendors.
Application programming interfaces

17
The DBMS also provides application programming interfaces to procedural languages
such as COBOL, C, Java, Visual Basic.NET, and C#.
In addition, the DBMS provides administrative utilities used by the DBA and the
database designer to create, implement, monitor, and maintain the database.
Database communication interfaces
A current-generation DBMS accepts end user requests via multiple, different network
environments.
For example, the DBMS might provide access to the database via the Internet through
the use of web browsers such as Mozilla Firefox, Google Chrome, or Microsoft Internet
Explorer.
In this environment, communications can be accomplished in several ways:
 End users can generate answers to queries by filling in screen forms through
their preferred web browser.
 The DBMS can automatically publish predefined reports on a website.
 The DBMS can connect to third-party systems to distribute information via
email or other productivity applications.
Advantages of Database Systems
 Data Integrity and Consistency
 Reduced Data Redundancy
 Improved Data Security
 Enhanced Data Sharing
 Improved Decision Making
 Data Independence
 Data Abstraction
Disadvantages of Database Systems
 Increased Complexity
 Higher Costs
 Potential Performance Overhead
 Scalability Challenges
 Impact of Failures
 Need for Specialized Personnel

18
2. Data models

2.1 Introduction to Data Modeling and Data Model


Data modeling, the first step in designing a database, refers to the process of creating a
specific data model for a determined problem domain.
A data model is a relatively simple graphical representation of more complex real-world
data structures.
A data model in DBMS is a set of concepts and rules that are used to describe and
organize the data in a database.
It defines the structure, relationships, and constraints of the data, and provides a way to
access and manipulate the data.
Within the database environment, a data model represents data structures and their
characteristics, relations, constraints, transformations, and other constructs with the purpose of
supporting a specific problem domain.
An implementation-ready data model should contain at least the following components:
 A description of the data structure that will store the end-user data.
 A set of enforceable rules to guarantee the integrity of the data.
 A data manipulation methodology to support the real-world data
transformations.
2.1.1 Key aspects of data models in DBMS
 Abstraction: They provide a simplified view of the data, hiding the complex details of
storage and retrieval.
 Organization: They define how data is structured, including entities, attributes, and
relationships.
 Relationships: They specify how different data elements are connected to each other.
 Constraints: They define rules and limitations that data must adhere to, ensuring data
integrity.
 Access and Manipulation: They provide a way to interact with the data, enabling users
to query, update, and manage information.
2.2 Importance of Data Model
Data models are crucial for organizing and structuring data within an organization,
facilitating collaboration, improving business processes, and reducing errors.
They provide a visual representation of data elements and their relationships, enabling
clear communication and consistent data management.
2.2.1 Key benefits of data modeling include
 Improved communication

19
 Enhanced collaboration
 Optimized data storage and retrieval
 Reduced errors and inconsistencies
 Improved data quality and integrity
 Streamlined business processes
 Cost savings
 Facilitated data analysis and reporting
2.3 Data Model Basic Building Blocks
A data model is a structure of the data that contains all the required details of the data
like the name of the data, size of the data, relationship with other data and constraints that are
applied on the data. It is a communication tool.
A data model is essential in order to store the database in a sorted manner.
It will provide the interaction between the system analyst, designer and application
programmer.
It improves the understanding of designing of the database in which the organization is
interested.
A data model constitutes of building blocks. They are,
 Entities
 Attributes
 Relationships
 Constraints
2.3.1 Entities
An entity represents a particular type of object in the real world, which means an entity
is “distinguishable” that is, each entity occurrence is unique and distinct.
An entity is a person, place, thing, concept or event about which data will be collected
and stored.
Entities are represented by a rectangle box containing the entity name in it.
Example: Student, employee.
2.3.2 Attributes
It is the set of characteristics of an entity. It is represented by an ellipse symbol with
attribute name on it.
Example: 1. A student has attributes like name, roll number, age and much more.
2. A customer entity would be described by attributes such as customer last name,
customer first name, customer phone number, customer address, and customer credit limit.
2.3.3 Relationships
It describes the association between two entities.

20
It is represented using diamond symbol containing relationship name with it.
The data model generally uses three kinds of relationships:
 One to many
 Many to many
 One to one
One-to-many (1:M or 1..*)
A painter creates many different paintings, but each is painted by only one painter.
Thus, the painter (the “one”) is related to the paintings (the “many”). Therefore, database
designers label the relationship “PAINTER paints PAINTING” as 1:M.
Many-to-many (M:N or *..*)
An employee may learn many job skills, and each job skill may be learned by many
employees. Database designers label the relationship “EMPLOYEE learns SKILL” as M:N.
One-to-one (1:1 or 1..1)
A retail company’s management structure may require that each of its stores be
managed by a single employee. In turn, each store manager, who is an employee, manages only
a single store. Therefore, the relationship “EMPLOYEE manages STORE” is labeled 1:1.
2.3.4 Constraints
A constraint is a restriction placed on the data.
Constraints are important because they help to ensure data integrity.
Constraints are normally expressed in the form of rules:
 An employee’s salary must have values that are between 6,000 and 350,000.
 A student’s GPA must be between 0.00 and 4.00.
 A student can take a maximum of 2 books from the library is applied as a
constraint on the student database.
2.4 Business rules
Business rules, which are constraints and logic defining how a business operates, are
crucial in data models to ensure data accuracy, consistency, and validity.
A business rule is a brief, precise, and unambiguous description of a policy, procedure,
or principle within a specific organization.
Business rules are applied to any organization, large or small a business, a government
unit, a religious group, or a research laboratory that stores and uses data to generate
information.
Business rules derived from a detailed description of an organization’s operations help
to create and enforce actions within that organization’s environment.
Properly written business rules are used to define entities, attributes, relationships, and
constraints.

21
These rules govern data behavior and relationships, impacting database design,
implementation, and maintenance.
They can be expressed in various formats, like natural language, formal notation, or
code.
Purpose
Business rules ensure data quality by enforcing constraints, policies, and logic that
define how the data should be handled.
Examples of business rules are as follows
 A customer may generate many invoices.
 An invoice is generated by only one customer.
 A training session cannot be scheduled for fewer than 10 employees or for more
than 30 employees.
2.4.1 Discovering Business Rules
The main sources of business rules are company managers, policy makers, department
managers, and written documentation such as a company’s procedures, standards, and
operations manuals.
A faster and more direct source of business rules is direct interviews with end users.
The database designer’s job is to reconcile such differences and verify the results of the
reconciliation to ensure that the business rules are appropriate and accurate.
The process of identifying and documenting business rules is essential to database
design for several reasons:
 It helps to standardize the company’s view of data.
 It can be a communication tool between users and designers.
 It allows the designer to understand the nature, role, and scope of the data.
 It allows the designer to understand business processes.
 It allows the designer to develop appropriate relationship participation rules and
constraints and to create an accurate data model.
2.4.2 Translating Business Rules into Data Model Components
Business rules set the stage for the proper identification of entities, attributes,
relationships, and constraints.
Translating business rules into data model components involves identifying the core
entities and relationships within those rules.
Business rules typically map to entities, while indicating relationships map to
associations between those entities. This process ensures the data model accurately reflects the
business's logic and processes.
Steps

22
1. Identify Core Entities
 Nouns: Look for nouns in the business rules that represent key concepts or objects of
interest within the business. For example, in the rule "A customer can place an order,"
"customer" and "order" are nouns and potential entities.
 Data Objects: Consider what data is being stored and manipulated related to these
nouns.
2. Identify Relationships
 Verbs: Analyze verbs (active or passive) that connect nouns in the business rules to
understand the relationships between entities.
 Cardinality: Determine the nature of the relationships (e.g., one-to-one, one-to-many,
many-to-many) based on how frequently one entity is associated with another.
3. Translate into Data Model Components
 Entities: Create entity representations in the data model for each identified core noun
or data object.
 Relationships: Define relationships between entities based on the identified verbs and
their associated cardinalities.
 Attributes: Within each entity, define attributes (data characteristics) that are relevant
to that entity's role in the business rules.
Examples:
Business Rule
"An order must be placed by a customer."
 Entities: Order, Customer
 Relationship: One-to-many (a customer can place many orders)
 Attributes (examples):
 Order: OrderID, CustomerID, OrderDate, TotalAmount, etc.
 Customer: CustomerID, Name, Address, etc.
2.5 Evolution of Data models
The better data management has led to several models that attempt to resolve the
previous model’s critical shortcomings and to provide solutions to ever-evolving data
management needs.

Era Model Key Features Use Cases

Pre-1960s File-based Flat files, no DBMS Simple storage


features

23
1960s Hierarchical Tree structure Banking, telecom

1970s Network Graph structure, pointers Manufacturing,


reservations

1980s–Now Relational Tables, SQL, ACID Business apps, ERP,


CRM

1990s Object-Oriented Object integration, CAD, multimedia


encapsulation

2000s–Now NoSQL Schema-less, horizontal Web apps, IoT, Big Data


scaling

2010s–Now NewSQL SQL + scalability + Cloud-native apps,


ACID analytics

2.5.1 Hierarchical Model

The hierarchical Model is one of the oldest models in the data model which was
developed by IBM, in the 1960s to manage large amounts of data for complex manufacturing
projects.
One of the first and most popular Hierarchical Model is Information Management
System.
In a hierarchical model, data are viewed as a collection of tables, or segments that form
a hierarchical relation.
Hierarchical structure contains levels (or) segments.
In this, the data is organized into a tree-like structure where each record consists of one
parent record and many children.
The relationship can be defined in the form of parent child type.
Example

Advantages
 The design of the hierarchical model is simple

24
 Provides Data Integrity
 Data sharing is feasible
 Fast access
Disadvantages
 Implementation is complex
 Difficult Updates
 Maintenance is difficult
2.5.2 Network Model
The network model was created to represent complex data relationships more
effectively than the hierarchical model, to improve database performance, and to impose a
database standard.
Network model consist of collection of records connected through links.
The relationship can be defined in the form of links and it handles many-to-many
relations. This itself states that a record can have more than one parent.

Advantages
 Easy to design the Network Model
 The model can handle one-one, one-to-many, many-to-many relationships.
 Based on standards and conventions.
Disadvantages
 Complexity since the records is based on pointers and graphs.
 Changes in the database is not easy

2.5.3 Relational Model


The relational model was introduced in 1970 by E. F. Codd of IBM.
The relational model uses a collection of tables to represent both data and the
relationship among those data. Each table has multiple columns and each column has a unique
name.
The relational model is an example of a record-based model.
The database is structured in fixed format records of several types so the name is
Record-based models.

25
Each table contains records of a particular type. Each record type defines a fixed
number of fields or attributes.
Relational Database Management System (RDBMS)
A collection of programs that manages a relational database.
The RDBMS software translates a user’s logical requests (queries) into commands that
physically locate and retrieve the requested data.

Advantages
 Simplicity, mathematical foundation.
 SQL became a standard query language.
2.5.4 Object-Oriented Model (1980s – 1990s)
Object-Oriented Model which integrates database capabilities with object-oriented
programming.
In the Object-Oriented Data Model, data and their relationships are contained in a single
structure which is referred to as an object in this data model.
In this, real-world problems are represented as objects with different attributes. All
objects have multiple relationships between them.
Basically, it is a combination of Object Oriented programming and a Relational
Database Model.
Components of OO data model
 An object is an abstraction of a real-world entity.
 Attributes describe the properties of an object.
 A class is a collection of similar objects with shared structure (attributes) and
behavior (methods).
 Inheritance is the ability of an object within the class hierarchy to inherit the attri
butes and methods of the classes above it.
 Object-oriented data models are typically depicted using Unified Modeling
Language (UML) class diagrams.
 Strengths: Good for complex data (e.g., CAD, multimedia).
 Weaknesses: Limited commercial adoption; complex integration.
2.5.5 Entity Relationship Model (E-R Model)
The model was developed by Peter Chen in 1976.

26
The E-R data model is based on a perception of a real world that consists of a collection
of basic objects called entities and relationships among these objects.
Entity relationship model based on
 Entities and their attributes
 Relationship among entities
The entity-relationship model is widely used in database design.
A data model that describes relationships (1:1, 1:M, and M:N) among entities at the
conceptual level with the help of ER diagrams.
ER models are normally represented in an entity relationship diagram (ERD), which
uses graphical representations to model database components.
Components of E-R Model
 Entity: An entity was defined as anything about which data will be collected and stored.
 Attributes: Each entity consists of a set of attributes that describes particular
characteristics of the entity.
 Relationships: Relationships describe associations among data. Most relationships
describe associations between two entities.
One to Many (1:M) Relationship: A Painter can paint many paintings.

One to One (1:1) Relationship: An employee manages one Store.

2.5.6 NoSQL Models (2000s – Present)


NoSQL is a large-scale distributed database system that stores structured and
unstructured data in efficient ways.
NoSQL databases provide distributed, fault-tolerant databases for processing
nonstructured data.
A new generation of database management systems that is not based on the traditional
relational database model.
Types of NoSQL databases
 Key-Value Stores (e.g., Redis)
 Document Stores (e.g., MongoDB)
 Column-Family Stores (e.g., Cassandra)
 Graph Databases (e.g., Neo4j)

27
NoSQL to refer to a new generation of databases that address the specific challenges of
the Big Data era and have the following general characteristics:
 They are not based on the relational model and SQL, hence the name NoSQL.
 They support distributed database architectures.
 They provide high scalability, high availability, and fault tolerance.
 They support very large amounts of sparse data.
 They are geared toward performance rather than transaction consistency.
2.5.7 NewSQL (2010s – Present)
NewSQL is a data model which combines SQL-based querying with NoSQL like
scalability.
Examples: Google Spanner, CockroachDB
Strengths: ACID compliance + distributed performance
2.6 Degrees of Data Abstraction

2.6.1 The External Model


The external model is the end users’ view of the data environment.
The term end users refer to people who use the application programs to manipulate the
data and generate information.
End users usually operate in an environment in which an application has a specific
business unit focus.
ER diagrams will be used to represent the external views. A specific representation of
an external view is known as an external schema.
Each external schema includes the appropriate entities, relationships, processes, and
constraints imposed by the business unit.
Advantages of external view
 It is easy to identify specific data required to support each business unit’s operations.

28
 It makes the designer’s job easy by providing feedback.
 It helps to ensure security constraints in the database design.
 It makes application program development much simpler.
2.6.2 The Conceptual Model
The conceptual model represents a global view of the entire database by the entire
organization.
The conceptual model integrates all external views (entities, relationships, constraints,
and processes) into a single global view of the data in the enterprise.
Also known as a conceptual schema, it is the basis for the identification and high-level
description of the main data objects.
The most widely used conceptual model is the ER model. The ERD is used to
graphically represent the conceptual schema.
Advantages
 It provides a macro level view of the data environment that is relatively easy to
understand.
 The conceptual model is independent of both software and hardware.
Software independence means that the model does not depend on the DBMS software
used to implement the model.
Hardware independence means that the model does not depend on the hardware used
in the implementation of the model.
The term logical design refers to the task of creating a conceptual data model that could
be implemented in any DBMS.
2.6.3 The Internal Model
Once a specific DBMS has been selected, the internal model maps the conceptual model
to the DBMS.
The internal model is the representation of the database as “seen” by the DBMS.
The internal model requires the designer to match the conceptual model’s
characteristics and constraints to those of the selected implementation model.
An internal schema depicts a specific representation of an internal model, using the
database constructs supported by the chosen database.
A relational database was chosen to implement the internal model. Therefore, the
internal schema should map the conceptual model to the relational model constructs.
In particular, the entities in the conceptual model are mapped to tables in the relational
model.
The internal schema is expressed using SQL, the standard language for relational
databases.

29
Logical independence
A condition in which the internal model can be changed without affecting the
conceptual model.
2.6.4 Physical model
Physical model is a model in which physical characteristics such as location, path, and
format are described for the data.
It operates at the lowest level of abstraction, describing the way data is saved on storage
media such as magnetic, solid state, or optical media.
It requires the definition of both the physical storage devices and the (physical) access
methods required to reach the data within those storage devices, making it both software and
hardware dependent.
The physical model is both hardware- and software dependent.
Physical independence
Physical independence is a condition in which the physical model can be changed
without affecting the internal model.
The levels of data abstraction are summarized as

UNIT – II

30
DESIGN CONCEPTS
3. Relational Database Model
3.1 Logical view of data
In a relational database model, the logical view of data refers to how data is organized
and structured conceptually, regardless of how it's physically stored.
This view emphasizes the relationships between data entities and how they're
represented in tables (relations).
Data is organized into rows (tuples) and columns (attributes), with each row
representing a specific instance of an entity and each column representing a property of that
entity.
3.1.1 Tables and Their Characteristics
The logical view of the relational database is facilitated by the creation of data
relationships based on a logical construct known as a relation.
A table is a persistent representation of a logical relation that is, a relation whose
contents can be permanently saved for future use.
A table contains a group of related entity occurrences that is, an entity set.
For example, a STUDENT table contains a collection of entity occurrences, each
representing a student.
Characteristics of a Relational Table
 Relatons: A table is perceived as a two-dimensional structure composed of rows
and columns.
 Tuple: Each row in a relation represents a single record or instance of an entity.
 Attributes: Each table column represents an attribute, and each column has a
distinct name. It represents a specific property or characteristic of the entity.
 Cell: Each intersection of a row and column represents a single data value.
 All values in a column must conform to the same data format.
 Each column has a specific range of values known as the attribute domain.
 Keys: Each table must have an attribute or combination of attributes that uniquely
identifies each row in a table (e.g., primary key) and to establish relationships
between tables (e.g., foreign key).
 Data Types: Data is stored in specific data types (e.g., text, numbers, dates) within
the columns of the table.

3.1.3 Benefits of the Logical View

31
 Abstraction
 Data Integrity
 Flexibility
 Simplified Access
3.2 Keys
In the relational model, keys are important because they are used to ensure that each
row in a table is uniquely identifiable.
They are also used to establish relationships among tables and to ensure the integrity of
the data.
A key consists of one or more attributes that determine other attributes.
Key which ensures no two tuples in a relation are allowed to have exactly the same
value for all attributes.
3.2.1 Dependencies
The role of a key is based on the concept of determination.
Determination is the state in which knowing the value of one attribute makes it possible
to determine the value of another.
The idea of determination is not unique to the database environment.
Functional dependence
A specific terminology and notation is used to describe relationships based on
determination. The relationship is called functional dependence, which means that the value of
one or more attributes determines the value of one or more other attributes.
The standard notation for representing the relationship between STU_NUM and
STU_LNAME is as follows:
STU_NUM → STU_LNAME
In this functional dependency, the attribute whose value determines another is called
the determinant or the key.
The attribute whose value is determined by the other attribute is called the dependent.
STU_ NUM is the determinant and STU_LNAME is the dependent.
STU_NUM functionally determines STU_LNAME, and STU_LNAME is functionally
dependent on STU_NUM.
3.2.2 Types of Keys
A composite key is a key that is composed of more than one attribute.
An attribute that is a part of a key is called a key attribute.

For example,

32
STU_NUM → STU_GPA
(STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE) → STU_HRS
In the first functional dependency, STU_NUM is an example of a key composed of
only one key attribute.
In the second functional dependency, (STU_LNAME, STU_FNAME, STU_INIT,
STU_PHONE) is a composite key composed of four key attributes.
Super key
A super key is a set of one or more attributes that uniquely identity a tuple in the
relation.
For example, the customer_id attribute of the relation customer is sufficient to
distinguish one customer tuple from another. Thus, customer_id is a super key.
Candidate key
Candidate key is a minimal superkey that is, a key that does not contain a subset of
attributes that is itself a superkey.
Primary key
Primary key to denote a candidate key that is chosen by the database designer to
identifying entities within an entity set.
Foreign key
A column in one table that references the primary key of another table, establishing a
link between the tables.
A relation schema, say r1 may include among its attributes the primary key of another
relation schema say r2. This attribute is called a foreign key from r1 referencing r2.
The relation r1 is also called the referencing relation of the foreign key dependency and
r2 is called the referenced relation of the foreign key.
A database schema, along with primary key and foreign key dependencies can be
depicted pictorially by schema diagrams.
Each relation appears as a box, with the attributes listed inside it and the relation name
above it.

If there are primary key attributes, a horizontal line crosses the box, with the primary
key attributes listed above the line.
33
Foreign key dependencies appear as arrows from the foreign key attributes of the
referencing relation to the primary key of the referenced relation.
Secondary key
An attribute or combination of attributes used strictly for data retrieval purposes.
3.3 Integrity Rules
Relational database integrity rules are very important to good database design.
RDBMSs enforce integrity rules automatically.
Integrity rules are a set of constraints used in Database Management Systems to ensure
that the data in a database is accurate, consistent and reliable.
These rules helps in maintaining the quality of data by ensuring that the processes like
adding, updating or deleting information do not harm the integrity of the database.
Integrity rules act as guidelines to protect the database and keep it error-free.
3.3.1 Types of Constraints
 NOT NULL
 UNIQUE
 DEFAULT
 CHECK
 Key Constraints – PRIMARY KEY, FOREIGN KEY
 Domain constraints
 Mapping constraints
NOT NULL
NOT NULL constraint makes sure that a column does not hold NULL value.
When we don’t provide value for a particular column while inserting a record into a
table, it takes NULL value by default.
By specifying NULL constraint, we can be sure that a particular column(s) cannot have
NULL values.
Example
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (235),
PRIMARY KEY (ROLL_NO)
);
UNIQUE
UNIQUE Constraint enforces a column or set of columns to have unique values.

34
If a column has a unique constraint, it means that particular column cannot have
duplicate values in a table.
Example
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);
DEFAULT:
The DEFAULT constraint provides a default value to a column when there is no value
provided while inserting a record into a table.
Example
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35),
PRIMARY KEY (ROLL_NO)
);
CHECK
This constraint is used for specifying range of values for a particular column of a table.
When this constraint is being set on a column, it ensures that the specified column must have
the value falling in the specified range.
Example
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL CHECK(ROLL_NO >1000) ,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35),
PRIMARY KEY (ROLL_NO)
);
In the above example we have set the check constraint on ROLL_NO column of
STUDENT table. Now, the ROLL_NO field must have the value greater than 1000.

35
Key Constraints
PRIMARY KEY
Primary key uniquely identifies each record in a table. It must have unique values and
cannot contain nulls.
In the below example the ROLL_NO field is marked as primary key, that means the
ROLL_NO field cannot have duplicate and null values.
Example
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);
FOREIGN KEY
Foreign keys are the columns of a table that points to the primary key of another table.
They act as a cross-reference between tables.
Domain Constraints
Each table has certain set of columns and each column allows a same type of data, based
on its data type. The column does not accept values of any other data type.
Domain constraints are user defined data type and we can define them like this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY
KEY / FOREIGN KEY / CHECK / DEFAULT)
Mapping Constraints
Mapping constraints can be explained in terms of mapping cardinality.
Mapping cardinalities or cardinality ratios express the number of entities to which
another entity can be associated via a relationship set.
Mapping cardinalities are most useful in describing binary relationship sets.
For a binary relationship set R between entity sets A and B, the binary cardinality must
be one of the following.
 One to one relation
 One to many relation
 Many to one relation
 Many to many relation
One-to-one

36
An entity in A is associated with at most one entity in B and an entity in B is associated
with at most one entity in A.

One-to-many
An entity in A is associated with any number of entities in B. An entity in B can be
associated with at most one entity in A.

Many-to-one
An entity in A is associated with at most one entity in B. An entity in B can be associated
with any number of entities in A.

Many-to-many
An entity in A associated with at most one entity in B. An entity in B can be associated
with any number of entities in A.

The appropriate mapping cardinality for a particular relationship set depends on the
real-world situation that the relationship set is modeling.

3.4 Relational Set Operators

37
Set operations are used to join the result of two (or) more SELECT statements. The set
operations are,
1. Union Operation (∪)
2. Intersect Operation (∩)
3. Except Operation (-)
Example

4.3.1 Union Operation


Union operation is used to combine the result of two (or) more SELECT statement. It
removes duplicate rows from its result set.
Example
Select * from first
Union
Select * from second
Result

If you want to retain all duplicates, we must write union all in place of union,
Example
Select * from first
Union all
Select * from second
Result

4.3.2 Intersect Operation


Intersect operation is used to combine two SELECT statement, but it only returns the
records which are common from both SELECT statements.
Example

38
Select * from first
Intersect
Select * from second
Result

The intersect operation automatically eliminates duplicates.


If you want to retain all duplicates, we must write intersect all in the place of intersect
Example
Select * from first
Intersect all
Select * from second
4.3.3 Except Operation
Except operation combines result of two statements and return only that result which
belongs to first set of result.
Example
Select * from first
Except
Select * from second
Result

The except operation automatically eliminates duplicates.


If you want to retain all duplicates, we must write except all in the place of except.
Example
Select * from first
Except all
Select * from second
3.5 Data Dictionary and the System Catalog
In DBMS, data dictionary and system catalogs are both repositories of metadata, but
they serve different purposes.
A data dictionary is a more comprehensive, often human-managed document that
provides detailed information about data elements, their relationships, and usage. Conversely,
a system catalog is a DBMS-maintained database that stores technical details about the
database structure and its objects, like tables and columns.
Data Dictionary

39
The data dictionary provides a detailed description of all tables in the database created
by the user and designer.
The data dictionary contains at least all of the attribute names and characteristics for
each table in the system.
In short, the data dictionary contains metadata - data about data.
Data dictionary is a DBMS component that stores metadata. Thus, the data dictionary
contains the data definition as well as their characteristics and relationships.
A data dictionary may also include data that are external to the DBMS.
It also known as an information resource dictionary.
The data dictionary is sometimes described as “the database designer’s database”
because it records the design decisions about tables and their structures.
It may include business definitions, data types, formats, constraints, and other metadata
related to data elements.
Purpose
Facilitates communication and understanding of data among different users, including
technical and business users.
System catalog
The system catalog can be described as a detailed system data dictionary that describes
all objects within the database.
It includes data about table names, table’s creator and creation date, number of columns
in each table, data type corresponding to each column, index filenames, index creators,
authorized users, and access privileges.
The system catalog is actually a system-created database whose tables store the
user/designer-created database characteristics and contents.
The system catalog automatically produces database documentation.
Purpose
Enables the DBMS to perform operations like query optimization, data access, and
security enforcement.
As new tables are added to the database, that documentation also allows the RDBMS
to check for and eliminate homonyms and synonyms.
Homonym
The use of the same name to label different attributes.
Homonyms generally should be avoided.
Some relational software automatically checks for homonyms and either alerts the user
to their existence or automatically makes the appropriate adjustments.
Synonym

40
A synonym is the opposite of a homonym, and indicates the use of different names to
describe the same attribute.
Difference between Data Dictionary and System catalog
Data Dictionary System catalog
It Maintained by humans It managed by the DBMS itself
It may cover multiple databases or even It specific to a single database
entire data ecosystems
Used by both technical and business users Used by the DBMS
It help humans understand and manage data It help the DBMS manage and access data

3.6 Relationships within the Relational Database


In a relational database, relationships are associations between tables, enabling the
linking of data across different tables.
These relationships are typically established using foreign keys, which reference the
primary keys of other tables.
This allows for structured and meaningful connections between data, ensuring data
integrity and facilitating complex queries.
Types of Relationships
 One-to-one (1:1)
Each record in one table relates to only one record in another table.
Example: An employee table and an employee_details table. Each employee record has
one corresponding details record.
The 1:1 Relationship between Professor and Department

A professor can chair only one department.


 One-to-many (1:M)
One record in one table can relate to multiple records in another table.
Example: A customer table and an orders table. One customer can place multiple
orders, but each order is associated with only one customer.

The 1:M Relationship between Painter and Painting

41
Each painting was created by one and only one painter, but each painter could have
created many paintings.
 Many-to-many (M:M)
Multiple records in one table can relate to multiple records in another table. This type
of relationship often requires a linking table.
Example: A student table and a course table. One student can enroll in multiple courses,
and one course can have multiple students enrolled.
The M:M Relationship between Student and Class

Each CLASS can have many STUDENTs, and each STUDENT can take many
CLASSes.
Benefits of Relationships
 Data Integrity
 Enable powerful queries
 Structured and organized data storage
3.7 Data Redundancy Revisited
Data redundancy refers to the duplication of data in a database or system.
Data redundancy occurs when the same piece of data is stored in multiple locations,
such as multiple databases, files, or systems.
Data redundancy leads to data anomalies, which can destroy the effectiveness of the
database.
The relational database control data redundancies by using common attributes that are
shared by tables, called foreign keys.
The proper use of foreign keys is crucial to controlling data redundancy, although they
do not totally eliminate the problem because the foreign key values can be repeated many times.
However, the proper use of foreign keys minimizes data redundancies.

Causes of data redundancy

42
 Data duplication: Entering same data multiple times.
 Lack of normalization: Poor database design.
 Data integration: Combining data from multiple sources.
Consequences of data redundancy
 Data inconsistencies: Updates to one copy may not reflect in others.
 Storage waste: Duplicate data occupies unnecessary storage space.
 Data maintenance challenges: Updating redundant data can be complex.
Solutions to minimize data redundancy
 Data normalization: Design databases to eliminate redundancy.
 Use data validation: Ensure data consistency.
 Implement data integration strategies: Synchronize data across systems.
 Single source of truth: Identify a primary source for each data element.
Benefits of minimizing data redundancy
 Improved data integrity: Ensures data consistency and accuracy.
 Reduced storage costs: Eliminates unnecessary storage usage.
 Simplified data maintenance: Updates are easier to manage.
3.8 Indexes
An index is an orderly arrangement used to logically access rows in a table.
The index is used to locate a needed item quickly and making retrieval a quick and
simple matter.
From a conceptual point of view, an index is composed of an index key and a set of
pointers.
The index key is the index’s reference point. Also, an index key can be composed of
one or more attributes.
An index is an ordered arrangement of keys and pointers. Each key points to the location
of the data identified by the key.
A table can have many indexes, but each index is associated with only one table.
For example, suppose we want to look up all of the paintings created by a given painter
in the database.
Without an index, we must read each row in the PAINTING table and see if the
PAINTER_NUM matches the requested painter.
However, if we index the PAINTER table and use the index key PAINTER_ NUM,
which need to look up the appropriate PAINTER_NUM in the index and find the matching
pointers.

Components of an Index

43
Note that the first PAINTER_NUM index key value (123) is found in records 1, 2, and
4 of the PAINTING table. The second PAINTER_NUM index key value (126) is found in
records 3 and 5 of the PAINTING table.
Purpose of using an index in DBMS
 An index can be used to retrieve data more efficiently.
 Indexes play an important role in DBMSs for the implementation of primary keys.
 Faster query execution.
 Improve overall database performance.

3.9 Codd's Relational Database rules


In 1985, Dr. E. F. Codd published a list of 12 rules to define a relational database
system. He published the list out of concern that many vendors were marketing products as
“relational” even though those products did not meet minimum relational standards.
Dr. Codd’s rules are a frame of reference for what a truly relational database should be.
Bear in mind that even the dominant database vendors do not fully support all 12 rules.
Dr. Codd’s 12 Relational Database Rules
Rule Rule Name Description
1 Information All information in a relational database must be
logically represented as column values in rows
within tables.
2 Guaranteed access Every value in a table is guaranteed to be accessible
through a combination of table name, primary key
value, and column name.
3 Systematic treatment of nulls Nulls must be represented and treated in a
systematic way, independent of data type

44
4 Dynamic online catalog The metadata must be stored and managed as
based on the relational model ordinary data—that is, in tables within the database;
such data must be available to authorized users
using the standard database relational language.
5 Comprehensive data The relational database may support many
sublanguage languages; however, it must support one well-
defined, declarative language as well as data
definition, view definition, data manipulation,
integrity constraints, authorization, and transaction
management.
6 View updating Any view that is theoretically updatable must be
updatable through the system.
7 High-level insert, update, and The database must support set-level inserts,
delete updates, and deletes.
8 Physical data independence Application programs and ad hoc facilities are
logically unaffected when physical access methods
or storage structures are changed.
9 Logical data independence Application programs and ad hoc facilities are
logically unaffected when changes are made to the
table structures that preserve the original table
values.
10 Integrity independence All relational integrity constraints must be definable
in the relational language and stored in the system
catalog, not at the application level.
11 Distribution independence The end users and application programs are
unaware of and unaffected by the data location.
12 Non subversion If the system supports low-level access to the data,
users must not be allowed to bypass the integrity
rules of the database.
13 Rule zero All preceding rules are based on the notion that to
be considered relational, a database must use its
relational facilities exclusively for management.

45
4. Entity relationship model
4.1 ER diagram
Entity relationship model [E-R model] is a way of graphically representing the logical
relationship of entities (or) object to create database.
The E-R model is very useful in mapping the meaning and interactions of real world
enterprises onto a conceptual schema.
The E-R data model employs three basic notations,
1) Entity sets
2) Relationship sets
3) Attributes.
The E-R data model consists of collection of basic objects called entities and
relationship among these objects.

Entity
An entity can be a real world object that can be easily identified.
Example
School databaseStudents, Teachers, Classes and Courses
Entity Sets
An entity set is a collection of similar types of entities. An entity is a “thing” or “object”
in the real world that is distinguishable from all other objects.
An entity set may contain entities with attributes sharing similar values.
Example
Student set may contain all the students of a school.
Teacher set may contain all the teachers of a school.
A bank database consists of two entity sets, customer and loan.

46
Relationship
The association among entities is called relationship.
Example
Employee work at a department.
Student enrolls in a course.
Here, work at and enrolls are called relationship.
Relationship sets
A set of relationships of similar type is called relationship set.
If E1, E2,....En are entity sets, then a relationship set R is a subset of
{ (e1,e2,.....,en) | e1 ϵ E1, e2 ϵ E2,....en ϵ En }
Where (e1, e2....en)is a relationship.
Attributes
Entities are represented by means of their properties, called attributes. All attributes
have values.
Example
A student entity may have,

A set of permitted values for each attribute is called the domain or value set of that
attributes.
Types of attributes
 Simple attribute
 Composite attribute
 Derived attribute
 Single value attributes
 Multi value attributes
Simple attribute
Simple Attribute cannot be divided in to subparts.
Composite Attribute
Composite Attribute can be divided in to subparts.
For example, an attribute name could be structured as a composite attributes consisting
of first_name, middle_name, last_name.

47
In the composite attribute address, its component attribute street can be further divided
in to street_number, street_name and apartment_number.
Derived Attribute
The value for this type of attribute can be derived from the values of other related
attributes or entities.
For example, the customer entity set has an attribute age that indicates the customer’s
age.
If the customer entity set also has an attribute date of birth we can calculate age from
date of birth thus, age is a derived attribute.
An attribute takes a null value when an entity does not have a value for it.
Single-valued and Multi-valued Attributes
The attribute have a single value for a particular entity.
For example, the loan_number attribute for a specific loan entity refers to only one loan
number, such attribute are said to be Single valued.
An attribute has a set of values for a specific entity.
An employee entity set with the attribute phone_number. ie, an employee may have
zero, one or several phone numbers and different employees may have different numbers of
phones this types of attribute is said to be Multivalued.
E – R diagram can express the overall logical structure of a database graphically.
E-R diagram consist of the following major components,
Entity
Rectangle box are used to represent entity and name of the entity is written inside the
rectangle.

Weak entity
Weak entity is represented by double rectangles.

CLASS

48
Attributes
An ellipse shape is used to represent the attribute and name of the attribute is written
inside the oval shape.

REG. NO

Multivalue Attributes
Multivalue attributes are represented by double ellipses.

PHONE

Derived Attributes
Derived attributes are represented by dashed ellipse.

AGE

Primary key
Primary key attribute is represented by the attribute name with underline.

STUDENT ID

Relationship
A diamond shape is used to represent the relationship between the entities.
Relationship name will be written inside the shape.
The Lines are used to link attributes to entity sets and entity sets to relationship sets.

Weak relationship
Weak relationship is represented by double diamond shape.

Relationship types
One to one relation

49
One to many relation

Many to one relation

Many to many relation

Example

50

You might also like