Introduction to Database Management Systems

UNIT-I

Introduction to DBMS (Database Management System)

A Database Management System (DBMS) is a software system designed to
efficiently manage, store, retrieve, and manipulate data in databases. It
provides an interface between the user and the database, ensuring data
integrity, security, and consistency. DBMS allows for the creation,
maintenance, and administration of databases, and it simplifies complex data
management tasks.

Key features of a DBMS include:

• Data Redundancy Control: Avoids duplication of data.
• Data Integrity: Ensures accuracy and consistency.
• Data Security: Protects sensitive data through access controls.
• Concurrent Access: Manages multiple users accessing data
simultaneously.

Data Modeling for a Database

Data modeling is the process of defining and organizing the structure of a
database, focusing on how data elements relate to each other. This conceptual
framework helps in visualizing the data flow and supports the design of a
well-structured database.

The primary goal of data modeling is to create a blueprint for the database that
clearly defines:

• Entities: Objects or concepts that represent real-world things (e.g.,
Employee, Customer).
• Attributes: Properties that describe entities (e.g., Employee Name,
Customer Address).
• Relationships: How entities are related to each other (e.g., An Employee
works for a Department).

Three-Level Architecture of DBMS


The three-level architecture of DBMS defines the way data is viewed and
manipulated in a database. It comprises three levels:

1. Internal Level (Physical Level):
o Describes how data is physically stored in the system (e.g., storage
devices, indexing).
o It focuses on the optimization of data storage and retrieval.
2. Conceptual Level (Logical Level):
o Describes what data is stored and the relationships between those
data.
o It hides the complexities of the internal level from users and
provides a higher level of abstraction.
3. External Level (View Level):
o Describes how users interact with the data (e.g., through user
interfaces or specific queries).
o Users only see what is relevant to them without needing to know
the details of the internal or conceptual levels.

This architecture ensures data independence, meaning that changes at one
level (e.g., physical storage) do not affect other levels (e.g., user access).
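
As a minimal sketch (table, column, and view names are invented for
illustration), the conceptual level can be thought of as the base table
definition and the external level as a view built on top of it:

CREATE TABLE Employee (
    Emp_ID   INT PRIMARY KEY,
    Emp_Name VARCHAR(100),
    Salary   DECIMAL(10,2),
    Dept_ID  INT
);

-- The view exposes only what one group of users needs, hiding Salary and
-- all physical storage details (the internal level) entirely.
CREATE VIEW EmployeeDirectory AS
SELECT Emp_ID, Emp_Name, Dept_ID
FROM Employee;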

Components of a DBMS

A DBMS consists of several components that work together to provide efficient
data management:

1. Database Engine: The core service for accessing and processing the
database.
2. Database Schema: The logical structure that defines the organization of
data.
3. Query Processor: Translates user queries into executable commands.
4. Transaction Manager: Ensures database consistency by handling
transactions (group of operations).
5. Security Manager: Manages authentication, access control, and
encryption.
6. Data Dictionary: Stores metadata about the database, such as schema,
constraints, etc.
7. Backup and Recovery: Ensures the database can be restored to a
previous state in case of failure.
Introduction to Data Models

A data model is a conceptual framework that defines the structure,
organization, and relationships of data within a database. It provides a way to
describe the data and its relationships in a formalized manner. There are
several types of data models used in DBMS:

Hierarchical Model

The Hierarchical Model organizes data in a tree-like structure, where each
record has a single parent and potentially many children. It is simple and fast
for operations that follow the tree structure but is not flexible for representing
complex relationships.

Example:

• Parent → Child
• Department → Employee

Network Model

The Network Model extends the hierarchical model by allowing multiple
relationships between records, thus enabling more complex data relationships.
It uses a graph structure with records connected by pointers, which allows a
record to have multiple parents.

Example:

• Department ↔ Employee (with multiple connections)

Relational Model

The Relational Model organizes data into tables (relations), where each table
contains rows (records) and columns (attributes). This model uses keys
(primary and foreign) to establish relationships between tables. It is the most
popular model due to its simplicity and flexibility.

Example:

• Employee (ID, Name, DepartmentID)
• Department (DepartmentID, DepartmentName)
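
As a sketch, these two relations can be declared in SQL with a primary key in
each table and a foreign key linking them (the data types are assumptions):

CREATE TABLE Department (
    DepartmentID   INT PRIMARY KEY,
    DepartmentName VARCHAR(100)
);

CREATE TABLE Employee (
    ID           INT PRIMARY KEY,
    Name         VARCHAR(100),
    DepartmentID INT,
    FOREIGN KEY (DepartmentID) REFERENCES Department(DepartmentID)
);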

Comparison of Network, Hierarchical, and Relational Models

Feature            | Hierarchical Model | Network Model                   | Relational Model
Structure          | Tree-like          | Graph-like                      | Table-based
Data Relationships | One-to-many        | Many-to-many                    | Many-to-many, flexible
Flexibility        | Limited            | More flexible than hierarchical | Highly flexible
Ease of Use        | Complex to manage  | Complex to manage               | Easy to use, intuitive
Data Redundancy    | High               | Moderate                        | Low

Entity Relationship Model (ERM)

The Entity Relationship Model (ERM) is a high-level conceptual data model
that describes the entities within a domain and the relationships between
them. It uses:

• Entities: Real-world objects or concepts (e.g., Employee).
• Attributes: Properties of entities (e.g., Name, Age).
• Relationships: Describes how entities are related (e.g., An Employee
works in a Department).

The ER model is often represented through an ER diagram that visually depicts
entities, their attributes, and relationships.
UNIT-II

Relational Database

A relational database is a type of database that stores data in tables (also
called relations). Each table consists of rows (records) and columns
(attributes). Tables in a relational database are linked using keys, and
relationships between different tables are established through foreign keys.
This model is based on the relational model of data, introduced by Edgar
Codd.

Key characteristics of relational databases:

• Data Integrity: Ensures data consistency and accuracy.
• Normalization: Reduces redundancy by organizing data into multiple
related tables.
• Flexibility: Supports complex queries and dynamic updates.

Common examples of relational database management systems (RDBMS)
include MySQL, Oracle, Microsoft SQL Server, and PostgreSQL.

Relational Algebra and Calculus

Both relational algebra and relational calculus are formal languages used to
query and manipulate relational databases. They are foundational to SQL
operations and represent mathematical approaches to database queries.

1. Relational Algebra:
o Relational algebra is a procedural query language that uses
operators to perform operations on tables (relations).
o Key operators in relational algebra:
▪ Select (σ): Filters rows based on a condition (similar to the
SQL WHERE clause).
▪ Project (π): Selects specific columns from a table (similar to
the SQL SELECT clause).
▪ Union (∪): Combines the results of two queries with the
same attributes.
▪ Difference (−): Retrieves rows from one table that are not in
another.
▪ Cartesian Product (×): Combines two tables by creating all
possible pairs of rows.
▪ Join (⨝): Combines related rows from two tables based on
a common attribute.
2. Relational Calculus:
o Relational calculus is a declarative query language, focusing on
what to retrieve rather than how to retrieve it.
o There are two types:
▪ Tuple Relational Calculus (TRC): Uses variables that
represent tuples (rows).
▪ Domain Relational Calculus (DRC): Uses variables that
represent individual fields (columns).
o Queries are written in the form of logical expressions rather than
operations.
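
Each relational-algebra operator has a close counterpart in SQL. The sketch
below is illustrative only: it reuses the Employee and Department relations
from Unit I, and the RetiredEmployee table in the difference example is
hypothetical.

-- Select (σ): filter rows by a condition
SELECT * FROM Employee WHERE DepartmentID = 10;

-- Project (π): keep only specific columns
SELECT DISTINCT Name FROM Employee;

-- Union (∪): combine two compatible results
SELECT Name FROM Employee
UNION
SELECT DepartmentName FROM Department;

-- Difference (−): rows in one result but not the other (MINUS in Oracle)
SELECT ID FROM Employee
EXCEPT
SELECT ID FROM RetiredEmployee;

-- Cartesian Product (×): every pairing of rows from both tables
SELECT * FROM Employee CROSS JOIN Department;

-- Join (⨝): pair rows that share a common attribute value
SELECT E.Name, D.DepartmentName
FROM Employee E
JOIN Department D ON E.DepartmentID = D.DepartmentID;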

SQL Fundamentals

SQL (Structured Query Language) is the standard language used for managing
and manipulating relational databases. SQL is used to create, retrieve, update,
and delete data in a database. It is divided into several sublanguages:

1. DDL (Data Definition Language): Defines and manages the structure of
database objects.
o CREATE: Used to create database objects (tables, indexes, etc.).
o ALTER: Modifies the structure of an existing database object.
o DROP: Deletes an existing database object.
o TRUNCATE: Removes all rows from a table but does not remove
the table itself.
2. DML (Data Manipulation Language): Deals with data manipulation and
retrieval.
o SELECT: Retrieves data from a table.
o INSERT: Adds new records to a table.
o UPDATE: Modifies existing records.
o DELETE: Removes records from a table.
3. DCL (Data Control Language): Manages permissions and access control.
o GRANT: Grants specific privileges to users.
o REVOKE: Removes specific privileges from users.
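
A minimal sketch tying the three sublanguages together (the Customer table
and the report_user account are hypothetical names used only for
illustration):

-- DDL: define a table
CREATE TABLE Customer (
    Customer_ID INT PRIMARY KEY,
    Name        VARCHAR(100),
    City        VARCHAR(50)
);

-- DML: manipulate and retrieve its rows
INSERT INTO Customer (Customer_ID, Name, City) VALUES (1, 'Asha', 'Pune');
UPDATE Customer SET City = 'Mumbai' WHERE Customer_ID = 1;
SELECT Name, City FROM Customer WHERE City = 'Mumbai';
DELETE FROM Customer WHERE Customer_ID = 1;

-- DCL: control access
GRANT SELECT, INSERT ON Customer TO report_user;
REVOKE INSERT ON Customer FROM report_user;
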
PL/SQL Concepts

PL/SQL (Procedural Language/SQL) is Oracle's extension to SQL that allows for
procedural programming. PL/SQL enables users to write complex logic such as
loops, conditionals, and exception handling, which is not possible in standard
SQL.

Key concepts of PL/SQL:

• Blocks: PL/SQL code is written in anonymous or named blocks. A block
consists of declarations, executable commands, and exception handlers.
• Variables: PL/SQL allows for the use of variables to store temporary
data.
• Control Structures: PL/SQL supports loops, conditional statements (IF-
THEN-ELSE), and other control mechanisms.
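
A minimal anonymous-block sketch showing declarations, a loop, an IF test,
and an exception handler (the values are arbitrary):

DECLARE
    v_count   NUMBER := 0;
    v_message VARCHAR2(50);
BEGIN
    FOR i IN 1..5 LOOP
        v_count := v_count + i;          -- accumulate 1+2+3+4+5
    END LOOP;

    IF v_count > 10 THEN
        v_message := 'Count exceeds 10';
    ELSE
        v_message := 'Count is small';
    END IF;

    DBMS_OUTPUT.PUT_LINE(v_message || ': ' || v_count);
EXCEPTION
    WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE('Unexpected error: ' || SQLERRM);
END;
/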

Cursors

A cursor is a pointer to a result set of a query. It allows you to process
individual rows of a query result one at a time. Cursors are essential when
dealing with queries that return multiple rows and need to be processed
sequentially.

There are two types of cursors:

1. Implicit Cursor: Automatically created by the DBMS when a query is
executed (e.g., a SELECT INTO statement).
2. Explicit Cursor: Defined by the programmer to handle more complex
operations, giving control over the result set.

Cursors support the following operations:

• OPEN: Executes the query and allocates memory for the cursor.
• FETCH: Retrieves the next row from the cursor's result set.
• CLOSE: Releases the resources allocated to the cursor.
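
A sketch of an explicit cursor that walks through the Employee table row by
row (the table and its columns are assumed from the earlier examples):

DECLARE
    CURSOR emp_cur IS
        SELECT ID, Name FROM Employee;
    v_id   Employee.ID%TYPE;
    v_name Employee.Name%TYPE;
BEGIN
    OPEN emp_cur;                          -- run the query, allocate the cursor
    LOOP
        FETCH emp_cur INTO v_id, v_name;   -- retrieve the next row
        EXIT WHEN emp_cur%NOTFOUND;
        DBMS_OUTPUT.PUT_LINE(v_id || ' - ' || v_name);
    END LOOP;
    CLOSE emp_cur;                         -- release the cursor's resources
END;
/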

Stored Procedures
A stored procedure is a precompiled collection of SQL statements and logic
that is stored in the database. Stored procedures can be invoked by users or
applications to perform specific tasks, such as inserting data, updating records,
or handling business logic.

Benefits of stored procedures:

• Performance: Since they are precompiled, stored procedures execute
faster than individual SQL queries.
• Security: Stored procedures can limit direct access to underlying tables
and data.
• Reusability: Stored procedures can be reused across applications and
users.
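
A hypothetical PL/SQL procedure that encapsulates an insert into the
Employee table (the name and parameters are illustrative, not a prescribed
interface):

CREATE OR REPLACE PROCEDURE add_employee (
    p_id      IN Employee.ID%TYPE,
    p_name    IN Employee.Name%TYPE,
    p_dept_id IN Employee.DepartmentID%TYPE
) AS
BEGIN
    INSERT INTO Employee (ID, Name, DepartmentID)
    VALUES (p_id, p_name, p_dept_id);
END add_employee;
/

-- Invoking the procedure:
BEGIN
    add_employee(101, 'Ravi', 10);
END;
/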

Stored Functions

A stored function is similar to a stored procedure, but it is specifically designed
to return a single value. It is often used for calculations or querying values.

Key differences between stored procedures and stored functions:

• A stored procedure does not return a value (though it can use output
parameters).
• A stored function always returns a value (typically used in expressions,
SELECT queries, etc.).

Stored functions are commonly used to perform operations such as
aggregations or complex calculations, returning the result directly to the calling
program.
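
A hypothetical function returning a single value, here the number of
employees in a given department, which can then be used inside a SELECT:

CREATE OR REPLACE FUNCTION dept_employee_count (
    p_dept_id IN Employee.DepartmentID%TYPE
) RETURN NUMBER AS
    v_count NUMBER;
BEGIN
    SELECT COUNT(*) INTO v_count
    FROM Employee
    WHERE DepartmentID = p_dept_id;
    RETURN v_count;
END dept_employee_count;
/

-- Because it returns a value, the function can appear in an expression:
SELECT DepartmentID, dept_employee_count(DepartmentID) AS head_count
FROM Department;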

Database Triggers

A trigger is a set of SQL statements that are automatically executed or "fired"
when certain events occur in the database. Triggers are associated with a
particular table and can be set to activate before or after INSERT, UPDATE, or
DELETE operations.

Types of triggers:
1. BEFORE Trigger: Fires before the operation (e.g., before inserting a new
record).
2. AFTER Trigger: Fires after the operation (e.g., after deleting a record).
3. INSTEAD OF Trigger: Replaces the operation with a new action
(commonly used for views).

Triggers are useful for enforcing business rules, auditing database changes, and
maintaining data consistency automatically.
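
A sketch of an AFTER UPDATE trigger used for auditing; it assumes an
Employee table with ID and Salary columns and a Salary_Audit table invented
for this example:

CREATE TABLE Salary_Audit (
    Emp_ID     INT,
    Old_Salary DECIMAL(10,2),
    New_Salary DECIMAL(10,2),
    Changed_On DATE
);

CREATE OR REPLACE TRIGGER trg_salary_audit
AFTER UPDATE OF Salary ON Employee
FOR EACH ROW
BEGIN
    -- record the old and new values every time a salary changes
    INSERT INTO Salary_Audit (Emp_ID, Old_Salary, New_Salary, Changed_On)
    VALUES (:OLD.ID, :OLD.Salary, :NEW.Salary, SYSDATE);
END;
/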

UNIT-III

Introduction to Normalization

Normalization is the process of organizing data in a relational database to
reduce redundancy and improve data integrity. The goal of normalization is to
ensure that data is stored in such a way that it can be easily updated,
retrieved, and maintained without anomalies. This process involves dividing
large tables into smaller, more manageable ones and defining relationships
between them. By doing so, normalization eliminates unwanted dependencies
and ensures data consistency.

Normalization typically involves dividing a database into multiple tables based
on certain rules, which are defined as normal forms.

First Normal Form (1NF)

A table is in First Normal Form (1NF) if:

• All columns contain atomic (indivisible) values.
• Each column contains only one value per row (i.e., no repeating groups
or arrays).
• The order in which data is stored does not matter.

Example:

Student_ID Courses

1 Math, Science

2 English, History

This table is not in 1NF because the "Courses" column contains multiple values.
To convert it to 1NF, each course should be stored in a separate row:

Student_ID Course

1 Math

1 Science

2 English

2 History
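
As a minimal sketch, the 1NF version can be declared with a composite
primary key so that each row holds exactly one course per student (the table
name is invented for illustration):

CREATE TABLE Student_Course (
    Student_ID INT         NOT NULL,
    Course     VARCHAR(50) NOT NULL,
    PRIMARY KEY (Student_ID, Course)   -- one row per student/course pair
);

INSERT INTO Student_Course VALUES (1, 'Math');
INSERT INTO Student_Course VALUES (1, 'Science');
INSERT INTO Student_Course VALUES (2, 'English');
INSERT INTO Student_Course VALUES (2, 'History');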

Second Normal Form (2NF)

A table is in Second Normal Form (2NF) if:

• It is in 1NF.
• There is no partial dependency, i.e., all non-key attributes are fully
functionally dependent on the entire primary key (for tables with
composite keys).

In simple terms, 2NF eliminates partial dependencies, where a non-key column
depends on only part of a composite primary key.

Example: Consider the table:

Student_ID Course Instructor Instructor_Phone

1 Math Mr. Smith 1234567890

1 Science Mrs. Lee 0987654321

2 English Mr. Brown 1122334455

Here, Instructor_Phone depends only on Instructor, not the entire primary key
(Student_ID and Course). To convert it to 2NF, you split the table:
Students Table:

Student_ID Course Instructor

1 Math Mr. Smith

1 Science Mrs. Lee

2 English Mr. Brown

Instructors Table:

Instructor Instructor_Phone

Mr. Smith 1234567890

Mrs. Lee 0987654321

Mr. Brown 1122334455

Third Normal Form (3NF)

A table is in Third Normal Form (3NF) if:

• It is in 2NF.
• There is no transitive dependency, i.e., non-key attributes do not
depend on other non-key attributes.

In other words, each non-key attribute should depend only on the primary key
and not on any other non-key attribute.

Example: Consider the table:

Employee_ID Department Department_Manager Manager_Phone

1 HR Mr. Green 2345678901

2 IT Mrs. Brown 3456789012


Here, Manager_Phone depends on Department_Manager, not directly on the
primary key (Employee_ID). To convert this to 3NF, we remove the transitive
dependency:

Employees Table:

Employee_ID Department Department_Manager

1 HR Mr. Green

2 IT Mrs. Brown

Managers Table:

Department_Manager Manager_Phone

Mr. Green 2345678901

Mrs. Brown 3456789012

Dependency Preservation

Dependency Preservation refers to the property of a database schema where
all functional dependencies can still be enforced in the decomposed tables
without requiring joins. It ensures that after normalization, you can still
enforce the original dependencies directly within individual tables.

Boyce-Codd Normal Form (BCNF)

A table is in Boyce-Codd Normal Form (BCNF) if:

• It is in 3NF.
• For every functional dependency A → B, A must be a superkey.

In simpler terms, BCNF eliminates any situation where a non-superkey
attribute determines another attribute.

Example: Consider the table:


Student_ID Course Instructor

1 Math Mr. Smith

2 History Mrs. Lee

If we assume that each instructor teaches only one course (i.e., Instructor
determines Course), then the table is not in BCNF, because Instructor is not a
superkey yet it is the determinant of the dependency Instructor → Course. To
bring this table into BCNF, you decompose it into two tables:

Students Table:

Student_ID Instructor

1 Mr. Smith

2 Mrs. Lee

Courses Table:

Course Instructor

Math Mr. Smith

History Mrs. Lee

Multi-valued Dependencies and Fourth Normal Form (4NF)

A table is in Fourth Normal Form (4NF) if:

• It is in BCNF.
• It has no multi-valued dependencies.

A multi-valued dependency occurs when one attribute determines a set of
values for another attribute, and these values do not depend on each other.

Example: Consider the table:


Student_ID Course Language

1 Math English

1 Science French

Here, the Student_ID determines both Course and Language, but these two are
independent of each other. To bring this into 4NF, we split the table into two:

Courses Table:

Student_ID Course

1 Math

1 Science

Languages Table:

Student_ID Language

1 English

1 French

Join Dependencies and Fifth Normal Form (5NF)

A table is in Fifth Normal Form (5NF) if:

• It is in 4NF.
• It does not contain any join dependencies.

A join dependency occurs when a table can be reconstructed by joining
multiple smaller tables, but doing so would cause redundancy in the data. In
5NF, all such join dependencies must be eliminated.

Example: Consider the table:


Supplier_ID Part_ID Project_ID

1 A X

1 B X

2 A Y

In this case, the table can be split into three smaller tables to eliminate
redundancy:

Suppliers Table:

Supplier_ID Part_ID

1 A

1 B

2 A

Parts Table:

Part_ID Project_ID

A X

B X

A Y

Projects Table:

Project_ID Supplier_ID

X 1

Y 2
Domain-Key Normal Form (DKNF)

Domain-Key Normal Form (DKNF) is the highest normal form, where:

• The schema is free from all types of anomalies (functional, multivalued,
and join dependencies).
• All constraints are based on the domain of the data and the key
relationships.

In DKNF, the design ensures that there are no exceptions to the rules, and data
integrity is fully guaranteed based on domain and key constraints.

Summary of Normal Forms:

1. 1NF: Eliminates repeating groups; ensures atomicity.
2. 2NF: Removes partial dependencies (in tables with composite keys).
3. 3NF: Removes transitive dependencies.
4. BCNF: Ensures that every functional dependency has a superkey as its
determinant.
5. 4NF: Eliminates multi-valued dependencies.
6. 5NF: Eliminates join dependencies.
7. DKNF: Fully normalized without any exceptions or anomalies.

These normalization forms aim to organize data efficiently, reduce
redundancy, and maintain data integrity. Each successive form builds upon the
previous one to address more complex issues in database design.
UNIT-IV

Database Recovery

Database Recovery refers to the process of restoring a database to a correct
state after a failure, such as a system crash, power failure, or data corruption.
Recovery ensures that no data is lost, and the database remains consistent.

Key methods of database recovery:

• Backup and Restore: Regular backups of the database are made and
stored separately. In case of a failure, the backup is used to restore the
data.
• Transaction Logs: Databases maintain logs of all changes made to the
data. If a crash occurs, the system can use the transaction logs to roll
back or reapply changes to bring the database back to a consistent state.
• Checkpointing: A process that creates a consistent snapshot of the
database. This helps to quickly restore to a known good state.

Concurrency Management

Concurrency Management refers to handling multiple users or applications
accessing the database at the same time, without causing inconsistencies.
When several users modify the database simultaneously, concurrency control
ensures that data remains consistent and avoids conflicts.

Key concepts in concurrency management:

• Locking: The database uses locks to prevent two transactions from
modifying the same data simultaneously. There are different types of
locks (shared, exclusive) to control access to the data.
• Transaction Isolation: This ensures that transactions are executed in
isolation from each other. It avoids problems like dirty reads (reading
uncommitted data) and ensures the database remains consistent.
• Deadlock Handling: A deadlock occurs when two transactions are each
waiting for the other to release resources. The system needs to detect
such deadlocks and resolve them by cancelling one of the transactions.
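
A minimal sketch of pessimistic locking in SQL (the Account table and values
are hypothetical; exact syntax varies between systems):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

SELECT Balance
FROM Account
WHERE Account_ID = 42
FOR UPDATE;                  -- exclusive row lock held until COMMIT/ROLLBACK

UPDATE Account
SET Balance = Balance - 100
WHERE Account_ID = 42;

COMMIT;                      -- releases the lock for other transactions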
Database Security

Database Security involves protecting data from unauthorized access and
ensuring its confidentiality, integrity, and availability. Security measures are
applied to prevent unauthorized users from accessing or tampering with the
data.

Key aspects of database security:

• Authentication: Ensures only authorized users can access the database,
often via usernames and passwords.
• Authorization: Grants users specific rights or privileges, such as read,
write, or modify access to data.
• Encryption: Data is encrypted to ensure it is protected from
unauthorized access when stored or transmitted.
• Auditing: Tracks database activities to detect any suspicious or
unauthorized actions.

Database Integrity

Database Integrity ensures the accuracy and consistency of data in the
database. It involves enforcing rules that maintain the correctness of data over
time.

Types of database integrity:

• Entity Integrity: Ensures that each row in a table is uniquely identifiable.
This is usually done by assigning a primary key to each record.
• Referential Integrity: Ensures that relationships between tables are
consistent. For example, if a record in one table refers to a record in
another table (using foreign keys), the referenced record must exist.
• Domain Integrity: Ensures that the values in a column conform to a
specific data type, range, or format.
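
All three kinds of integrity can be expressed as declarative constraints; a
minimal sketch (table, column, and constraint choices are illustrative):

CREATE TABLE Department (
    Dept_ID   INT PRIMARY KEY,          -- entity integrity: unique, non-null key
    Dept_Name VARCHAR(100) NOT NULL
);

CREATE TABLE Employee (
    Emp_ID  INT PRIMARY KEY,                      -- entity integrity
    Age     INT CHECK (Age BETWEEN 18 AND 65),    -- domain integrity: allowed range
    Dept_ID INT,
    FOREIGN KEY (Dept_ID) REFERENCES Department(Dept_ID)  -- referential integrity
);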
Database Control

Database Control refers to the management of database operations to ensure
they run efficiently, securely, and reliably. This includes transaction
management, integrity checks, and user access control.

Key components:

• Transaction Management: Ensures that transactions (a set of database
operations) are executed in a way that maintains data integrity, and
either fully complete (commit) or do not alter the data (rollback) in case
of failure.
• Locking and Isolation: Ensures that multiple transactions can run
simultaneously without conflicting with each other.
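
An illustrative transaction in which both updates take effect together or not
at all (the Account table and amounts are hypothetical; the statement that
starts a transaction varies by system, e.g. BEGIN or START TRANSACTION):

BEGIN;

UPDATE Account SET Balance = Balance - 500 WHERE Account_ID = 1;
UPDATE Account SET Balance = Balance + 500 WHERE Account_ID = 2;

-- If any step fails, undo everything instead of committing:
-- ROLLBACK;

COMMIT;   -- make both changes permanent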

Structure of a Distributed Database

A Distributed Database is a collection of data stored across multiple physical
locations but managed as a single database. The data is distributed over
several sites, each of which can access and modify the data independently.

Key features of a distributed database:

• Data Distribution: Data is divided and stored across multiple servers or
locations. It could be partitioned horizontally (rows) or vertically
(columns).
• Replication: Copies of data are maintained across different locations for
availability and fault tolerance.
• Autonomy: Each site in the distributed database system may operate
independently, with its own local database.

Design of Distributed Databases

Designing a Distributed Database involves deciding how to distribute and
manage data across multiple sites. The goal is to ensure that the system is
efficient, fault-tolerant, and able to provide high availability.

Key aspects of distributed database design:


• Fragmentation: Deciding how to divide data into smaller pieces
(fragments) and how to distribute them across different sites.
Fragmentation can be horizontal (data rows) or vertical (data columns).
• Replication: Determining how data should be copied across multiple
sites to improve availability and fault tolerance.
• Location Transparency: Users should not need to know the exact
location of data. The system should handle data retrieval automatically.
• Consistency and Synchronization: Ensuring that all copies of the data
are updated consistently and that users always see the most current
version of the data.
• Concurrency Control: Managing simultaneous access to data by multiple
users, ensuring there are no conflicts when data is modified across
different sites.

Summary

1. Database Recovery: Ensures the database can be restored after failures.
2. Concurrency Management: Manages simultaneous database access by
multiple users.
3. Database Security: Protects data from unauthorized access.
4. Database Integrity: Maintains the accuracy and consistency of data.
5. Database Control: Manages database operations to ensure reliability
and security.
6. Distributed Database: A database distributed across multiple locations.
7. Design of Distributed Databases: Planning how to distribute, replicate,
and manage data efficiently across multiple sites.

These concepts ensure that databases are reliable, secure, and efficient,
whether it's a centralized system or a distributed one across different
locations.
