Advanced DBMS

The document outlines various units related to relational database design and management, focusing on PostgreSQL and its procedural language PL/pgSQL. It covers essential concepts such as transaction management, concurrency control, crash recovery, database security, and system architectures. Each unit provides detailed insights into the principles, methods, and mechanisms that ensure data integrity, consistency, and security in database systems.
Unit-I: Relational Database Design & PL/pgSQL
This unit focuses on the structured way data is organized and managed, particularly within the PostgreSQL database system and its procedural language.
• Relational Database Design: Organizing data into tables (relations) with rows and columns, and establishing relationships between entities using primary and foreign keys to maintain data integrity and avoid redundancy. Key principles include clear table definitions, logical relationships, and a well-defined purpose for each table (Source 1.1).
• PL/PostgreSQL (PL/pgSQL): PostgreSQL's native procedural language, which extends SQL's functionality with procedural programming constructs:
  • Language structure: Defining blocks of code for functions, procedures, and triggers (Source 1.2, 1.3).
  • Control of program flow: Conditional statements (IF, ELSIF, ELSE) and loops (LOOP, FOR, WHILE) allow complex logic and iteration within database operations (Source 1.3).
  • Views: Virtual tables based on the result set of a SQL query, used to simplify complex queries and restrict data access.
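For instance, a minimal sketch of a view that hides a sensitive column. The employees table, its columns, and the view name are hypothetical, used only for illustration:

    -- Hypothetical base table.
    CREATE TABLE employees (
        emp_id serial PRIMARY KEY,
        name   text NOT NULL,
        dept   text,
        salary numeric
    );

    -- The view exposes only non-sensitive columns; users can be granted
    -- SELECT on the view without ever seeing employees.salary.
    CREATE VIEW public_employees AS
    SELECT emp_id, name, dept
    FROM employees;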
  • Functions and Procedures: Reusable code blocks encapsulating specific tasks. Functions typically return a value, while procedures are designed for side effects (modifying data) and don't necessarily return a value.
  • Handling errors and exceptions: Mechanisms (BEGIN...EXCEPTION...END) to gracefully manage runtime errors and ensure data integrity (Source 1.3).
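A minimal sketch combining control flow and exception handling in a PL/pgSQL function, reusing the hypothetical employees table above; the function name and salary thresholds are illustrative assumptions:

    CREATE FUNCTION classify_salary(p_emp_id integer)
    RETURNS text
    LANGUAGE plpgsql
    AS $$
    DECLARE
        v_salary numeric;
    BEGIN
        -- INTO STRICT raises NO_DATA_FOUND if no row matches.
        SELECT salary INTO STRICT v_salary
        FROM employees
        WHERE emp_id = p_emp_id;

        IF v_salary >= 100000 THEN
            RETURN 'high';
        ELSIF v_salary >= 50000 THEN
            RETURN 'medium';
        ELSE
            RETURN 'low';
        END IF;
    EXCEPTION
        WHEN NO_DATA_FOUND THEN
            RETURN 'no such employee';
    END;
    $$;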
  • Cursors: Used to process a result set one row at a time, often within loops, for operations that cannot be achieved with a single SQL query (Source 1.3).
• Triggers: Special stored procedures that automatically execute when a specific event (such as INSERT, UPDATE, or DELETE) occurs on a table (Source 1.2, 1.3).
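A minimal sketch of a row-level audit trigger on the hypothetical employees table; the audit_log table and function name are illustrative assumptions:

    CREATE TABLE audit_log (
        table_name text,
        operation  text,
        changed_at timestamptz DEFAULT now()
    );

    -- Trigger function: TG_TABLE_NAME and TG_OP are built-in trigger variables.
    CREATE FUNCTION log_change() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
    BEGIN
        INSERT INTO audit_log (table_name, operation)
        VALUES (TG_TABLE_NAME, TG_OP);
        RETURN NULL;  -- return value is ignored for AFTER triggers
    END;
    $$;

    CREATE TRIGGER employees_audit
        AFTER INSERT OR UPDATE OR DELETE ON employees
        FOR EACH ROW EXECUTE FUNCTION log_change();  -- EXECUTE PROCEDURE before PostgreSQL 11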
• Packages: Groups of related functions, procedures, variables, and other constructs. Packages are a distinct feature of Oracle PL/SQL; PostgreSQL's PL/pgSQL has no direct equivalent, but the related concepts are handled through schema organization of functions and procedures.
Unit-II: Transaction Concepts
This unit covers the fundamental principles that ensure data integrity and reliability in a multi-user database environment.
• Transaction: A logical unit of work performed against a database, treated as a single, indivisible operation. It comprises one or more data manipulation statements (e.g., CREATE, READ, UPDATE, DELETE) and queries (Source 2.1, 2.2).
• Properties of Transactions (ACID Properties): These are crucial for maintaining data integrity (illustrated in the sketch after this list):
  • Atomicity: All operations within a transaction either complete successfully or none of them do ("all or nothing") (Source 2.2, 2.3).
  • Consistency: A transaction brings the database from one valid, consistent state to another, ensuring all integrity rules are maintained (Source 2.2, 2.3).
  • Isolation: Concurrent transactions execute independently without interfering with each other; the effects of an incomplete transaction are not visible to other transactions (Source 2.2, 2.3).
  • Durability: Once a transaction is committed, its changes are permanently stored and survive system failures (Source 2.2, 2.3).
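A minimal sketch of atomicity and durability in SQL, assuming a hypothetical accounts table: the transfer commits as a whole, or a rollback leaves no trace.

    CREATE TABLE accounts (
        acct_id integer PRIMARY KEY,
        balance numeric NOT NULL
    );

    -- Both updates succeed together or not at all.
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE acct_id = 2;
    COMMIT;   -- the changes are now durable

    -- If anything goes wrong before COMMIT, ROLLBACK undoes every change.
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1;
    ROLLBACK; -- account 1 is left untouched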
• States of Transactions: A transaction progresses through various states:
  • Active: The initial state, where the transaction is being executed.
  • Partially Committed: After the final operation is executed.
  • Committed: All operations are successful, and the changes are permanent.
  • Failed: An error occurred, and the transaction cannot proceed.
  • Aborted: The transaction is rolled back, undoing all its changes (Source 2.2).
• Concurrent Execution of Transactions and Conflicting Operations: When multiple transactions run simultaneously, they might try to access or modify the same data, leading to conflicts (e.g., lost updates, dirty reads, non-repeatable reads).
• Schedules, Types of Schedules: A schedule represents the chronological order of operations from various interleaved transactions. Different types (e.g., serial, non-serial) describe how transactions' operations are interleaved.
• Concept of Serializability: The goal of concurrency control. A concurrent schedule is serializable if its effect is equivalent to some serial schedule (where transactions execute one after another without interleaving), ensuring correctness despite concurrency (Source 2.3, 3.2).
• Precedence Graph for Serializability: A directed graph used to check for serializability: each transaction is a node, and an edge Ti → Tj is added when an operation of Ti conflicts with and precedes an operation of Tj (for example, T1 reads X before T2 writes X). If the graph contains no cycles, the schedule is serializable (Source 2.3).
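In PostgreSQL, serializable behavior can be requested directly. A minimal sketch, reusing the hypothetical accounts table; this shows the interface only, since the serializability checking itself runs inside the engine:

    -- Ask for a serializable execution. If the engine detects an
    -- interleaving with no equivalent serial order, one transaction is
    -- aborted with SQLSTATE 40001 (serialization_failure) and should be
    -- retried by the application.
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT sum(balance) FROM accounts;
    UPDATE accounts SET balance = balance + 10 WHERE acct_id = 1;
    COMMIT;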
Unit-III: Concurrency Control
This unit explores methods for managing simultaneous transactions to ensure data consistency and integrity.
• Ensuring Serializability by Locks: Locks are mechanisms that control access to data items, ensuring mutual exclusion.
• Different Lock Modes:
  • Shared Lock (S-lock/Read Lock): Allows multiple transactions to read the same data concurrently but prevents any writes.
  • Exclusive Lock (X-lock/Write Lock): Allows only one transaction to read and write a data item, preventing any other access (read or write) (Source 3.3, 3.4).
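A minimal sketch of how these modes surface in PostgreSQL, reusing the hypothetical accounts table; explicit row and table locks are shown:

    -- Shared row lock: other transactions may also read, none may modify.
    BEGIN;
    SELECT * FROM accounts WHERE acct_id = 1 FOR SHARE;
    COMMIT;

    -- Exclusive row lock: blocks concurrent writers and other FOR UPDATE.
    BEGIN;
    SELECT * FROM accounts WHERE acct_id = 1 FOR UPDATE;
    UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1;
    COMMIT;

    -- A table-level lock can be taken explicitly when needed.
    BEGIN;
    LOCK TABLE accounts IN ACCESS EXCLUSIVE MODE;  -- blocks all concurrent use
    COMMIT;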
• Two-Phase Locking (2PL) and its Variations: A protocol that ensures serializability by operating in two phases:
  • Growing Phase: A transaction can acquire new locks but cannot release any.
  • Shrinking Phase: A transaction can release locks but cannot acquire any new ones (Source 3.3, 3.4).
  Variations like Strict 2PL ensure even stronger isolation by holding all exclusive locks until commit (Source 2.3); see the sketch after the next item.
• Multiple Granularity Locking Protocol: Allows locking at different levels of data granularity (e.g., entire database, table, page, record). This improves concurrency by permitting more specific locking (e.g., locking only a single row instead of the entire table).
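A sketch of strict 2PL as PostgreSQL exhibits it for row locks: locks acquired during a transaction are held until commit, at which point they are all released together. The accounts table is the hypothetical one from earlier.

    BEGIN;
    SELECT * FROM accounts WHERE acct_id = 1 FOR UPDATE;   -- growing phase
    UPDATE accounts SET balance = balance - 50 WHERE acct_id = 1;
    -- No lock can be released mid-transaction; the shrinking phase
    -- collapses to the commit point.
    COMMIT;                                                -- all locks released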
• Basic Timestamp Method for Concurrency: Assigns a unique timestamp to each transaction. The system processes operations based on these timestamps to maintain consistency, rejecting operations that violate timestamp order (Source 3.2, 3.3).
• Thomas Write Rule: A modification to the timestamp ordering protocol that allows certain "obsolete" write operations to be ignored, potentially increasing concurrency.
• Dynamic Database Concurrency (Phantom Problem): Occurs when a transaction reads a set of rows, another transaction inserts new rows that meet the criteria of the first transaction's read, and the first transaction then re-reads and sees "phantom" new rows. Concurrency control mechanisms such as stricter isolation levels (e.g., Serializable) or specific locking prevent this.
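A sketch of the phantom scenario as two interleaved sessions, assuming a hypothetical orders table. In PostgreSQL, the snapshot taken at REPEATABLE READ (and at SERIALIZABLE) already keeps the phantom invisible; under READ COMMITTED the second count would differ.

    -- Session A:
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT count(*) FROM orders WHERE amount > 100;

    -- Session B, concurrently:
    --   INSERT INTO orders (amount) VALUES (500);
    --   COMMIT;

    -- Session A again: same snapshot, so no phantom row appears here.
    SELECT count(*) FROM orders WHERE amount > 100;
    COMMIT;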
• Timestamps versus Locking: The two primary approaches to concurrency control, each with pros and cons regarding implementation complexity, performance, and deadlock potential.
• Optimistic Concurrency Control Algorithm: Assumes conflicts are rare. Transactions execute without locks, validate their changes at commit time to check for conflicts, and roll back if a conflict is detected. This can offer higher concurrency but risks aborting transactions (Source 3.1).
• Multi-version Concurrency Control (MVCC): Allows multiple versions of a data item to exist simultaneously. When a transaction reads data, it sees a consistent snapshot of the database, often the version that was current when the transaction started. This avoids read-write conflicts and deadlocks by letting readers avoid blocking writers and writers avoid blocking readers (Source 3.2, 3.4). PostgreSQL uses MVCC extensively.
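A minimal sketch of MVCC bookkeeping in PostgreSQL: every row version carries the system columns xmin (the transaction that created it) and xmax (the transaction that deleted or superseded it, 0 if none), and an UPDATE creates a new version rather than overwriting in place. The accounts table is the hypothetical one from earlier.

    -- Inspect the version metadata PostgreSQL keeps per row.
    SELECT xmin, xmax, acct_id, balance
    FROM accounts
    WHERE acct_id = 1;

    -- An UPDATE stamps a fresh row version with the current transaction ID.
    BEGIN;
    SELECT txid_current();   -- this transaction's ID
    UPDATE accounts SET balance = balance + 1 WHERE acct_id = 1;
    COMMIT;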
• Deadlock Handling Methods:
  • Detection and Recovery (Wait-for Graph): Allows deadlocks to occur, then detects them using a wait-for graph (nodes are transactions; edges indicate "waits for"). Once a cycle is detected, one or more transactions are chosen as victims and aborted/rolled back to break the deadlock (Source 3.4).
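PostgreSQL performs this detection automatically: after deadlock_timeout elapses it searches for a cycle among waiting transactions and aborts one victim with SQLSTATE 40P01 (deadlock_detected). A minimal sketch for inspecting who is blocking whom:

    -- Sessions currently waiting on locks, and the PIDs blocking them.
    SELECT pid,
           pg_blocking_pids(pid) AS blocked_by,
           query
    FROM pg_stat_activity
    WHERE cardinality(pg_blocking_pids(pid)) > 0;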
• Prevention Algorithms (Wound-Wait, Wait-Die): Strategies that ensure deadlocks never occur by imposing rules on how transactions acquire locks.
  • Wound-Wait: Older transactions "wound" (force the rollback of) younger transactions that hold needed resources.
  • Wait-Die: Older transactions wait for younger transactions; younger transactions "die" (roll back) if they request a resource held by an older transaction.
Unit-IV: Crash Recovery
This unit focuses on mechanisms for restoring the database to a consistent state after a system failure.
• Transaction Failure Classification: Failures can be due to:
  • Transaction Errors: Logical errors within the transaction.
  • System Errors: Software errors, operating system failures.
  • Hardware Failures: Disk crashes, power failures (Source 4.1).
  • Catastrophic Failures: Loss of the entire storage.
• Recovery Concepts: The goal is to bring the database back to a consistent state after a crash by undoing incomplete transactions and redoing committed transactions whose changes might not have been fully written to disk (Source 4.1, 4.2, 4.3).
• Checkpoints: Periodic operations in which the DBMS flushes all modified in-memory data pages to disk and records a special "checkpoint" entry in the log. This reduces the amount of log data that must be replayed during recovery (Source 4.3, 4.4).
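A minimal sketch of the corresponding PostgreSQL machinery (superuser-level commands, shown purely for illustration):

    CHECKPOINT;                    -- force an immediate checkpoint
    SELECT pg_current_wal_lsn();   -- current position in the write-ahead log
    SHOW checkpoint_timeout;       -- interval between automatic checkpoints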
• Recovery with Concurrent Transactions (Rollback, Checkpoints, Commit): Recovery procedures must account for transactions that were active or partially committed at the time of the crash, using logs and checkpoints to determine what to roll back (undo uncommitted changes) and what to redo (ensure committed changes are persistent).
• Log-Based Recovery Techniques (Deferred and Immediate Update):
  • Log: A transaction log (also called the redo log or write-ahead log, WAL) is a sequential record of all database modifications. It contains sufficient information to undo or redo any transaction (Source 4.3, 4.4).
  • Deferred Update (No-Undo/Redo): All changes are temporarily stored in local buffers and written to the database only after the transaction commits. If a crash occurs before commit, no undo is needed; redo is required for committed transactions.
  • Immediate Update (Undo/Redo): Changes are written to the database disk as they occur during the transaction. This requires both undo (for aborted/uncommitted transactions) and redo (for committed transactions that might not have been fully written to disk) (Source 4.4).
• Buffer Management: The DBMS uses a buffer pool (memory cache) to hold data pages. Recovery interacts with how data is moved between the buffers and disk.
• Shadow Paging: A recovery technique that maintains two page tables (current and shadow). Updates are applied to a new set of pages, and only after commit is the shadow page table updated to become the current one, making the changes visible. This simplifies recovery, since no undo/redo is needed for committed transactions.
Unit-V: Database Security
This unit covers the measures and controls used to protect the database from unauthorized access, misuse, and data corruption.
• Introduction to Database Security Concepts: Focuses on preserving the Confidentiality, Integrity, and Availability (CIA triad) of data (Source 5.1, 5.2). It involves protecting the data, the DBMS, associated applications, and the underlying infrastructure.
• Methods for Database Security:
  • Authentication: Verifying the identity of users accessing the database (Source 5.2).
  • Authorization: Granting specific privileges or permissions to authenticated users, determining what data they can access and what operations they can perform (Source 5.2).
  • Auditing: Recording all activities (logins, data access, modifications) to track usage and detect suspicious behavior (Source 5.1, 5.2).
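A minimal sketch of statement-level auditing via PostgreSQL's server configuration (requires superuser privileges; 'mod' logs data-modifying statements to the server log):

    ALTER SYSTEM SET log_statement = 'mod';
    SELECT pg_reload_conf();   -- apply the new setting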
  • Encryption: Protecting data confidentiality by transforming it into an unreadable format, both at rest (stored on disk) and in transit (during transmission) (Source 5.1, 5.2).
  • Backup Security: Ensuring that backups are as secure as the live database.
• Discretionary Access Control (DAC) Method: The owner of a data object can grant and revoke access privileges to other users. It is "discretionary" because the owner has discretion over who accesses their objects.
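A minimal sketch of DAC in SQL, reusing the hypothetical employees and accounts tables; the role name is illustrative:

    CREATE ROLE analyst LOGIN;
    GRANT SELECT ON employees TO analyst;    -- the owner grants a privilege
    GRANT SELECT, UPDATE ON accounts TO analyst;
    REVOKE UPDATE ON accounts FROM analyst;  -- and may revoke it again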
• Mandatory Access Control (MAC) and Role-Based Access Control (RBAC) for Multilevel Security:
  • Mandatory Access Control (MAC): A more rigid access control model in which access decisions are based on security labels (sensitivity levels of data and clearance levels of users). Access is granted only if the user's clearance matches or exceeds the data's sensitivity. Often used in highly secure environments.
  • Role-Based Access Control (RBAC): Users are assigned roles, and permissions are granted to roles; users inherit the permissions of the roles they are assigned. This simplifies security administration, especially in large organizations (Source 5.2).
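A minimal sketch of RBAC in PostgreSQL, where privileges are granted to a group role and inherited through membership (role names are illustrative):

    CREATE ROLE reporting;                   -- a group role holding privileges
    GRANT SELECT ON employees TO reporting;

    CREATE ROLE alice LOGIN;
    GRANT reporting TO alice;                -- alice inherits reporting's privileges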
  • Multilevel Security: Designing systems to handle data with different sensitivity levels and users with different clearance levels simultaneously, ensuring that users can only access data for which they have appropriate clearance.
• Overview of Encryption Techniques for Security: Discusses symmetric and asymmetric encryption, hashing, and their application in protecting sensitive data in databases.
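A minimal sketch using PostgreSQL's pgcrypto extension for hashing and symmetric encryption; the values and inline passphrase are purely illustrative (real deployments keep keys outside the database):

    CREATE EXTENSION IF NOT EXISTS pgcrypto;

    -- One-way (bcrypt) hash, e.g. for password storage.
    SELECT crypt('s3cret', gen_salt('bf'));

    -- Symmetric encryption and decryption of a sensitive value.
    SELECT pgp_sym_decrypt(
               pgp_sym_encrypt('4111-1111-1111-1111', 'passphrase'),
               'passphrase');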
• Statistical Database Security: Deals with protecting individual privacy while allowing statistical queries on a database. It aims to prevent inferring sensitive individual data from aggregate statistics.
Unit-VI: Database System Architectures
This unit explores how database systems are structured and deployed, from single-server setups to complex distributed environments.
• Centralized and Client-Server Architectures:
  • Centralized Architecture: All database components (data, DBMS, application logic, user interface) reside on a single machine. Common for small-scale applications or development.
  • Client-Server Architecture (2-Tier): The database resides on a server, and client applications on separate machines interact with it. The client handles the user interface and some application logic, while the server handles data management. This improves scalability and resource utilization (Source 6.1, 6.2).
  • 3-Tier Architecture: Adds a middle (application) tier between the client and the database server. The client (presentation tier) interacts with the application tier, which then communicates with the database tier. This provides better scalability, security, and maintainability by decoupling concerns (Source 6.1, 6.2).
• Server System Architectures: Refers to the internal structure of the database server, including components such as the query processor, storage manager, and transaction manager (Source 6.3).
• Introduction to Parallel Systems: Database systems designed to utilize multiple CPUs or I/O devices to execute operations concurrently, improving performance for large datasets and complex queries. They can follow shared-memory, shared-disk, or shared-nothing architectures.
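A minimal sketch of intra-query parallelism in PostgreSQL (shared-memory parallelism within a single server; the table is the hypothetical accounts table from earlier):

    SET max_parallel_workers_per_gather = 4;
    EXPLAIN SELECT count(*) FROM accounts;
    -- For a sufficiently large table the plan shows a Gather node with
    -- parallel workers scanning the table concurrently.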
• Introduction to Distributed Systems: Database systems in which data is stored across multiple interconnected computers (nodes) in a network. This offers advantages such as increased availability, scalability, and local autonomy. Data can be fragmented and replicated across nodes.
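A sketch of one PostgreSQL building block for distributed access, the postgres_fdw extension, which exposes a table on a remote node as a local foreign table; the host, credentials, and table names are hypothetical:

    CREATE EXTENSION IF NOT EXISTS postgres_fdw;

    CREATE SERVER remote_node
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'node2.example.com', dbname 'app');

    CREATE USER MAPPING FOR CURRENT_USER
        SERVER remote_node
        OPTIONS (user 'app_user', password 'secret');

    CREATE FOREIGN TABLE remote_orders (
        order_id integer,
        amount   numeric
    ) SERVER remote_node OPTIONS (table_name 'orders');

    SELECT * FROM remote_orders WHERE amount > 100;  -- executed on node2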
• Introduction to Object-Based Databases: Database systems that model data as objects, similar to object-oriented programming. They allow for complex data types, inheritance, and encapsulation, going beyond the traditional relational model.