Dbms Short Notes Compressed

The document discusses database design concepts, including limitations of file systems, relational model terminology, functional dependencies, normalization, and entity-relationship models. It outlines various normal forms (1NF, 2NF, 3NF, BCNF) and their significance in eliminating anomalies and ensuring data integrity. Additionally, it covers relational algebra operations and their applications in SQL, emphasizing the importance of keys and relationships in database design.


DBMS

GATE फर्रे

Module 1. Database design

Limitations of File System (Short Notes)


No Data Abstraction: Users must manage physical data access manually (no data independence).
Not Scalable: Suitable only for small datasets; inefficient for large databases.
No Concurrency Control: Cannot handle multiple users or concurrent transactions safely.
Single-User Access: Lacks multi-user support; no in-built locking or transaction management.
Data Redundancy & Inconsistency: No central control over data integrity or duplication.

Relational Model Terminology

Tuple : A tuple is a single row in a relation (table), representing a single record in that table.

Attribute : An attribute refers to a column in a relation. It defines a property or characteristic of the relation.

Domain : The domain of an attribute is the set of all possible values that the attribute can take. It defines the data type and constraints of the attribute.

Degree : The degree of a relation is the total number of attributes (columns) it contains.

Cardinality : The cardinality of a relation is the total number of tuples (rows) present in it.

Relation Instance : A relation instance is the specific set of tuples that exist in a relation at a particular moment. It is a finite set that changes over time.

Relation Schema : A relation schema describes the structure of a relation, including its name and the names and data types of its attributes.

Functional dependencies & Normalization

X → Y, read "X determines Y": tuples that agree on X must also agree on Y.

1. Trivial FD : Always valid; X → Y is trivial iff X ⊇ Y. Ex: AB → A.
2. Non-trivial FD : X → Y is non-trivial if Y ⊄ X, and completely non-trivial if X ∩ Y = Φ. Ex: A → B.

Armstrong's Axioms / Inference Rules

Armstrong's axioms (or inference rules) provide a sound and complete set of rules for reasoning about functional dependencies (FDs) in relational database design.

Let X, Y, Z, W be sets of attributes. The first three rules are the fundamental axioms; the remaining rules can be derived from them:

Rule Name           Rule
Reflexivity         If X ⊇ Y, then X → Y
Augmentation        If X → Y, then XZ → YZ
Transitivity        If X → Y and Y → Z, then X → Z
Decomposition       If X → YZ, then X → Y and X → Z
Union               If X → Y and X → Z, then X → YZ
Pseudotransitivity  If X → Y and WY → Z, then WX → Z

Attribute closure [X]+ : Let X be an attribute set of relation R. The set of all attributes that are logically (functionally) determined by X is called the attribute closure of X, written [X]+.
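The closure computation above can be sketched as a small fixed-point loop; a minimal sketch, with the FD encoding (pairs of attribute sets) being an assumption of this example:

```python
# Attribute closure [X]+, a minimal sketch. FDs are encoded here as
# (lhs, rhs) pairs of sets; the encoding is illustrative, not from the notes.

def closure(attrs, fds):
    """Return [X]+ : all attributes functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the closure so far covers the LHS, the RHS is determined too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# R(ABCDE) with AB -> C, C -> D, B -> EA (example used later in the notes)
fds = [({'A', 'B'}, {'C'}), ({'C'}, {'D'}), ({'B'}, {'E', 'A'})]
print(sorted(closure({'B'}, fds)))   # ['A', 'B', 'C', 'D', 'E'] : B+ covers R
print(sorted(closure({'A'}, fds)))   # ['A'] : A determines nothing else
```

Since B+ equals the full attribute set, B is a super key (and, being a single attribute, a candidate key).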


Key Concepts

Primary key: A randomly chosen candidate key designated as the primary key; it is Unique and Not Null.

Alternate Key: All candidate keys other than the primary key are alternate keys.

Super key: If the attribute closure [X]+ of an attribute set X determines all attributes of relation R, then X is a super key. Any superset of a candidate key is a super key of the relation.
Example: consider Emp-id as a candidate key in a relation Employee. Then {Emp-id, Emp-name} is a super key.

Candidate key: A minimal super key is a candidate key.
R(ABCDE) [AB → C, C → D, B → EA]
[AB]+ = [ABCDE], therefore AB is a super key.
[A]+ = [A]
[B]+ = [ABCDE], therefore B is a candidate key (minimal).

Key / Prime attribute: An attribute that belongs to at least one candidate key.
Non-key / Non-prime attribute: An attribute that does not belong to any candidate key.
For the example above: Key/prime attribute = [B]; Non-key/non-prime attributes = [A, C, D, E].
Note: To find a candidate key, first find a super key, then check its minimality (no proper subset is itself a super key).

Finding multiple candidate keys: First find any one candidate key of relation R; the attributes present in that candidate key are the key/prime attributes. If X → {prime/key attribute} for some X, then multiple candidate keys are possible.

Membership test
Let F be a given set of functional dependencies. A functional dependency X → Y is said to be a member of F+ (i.e., logically implied by F) iff
X → Y ∈ F+ ⟺ Y ⊆ X+
where X+ is the attribute closure of X with respect to the set F.

Equality between two FD sets
Let F and G be two FD sets. F covers G iff every FD in G can be inferred from F.

Relation        F Covers G   G Covers F
F+ ⊃ G+         True         False
G+ ⊃ F+         False        True
F+ = G+         True         True
Incomparable    False        False

Minimal / Canonical cover: A set of FDs may contain redundant dependencies that can be inferred from the others; the minimal cover eliminates redundant FDs.
Procedure to find a canonical cover:
1. Split the FDs so that each RHS contains a single attribute. Ex: A → BC becomes A → B, A → C.
2. Find redundant attributes on the LHS and delete them.
3. Find redundant FDs and delete them from the FD set. Ex: {A → B, B → C, A → C} becomes {A → B, B → C}.
Note: The minimal cover may or may not be unique, i.e. a set of FDs can have more than one.

Counting super keys: Let R be a relational schema with n attributes A1, A2, …, An. If A1 alone is the only candidate key, the number of super keys is 2^(n−1) (every subset containing A1). With two single-attribute candidate keys A1 and A2, by inclusion–exclusion the count is 2^(n−1) + 2^(n−1) − 2^(n−2).
Maximum number of candidate keys = C(n, ⌊n/2⌋), where n is the number of attributes.

Normalization

Why Normalization?
To eliminate the following anomalies:
Insertion Anomalies
Updation Anomalies
Deletion Anomalies
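The candidate-key procedure above (find super keys via closure, then keep only minimal ones) can be brute-forced for small schemas; this is a sketch, fine for hand-sized examples but exponential in the number of attributes:

```python
# Candidate-key search by minimality check; a small illustrative sketch.
from itertools import combinations

def closure(attrs, fds):
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Smallest attribute sets whose closure covers the whole relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if closure(set(combo), fds) == set(all_attrs):
                # Keep only minimal super keys (no already-found key inside).
                if not any(set(k) <= set(combo) for k in keys):
                    keys.append(combo)
    return keys

R = {'A', 'B', 'C', 'D', 'E'}
fds = [({'A', 'B'}, {'C'}), ({'C'}, {'D'}), ({'B'}, {'E', 'A'})]
print(candidate_keys(R, fds))   # [('B',)] : B alone is the candidate key
```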

Properties of Decomposition


Lossless decomposition:
i) If (R₁ ⋈ R₂ ⋈ R₃ ⋈ ... ⋈ Rₙ) = R, then it is a lossless-join decomposition.
ii) If (R₁ ⋈ R₂ ⋈ ... ⋈ Rₙ) ⊃ R, then it is lossy.

Dependency-preserving decomposition:
i) If {F₁ ∪ F₂ ∪ F₃ ... ∪ Fₙ}+ = F+, then the decomposition is dependency preserving.
ii) If {F₁ ∪ F₂ ∪ F₃ ... ∪ Fₙ}+ ⊂ F+, then it is not dependency preserving.

Normal Forms

First Normal Form (1NF)
● A relation is in First Normal Form (1NF) if it does not contain any multivalued or composite attributes.
● All attributes must contain atomic (indivisible) values.
● By default, RDBMS relations are in 1NF.
Example: If an attribute contains a list like {Math, English}, the relation is not in 1NF. Instead, split it into multiple rows or use separate records.

Second Normal Form (2NF)
The second normal form (2NF) is based on the concept of full functional dependency. A functional dependency X → Y is a full functional dependency if removal of any attribute A from X means that the dependency no longer holds; that is, for any attribute A ∈ X, (X − {A}) does not functionally determine Y. A functional dependency X → Y is a partial dependency if some attribute A ∈ X can be removed from X and the dependency still holds; that is, for some A ∈ X, (X − {A}) → Y. In Figure 15.3(b), {Ssn, Pnumber} → Hours is a full dependency (neither Ssn → Hours nor Pnumber → Hours holds). However, the dependency {Ssn, Pnumber} → Ename is partial because Ssn → Ename holds.
Definition: A relation schema R is in 2NF if every nonprime attribute A in R is fully functionally dependent on the primary key of R.

Third Normal Form (3NF)
● A relation is in Third Normal Form (3NF) if:
1. It is in Second Normal Form (2NF), and
2. For every non-trivial functional dependency X → Y, at least one of the following holds:
   Either X is a superkey, or
   Y is a prime attribute (i.e., part of some candidate key).

Why 3NF?
Eliminates transitive dependencies.
Allows some redundancy in order to guarantee dependency preservation and lossless decomposition.

Boyce-Codd Normal Form (BCNF)
● A relation is in BCNF if, for every non-trivial functional dependency X → Y, X is a superkey.
This is stricter than 3NF because it does not allow any dependency where the determinant is not a superkey, even if Y is a prime attribute.

Quick Note:
● If a relation has only two attributes, it is always in BCNF regardless of the dependencies.

Design goal              1NF   2NF   3NF   BCNF
0% Redundancy            No    No    No    Yes (may still suffer from multivalued dependencies)
Lossless Join            yes   yes   yes   yes
Dependency Preservation  yes   yes   yes   May or may not be preserved
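The 3NF and BCNF conditions can be checked mechanically from the FD set; a sketch under the assumption that candidate keys and prime attributes are already known:

```python
# 3NF / BCNF membership check per the definitions above; encoding illustrative.

def closure(attrs, fds):
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_bcnf(R, fds):
    # Every non-trivial FD X -> Y must have X as a super key.
    return all(rhs <= lhs or closure(lhs, fds) == R for lhs, rhs in fds)

def is_3nf(R, fds, prime):
    # X is a super key, or every attribute of Y \ X is prime.
    return all(rhs <= lhs or closure(lhs, fds) == R or (rhs - lhs) <= prime
               for lhs, rhs in fds)

R = {'A', 'B', 'C'}
fds = [({'A', 'B'}, {'C'}), ({'C'}, {'B'})]   # candidate keys: AB and AC
prime = {'A', 'B', 'C'}                       # every attribute is prime here
print(is_bcnf(R, fds))        # False: C -> B, but C is not a super key
print(is_3nf(R, fds, prime))  # True: B is a prime attribute
```

This is the classic example of a relation in 3NF but not in BCNF.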


Note:
2NF Checking: If [a proper subset of a candidate key] → [non-key attribute], the relation is not in 2NF (partial dependency).
3NF Checking: R is in 3NF if every non-trivial FD X → Y satisfies the following: either X is a super key, or Y is a key/prime attribute.
BCNF Checking: R is in BCNF if every non-trivial FD X → Y satisfies the following condition: X is a super key.


Module: 2 ER Model

A description of data in terms of a data model is called a schema.

Entity: A real-world object (e.g., Student).
    Strong Entity: Exists independently; has a primary key.
    Weak Entity: Depends on a strong entity; has no primary key of its own.
Entity relationship set: Used to relate two or more entity sets.

Attributes: Describe properties of an entity.
    Simple (Atomic), Composite, Derived, Multivalued.

Relationship: Association among entities.
    Degree of Relationship Set: The number of entity sets that participate in the relationship set.
    Mapping Cardinality: 1:1, 1:N, M:1, M:N.
    Participation:
        Total (double line): Every entity must participate.
        Partial (single line): Optional participation.

Keys:
    Primary Key: Unique identifier for an entity.
    Discriminator/Partial Key: Used in weak entity sets.

ER to Relational Mapping
    Strong Entity → Relation with primary key.
    Weak Entity → Include partial key + key attribute of the strong (owner) entity set.
    1:1 Relationship → Foreign key in either entity.
    1:N Relationship → Foreign key on the "N" side.
    M:N Relationship → Create a separate relation with foreign keys.
    Multivalued Attribute → Create a new relation.
    Aggregation → Treat the relationship set as an entity.

ER → RDBMS table counts:
    One : One — Merge the relationship set into either side. 2 tables required.
    1 : M — Merge the relationship set towards the many side. 2 tables required.
    M : M — Separate table for each entity set and for the relationship set. 3 tables required.
    M : 1 — Merge the relationship set towards the many side. 2 tables required.

Full participation on the "one" side of a many-to-one relationship: Merge the entity sets and the relationship set into a single relational table. So, 1 table.
Full participation on the "many" side of a many-to-one relationship: Merge the relationship set towards the many side. So, 2 relational tables.
Full participation on either side of a one-to-one relationship: Merge the entity sets and relationship set into a single table. So, 1 table.
Full participation on any "many" side of a many-to-many relationship: Merge the relationship set towards that "many" side. So, 2 tables.
Partial participation on both sides of a binary relationship: no further merging is possible; the default table counts above apply.
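The M:N mapping rule above (a separate relation holding foreign keys to both entity sets, 3 tables) can be sketched in SQL; table and column names here are illustrative, and the statements are run through Python's sqlite3:

```python
# M:N mapping sketch: Student and Course entity sets plus a separate
# Enroll relation holding both foreign keys (3 tables). Names illustrative.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Student (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Course  (cid INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE Enroll  (                 -- relationship set as its own table
    sid INTEGER REFERENCES Student(sid),
    cid INTEGER REFERENCES Course(cid),
    PRIMARY KEY (sid, cid)             -- composite key of both foreign keys
);
INSERT INTO Student VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO Course  VALUES (10, 'DBMS');
INSERT INTO Enroll  VALUES (1, 10), (2, 10);
""")
rows = con.execute("""
    SELECT sname, cname FROM Student
    JOIN Enroll USING (sid) JOIN Course USING (cid)
    ORDER BY sname
""").fetchall()
print(rows)   # [('Asha', 'DBMS'), ('Ravi', 'DBMS')]
```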

Module - 3 Relational Model and SQL


Relational Algebra (RA)
Basic Operators: Selection (σ), Projection (π), Rename (ρ)
Set Operations: Union (∪), Intersection (∩), Difference (−), Cartesian Product (×)
Joins: Natural Join (⋈), Theta Join, Outer Join (Left, Right, Full)
Derived operators: JOIN and its types, Intersection, Division.

π attribute_list (R): Projects the required attributes from relation R.

σ condition (R): Selects the records of relation R that satisfy the condition.

Cross product (×): R × S results in all attributes of R followed by all attributes of S, with each record of R paired with every record of S.
Degree (R × S) = Degree (R) + Degree (S)
Note: If relation R has n tuples and relation S has 0 tuples, then R × S has 0 tuples.
Note: Over n domains having n elements each, the number of possible n-ary tuples is nⁿ, so the number of distinct n-ary relations that can be formed is 2^(nⁿ).

Join (⋈):
1. Natural join (⋈): R ⋈ S = π distinct attributes (σ equality between common attributes of R and S (R × S))
deg(R ⋈ S) = deg(R) + deg(S) − number of common attributes.
Note: Natural join equals the cross product if there are no common attributes.
2. Conditional join (⋈c): R ⋈c S ≅ σc (R × S)

Outer Joins:
(a) LEFT OUTER JOIN (⟕): R ⟕ S produces (R ⋈ S) ∪ {records of R that fail the join condition, with the remaining attributes null}.
(b) RIGHT OUTER JOIN (⟖): R ⟖ S produces (R ⋈ S) ∪ {records of S that fail the join condition, with the remaining attributes null}.
(c) FULL OUTER JOIN (⟗): R ⟗ S = (R ⟕ S) ∪ (R ⟖ S)

Rename (ρ): Used to rename table names and attribute names for query processing.
Example: Stud (Sid, Sname, age)
ρ(Temp, Stud) : Temp (Sid, Sname, age)

Division: The division operation is used in relational algebra to find tuples in one relation that are associated with all tuples in another relation.
Given two relations R(A, B) and S(B), the operation R ÷ S returns those values of A such that for every value of B in S, the pair (A, B) is present in R.

Expansion of division using basic operators (example: students enrolled in every course):
1. Compute the cross product of all student IDs with all course IDs: πsid(Enroll) × πcid(Course). This gives all enrollments that should exist if each student were enrolled in every course.
2. Subtract the actual Enroll relation: (πsid(Enroll) × πcid(Course)) − Enroll. This yields the set of (sid, cid) pairs that do not exist in the Enroll relation.
3. Project the student IDs from the result of Step 2: πsid((πsid(Enroll) × πcid(Course)) − Enroll). These are the students who are not enrolled in every course.
4. Subtract this result from the set of all student IDs:
πsid(Enroll) − πsid((πsid(Enroll) × πcid(Course)) − Enroll)
This final expression gives the students who are enrolled in every course.
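The four-step division expansion can be traced with plain Python sets standing in for relations; the Enroll/Course data below is illustrative:

```python
# Relational division sketch: Enroll ÷ pi_cid(Course), step by step.

enroll = {(1, 'db'), (1, 'os'), (2, 'db')}          # (sid, cid) pairs
courses = {'db', 'os'}                              # pi_cid(Course)
sids = {s for s, _ in enroll}                       # pi_sid(Enroll)

# Step 1: all enrollments that *should* exist (cross product)
all_pairs = {(s, c) for s in sids for c in courses}
# Step 2: pairs missing from the actual Enroll relation
missing = all_pairs - enroll
# Step 3: students missing at least one course
not_all = {s for s, _ in missing}
# Step 4: students enrolled in every course
result = sids - not_all
print(result)   # {1} : only student 1 takes both courses
```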


Result : πsid(Enroll) − πsid((πsid(Enroll) × πcid(Course)) − Enroll)

Set Operators:
Union: R and S are union-compatible iff
(i) the arity of R equals the arity of S, and
(ii) the domains of the attributes of R are the same as the domains of the corresponding attributes of S.

Structured Query Language (SQL)

DDL (Data Definition Language): Used to define or modify the structure of database schema objects (like tables, views, indexes).
    CREATE: Creates a new table, view, or other object.
    ALTER: Modifies an existing object (e.g., adds a column).
    DROP: Deletes an existing object.
    TRUNCATE: Removes all rows from a table quickly (no rollback).

DML (Data Manipulation Language): Used to manipulate data in existing tables.
    SELECT: Retrieves data from one or more tables.
    INSERT: Adds new rows into a table.
    UPDATE: Modifies existing rows in a table.
    DELETE: Removes rows from a table.

DCL (Data Control Language): Used to control access and permissions to the database.
    GRANT: Gives privileges to users.
    REVOKE: Withdraws previously granted privileges.

TCL (Transaction Control Language): Used to manage transactions and control their execution.
    COMMIT: Saves all changes made in the current transaction.
    ROLLBACK: Undoes all changes since the last commit.
    SAVEPOINT: Sets a point within a transaction to which a rollback can be done.

Relational Query and SQL query
SQL: SELECT DISTINCT A1, A2, …, An
     FROM R1, R2, …, Rn
     WHERE P;
Equivalent RA: πA1, A2, …, An (σP (R1 × R2 × … × Rn))

SQL:
SELECT DISTINCT A1, A2, …, An
FROM R1, R2, …, Rn
WHERE condition
GROUP BY (attributes)
HAVING condition
ORDER BY (attributes) (DESC)

Execution flow:
FROM — cross product of the relations
WHERE — selection operator (σ) applies the condition to each record
GROUP BY
HAVING
SELECT
DISTINCT
ORDER BY

GROUP BY:
• Used to group records based on specific attributes.
• If a GROUP BY clause is used, then
(a) every non-aggregate attribute in the SELECT clause must appear in the GROUP BY clause.


(b) no other attribute may be selected in the SELECT clause.
Aggregate functions are allowed in the SELECT clause alongside the GROUP BY attributes.

HAVING Clause
• A HAVING clause must follow a GROUP BY clause.
• HAVING is used to select the groups that satisfy the HAVING condition.
• The HAVING condition must be over aggregate functions such as SUM(), COUNT(), etc.; direct attribute comparison is not allowed.

Note: Every query written using the WHERE clause can be re-written using HAVING (by grouping on all attributes), but a HAVING condition over aggregates cannot, in general, be re-written using WHERE.

SQL Constraints
Key properties:
Constraints can be defined during table creation (CREATE TABLE) or after table creation using ALTER TABLE. If any data violates a constraint, the database system rejects the operation (insert/update).
1. NOT NULL: Ensures that a column must always hold a value.
2. UNIQUE: Ensures all values are distinct in the column or column set.
3. PRIMARY KEY: Combines NOT NULL + UNIQUE. Uniquely identifies each record. Can consist of one or more columns (composite key). Each table can have only one PRIMARY KEY.
4. FOREIGN KEY: Maintains referential integrity. Enforces a relationship between columns in two tables. The referencing column(s) must match values in the referenced table's PRIMARY or UNIQUE key, and can be NULL if not marked NOT NULL.
5. CHECK: Restricts the range or condition of values in a column.
6. DEFAULT: Specifies a default value when none is provided during insert.

ALTER TABLE Statement
The ALTER TABLE command is used to modify the structure of an existing table:
Add a new column
Drop (delete) a column
Modify a column's data type, rename columns, rename the table (DBMS-specific)

Aggregate Functions: Aggregate functions perform calculations over a set of rows and return a single summary value.
1. AVG() - Mean value of a numeric column
2. COUNT() - Returns the number of rows that match a condition
3. SUM() - Returns the total of the values
4. MIN() - Returns the smallest value
5. MAX() - Returns the largest value

Nested Query Without Correlation: The inner query is independent of the outer query.
    Inner Query → Runs independently and returns a result.
    Outer Query → Uses the inner result to complete execution.
Example:
SELECT name
FROM employees
WHERE dept_id IN (
    SELECT id
    FROM departments
    WHERE location = 'Delhi'
);

Correlated Nested Query
In a correlated nested query, the inner query uses attributes from the outer query's tables. The inner query is allowed in the WHERE or HAVING clause of the outer query.
Example: SELECT A
FROM R
WHERE (SELECT count(*)
       FROM S
       WHERE S.B < R.A) < 5;
NOTE: If the correlation is in the WHERE clause, the inner query re-computes for each record of the outer query's FROM


clause. If the correlation is in the HAVING clause, the inner query re-computes for each group of the outer query.
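The GROUP BY / HAVING semantics described above can be tried directly; a sketch using Python's sqlite3, with schema and data that are illustrative only:

```python
# GROUP BY filters rows first (WHERE), then HAVING filters whole groups.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Emp (eid INTEGER PRIMARY KEY, dept TEXT, sal INTEGER);
INSERT INTO Emp VALUES (1,'HR',300), (2,'HR',500),
                       (3,'IT',900), (4,'IT',700), (5,'IT',400);
""")
# Departments whose average salary exceeds 500: WHERE prunes rows before
# grouping; HAVING applies an aggregate condition to each group.
rows = con.execute("""
    SELECT dept, AVG(sal)
    FROM Emp
    WHERE sal > 0
    GROUP BY dept
    HAVING AVG(sal) > 500
    ORDER BY dept
""").fetchall()
print(rows)   # only the IT group qualifies (average 666.67 vs HR's 400)
```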
Functions used for Nested Queries
1. IN / NOT IN
2. ANY: Compares a value with each value returned by the subquery. Used with comparison operators: =, <, >, <=, >=, <>
3. ALL: Compares a value with all values returned by the subquery.
4. EXISTS / NOT EXISTS: Checks whether a subquery returns any rows (result empty or non-empty).

Three equivalent ways to ask "female employees earning more than at least one male employee":

SQL join query
SELECT DISTINCT T1.eid
FROM Emp T1
JOIN Emp T2 ON T1.sal > T2.sal
WHERE T1.gen = 'Female' AND T2.gen = 'Male';

SQL nested query
SELECT eid
FROM Emp
WHERE gen = 'Female'
AND sal > ANY (
    SELECT sal
    FROM Emp
    WHERE gen = 'Male'
);

SQL correlated nested query
SELECT eid
FROM Emp T1
WHERE T1.gen = 'Female'
AND EXISTS (
    SELECT *
    FROM Emp T2
    WHERE T2.gen = 'Male'
    AND T1.sal > T2.sal
);
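The equivalence of these formulations can be spot-checked with sqlite3 (SQLite does not implement the ANY/ALL subquery forms, so the join and EXISTS versions are compared; data is illustrative):

```python
# Two of the equivalent queries above, run against sample data.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Emp (eid INTEGER, gen TEXT, sal INTEGER);
INSERT INTO Emp VALUES (1,'Female',800), (2,'Male',500),
                       (3,'Female',300), (4,'Male',900);
""")
q_join = """SELECT DISTINCT T1.eid FROM Emp T1
            JOIN Emp T2 ON T1.sal > T2.sal
            WHERE T1.gen = 'Female' AND T2.gen = 'Male'"""
q_corr = """SELECT eid FROM Emp T1 WHERE T1.gen = 'Female'
            AND EXISTS (SELECT * FROM Emp T2
                        WHERE T2.gen = 'Male' AND T1.sal > T2.sal)"""
a = sorted(r[0] for r in con.execute(q_join))
b = sorted(r[0] for r in con.execute(q_corr))
print(a, b)   # [1] [1] : only eid 1 (sal 800) out-earns some male employee
```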


Module - 4 File Structure & Indexing (B and B+ tree)

Spanned organisation: a record may be stored across more than one block. Blocking factor = Block size / Record size. No memory wastage, but block access cost increases.
Unspanned organisation: a record must be stored within a single block. Blocking factor = ⌊Block size / Record size⌋. Block access cost is reduced, but memory is wasted.

Indexing Types

Blocking factor of an index = ⌊(B − H) / (k + P)⌋ entries/block
where B = block size (in bytes)
H = overhead per block (block headers, pointers, etc.)
k = size of the key field (in bytes)
P = size of one block/record pointer (in bytes)

1. Single-Level Index
A single-level index contains a sorted list of index entries, each pointing to a block or record in the data file.
(a) Primary index (key + ordered file)
(b) Clustered index (non-key + ordered file)
(c) Secondary index (key/non-key + unordered file)

At most one primary index per relation. Any number of secondary indexes can be created on non-primary-key attributes.
A relation can have either a primary (sparse) index or a clustered index, but not both on the same attribute.
A clustered index determines the physical order of records; hence only one clustered index per relation.

If the number of index levels = k, and we also access the data block, the total number of block accesses to locate a record is k + 1.

Multilevel Index
A multilevel index treats the index file itself as an ordered file (first or base level). A second-level index is then created on this first-level index, acting like a primary index using block anchors (one entry per block of the first level). Since all index entries are of the same size, the blocking factor remains the same across all levels.

B Tree
Every internal node except the root contains at least ⌈P/2⌉ block pointers (min keys ⌈P/2⌉ − 1) and at most P block pointers (max keys P − 1).
The root contains at least 2 block pointers (min 1 key) and at most P block pointers (max P − 1 keys).
Keys within a node are kept in ascending order.
All leaf nodes are at the same level.
A B-tree maintains balance and prevents excessive space waste due to deletion by enforcing these constraints.

Internal node structure: <P1, <K1, Pr1>, P2, <K2, Pr2>, ..., <Kq−1, Prq−1>, Pq>
Pi is a tree pointer (points to a subtree).
Kj is a search key.
Prj is a data pointer (points to the actual data record or block for key Kj).
q is the number of tree pointers in the node (q ≤ P).

B-Tree formulas (order P):

                  Minimum     Maximum
Root (BP)         2           P
Non-root (BP)     ⌈P/2⌉       P
Root (keys)       1           P − 1
Non-root (keys)   ⌈P/2⌉ − 1   P − 1

Order P: the largest P satisfying P·BP + (P − 1)·(Key + RP) ≤ Block size (equivalently, (P + 1)·BP + P·(Key + RP) exceeds the block size), where BP = block (tree) pointer size and RP = record pointer size.
Note — Maximum number of levels: assume each level holds the minimum number of keys and block pointers (⌈P/2⌉ − 1, ⌈P/2⌉).
Minimum number of levels: assume each level holds the maximum number of keys and block pointers (P − 1, P).
Applicable to both B and B+ trees.

B+ Tree
All keys are available at the leaf nodes.


Internal nodes: contain no record pointers.
Leaf nodes: contain keys with record pointers, plus 1 block pointer (to the next leaf).

Structure of an internal node:
B1 K1 B2 K2 … Bp−1 Kp−1 Bp
Formula: P·BP + (P − 1)·Key ≤ Block size

Left biasing ⇒ follow B1 if x ≤ K1; follow B2 if K1 < x ≤ K2; …
Right biasing ⇒ follow B1 if x < K1; follow B2 if K1 ≤ x < K2; …

Structure of a leaf node:
K1 R1  K2 R2  …  Kp−1 Rp−1  Bp
Formula: (P − 1)·(Key + RP) + 1·BP ≤ Block size

In a B-tree, every value of the search field appears exactly once at some level in the tree, along with a data pointer.
In a B+ tree, data pointers are stored only at the leaf nodes of the tree; hence the structure of leaf nodes differs from the structure of internal nodes.
The B+ tree is best suited for sequential access to a range of data records.
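The order inequalities above can be solved numerically; a small sketch, where the block, key, and pointer sizes are illustrative values, not from the notes:

```python
# Largest order P satisfying the node-size inequalities above.

def btree_order(block, bp, key, rp):
    """Largest P with P*BP + (P-1)*(Key+RP) <= block size."""
    p = 1
    while (p + 1) * bp + p * (key + rp) <= block:
        p += 1
    return p

def bplus_internal_order(block, bp, key):
    """Largest P with P*BP + (P-1)*Key <= block size (no record pointers)."""
    p = 1
    while (p + 1) * bp + p * key <= block:
        p += 1
    return p

B, BP, KEY, RP = 512, 6, 9, 7
print(btree_order(B, BP, KEY, RP))        # 24: 24*6 + 23*16 = 512 <= 512
print(bplus_internal_order(B, BP, KEY))   # 34: 34*6 + 33*9  = 501 <= 512
```

Because B+-tree internal nodes drop the record pointers, their order (fan-out) is higher for the same block size, which is why B+ trees tend to be shallower.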


Chapter 5: Transaction and Concurrency Control
A transaction is a collection of operations that forms a single logical unit of work.

Operations in a Transaction
Read(A) - Transfers the current value of data item A from the database (typically stored on disk) into a local variable in the transaction's main-memory workspace. It enables the transaction to access and use the value of A during its execution.
Write(A) - Stores the modified value of data item A from the transaction's local workspace (in main memory) back into the database (on disk), thereby making the changes made by the transaction visible in the database.
BEGIN TRANSACTION - Marks the starting point of a transaction. Allocates resources and begins logging.
END TRANSACTION - Marks the logical end of the transaction. The actual commit or abort happens later.
Commit - Makes all changes made by the transaction permanent in the database.
Abort - Undoes all changes made by the transaction and restores the database to its previous consistent state.
Rollback - Undoes modifications of database files made by a failed transaction.

Atomicity: Ensures that a transaction is all-or-nothing: either all operations of the transaction execute successfully, or none of them do.
Goal: Prevent partial updates that can lead to data inconsistency.
Maintained by: the recovery subsystem, using undo (rollback) operations.

Operation    Purpose                                     Affects
Redo         Reapply changes of committed transactions   Committed transactions
Undo         Revert changes of uncommitted transactions  Uncommitted transactions
Checkpoint   Limit redo/undo scope for recovery          All transactions since last checkpoint

System crash - If a system crash/failure happens, the operations required to recover are:
(I) Redo all transactions committed since the previous checkpoint.
(II) Undo all uncommitted transactions in the entire system.
(III) Clean all log entries.

Consistency - Ensures all integrity constraints are preserved.

Isolation - Each transaction should appear to execute in isolation, even when multiple transactions are running concurrently.
Isolation issues - Dirty read, non-repeatable read, phantom read.

Durability - If a transaction completes successfully and the user is notified, the changes made by the transaction must persist, regardless of what happens next.

Durability Mechanisms
Write-Ahead Logging (WAL): Log records are written before the actual data is updated.
Redo Logging: During recovery, the system reapplies the effects of committed transactions using the log.
Stable Storage: Assumes the existence of a reliable storage medium that survives crashes (or is emulated using replication, RAID, etc.).

Schedules - Time-ordered execution sequence of two or more transactions.

Schedule: Sequence of operations from multiple transactions.
Serial Schedule: One transaction executes completely before another.
Concurrent Schedule: Interleaved operations of transactions.
Serializable Schedule: Equivalent to some serial schedule.
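Whether a concurrent schedule is conflict serializable is decided with the precedence-graph test covered in the next section; a minimal sketch, where the schedule encoding as (transaction, operation, item) triples is an assumption of this example:

```python
# Conflict-serializability check via precedence-graph cycle detection.

def conflict_serializable(schedule):
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            # Conflict: same item, different txns, at least one write.
            if x == y and ti != tj and 'W' in (op_i, op_j):
                edges.add((ti, tj))
    nodes = {t for t, _, _ in schedule}
    adj = {t: [v for u, v in edges if u == t] for t in nodes}
    WHITE, GRAY, BLACK = 0, 1, 2
    color = dict.fromkeys(nodes, WHITE)
    def dfs(u):                       # DFS: a back edge means a cycle
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY or (color[v] == WHITE and dfs(v)):
                return True
        color[u] = BLACK
        return False
    return not any(color[t] == WHITE and dfs(t) for t in nodes)

# Interleaved lost-update pattern: edges T1->T2 and T2->T1, a cycle.
s1 = [('T1','R','A'), ('T2','R','A'), ('T1','W','A'), ('T2','W','A')]
# Serial schedule: all edges point one way, no cycle.
s2 = [('T1','R','A'), ('T1','W','A'), ('T2','R','A'), ('T2','W','A')]
print(conflict_serializable(s1), conflict_serializable(s2))   # False True
```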


Types of Serializability

Conflict Serializability
Uses a precedence graph.
No cycles ⇒ conflict serializable.
Conflict operations (same data item, different transactions, at least one write):

Read–Write
T1          T2
R(A)
            W(A)

Write–Read
T1          T2
W(A)
            R(A)

Write–Write
T1          T2
W(A)
            W(A)

View Serializability
● Three checks:
    Initial read
    Updated read
    Final write
● Conflict serializable ⇒ view serializable, but not vice versa.

Why Concurrent Execution?
● Increases throughput of the system
● Maximizes resource utilization
● Reduces waiting time for users

Schedule type       Property
Irrecoverable       T2 reads dirty data from T1 and commits before T1.
Recoverable         T2 commits after T1 commits.
Cascading Rollback  T2 reads from uncommitted T1; failure of T1 rolls back T2.
Cascadeless         T2 reads data from committed T1 only.
Strict Schedule     No reads of or overwrites on uncommitted data. Guarantees atomicity.

Concurrency Control with Locks

Lock-Based Protocols
● Locks ensure mutual exclusion, allowing only one transaction to access a data item in a conflicting mode at a time.
● A transaction must acquire a lock on a data item before accessing it.
● Locks are automatically released when the transaction commits or aborts.
● Locking protocols are sets of rules that govern how locks are acquired and released to ensure serializability.
● Improper use of locks may lead to issues like deadlocks, starvation, and reduced concurrency.
● Locking restricts non-serializable schedules by design.

Lock Granularity
Granularity refers to the size of the data item that can be locked. Coarser granularity reduces overhead but limits concurrency, while finer granularity increases overhead but allows higher concurrency.

Database-Level Locking
● Locks the entire database.
● Use Case: Suitable for batch jobs or bulk updates.


● Drawback: Poor concurrency; only one transaction can access the DB at a time.

Table-Level Locking
● Locks the entire table.
● Use Case: Useful when transactions access entire tables.
● Drawback: Reduces concurrency if multiple transactions access different rows.

Page-Level Locking
● Locks a disk page (unit of storage, typically 4KB–8KB).
● A page may hold multiple rows or parts of a table.
● Use Case: Balanced trade-off between concurrency and overhead.
● Most popular in multi-user DBMS.

Tuple/Row-Level Locking
● Locks individual tuples (rows); even if rows are on the same page, they can be accessed in parallel.
● Use Case: Allows maximum concurrency. Common in high-concurrency systems like banking and real-time OLTP.
● Drawback: High overhead, due to managing a large number of locks.

Field-Level Locking (Attribute-Level)
● Locks specific attributes/columns within a tuple.
● Allows the highest concurrency, since multiple transactions can access different fields of the same row.
● Rarely used in practice due to:
    ○ Complexity in implementation
    ○ Very high CPU and memory overhead
● Mostly theoretical; not supported in common DBMS implementations.

Lock types

1. Binary locks
0 → Unlocked (available)
1 → Locked (unavailable)
Every transaction must lock a data item before accessing it and unlock it after the operation.
Issues:
Doesn't distinguish between read and write intent.
Leads to low concurrency.
Susceptible to deadlocks and irrecoverability.

2. Shared/Exclusive Locks
Shared (S mode)
Allows read-only access to a data item.
Multiple transactions can hold shared locks simultaneously.
No conflict between transactions if all are reading only.
A conflict arises when a transaction requests an exclusive lock on an item on which a shared lock is already held.


Lock compatibility matrix
● A transaction may be granted a lock on a data item if the lock mode is compatible with the existing locks held by other transactions.
● Multiple transactions can simultaneously hold Shared (S) locks on the same item (read-only).
● If any transaction holds an Exclusive (X) lock, no other transaction can obtain either a shared or an exclusive lock on that item.

Request ↓ / Held →   S                        X
S                    Lock can be granted      Lock request must wait
X                    Lock request must wait   Lock request must wait

Problems with locking
1. Non-serializable schedules
Problem: Concurrent execution might lead to inconsistent results if not carefully managed.
Solution: Enforce serializability using the two-phase locking protocol.
2. Deadlocks
Problem: Two or more transactions wait indefinitely for each other's locks.
Solution: Use deadlock handling techniques: deadlock detection and deadlock prevention.

Issue                       Cause                    Solution
Non-serializable schedule   Improper interleaving    Two-Phase Locking (2PL)
Deadlock                    Circular wait on locks   Detection or prevention techniques

2 Phase locking (2PL) : Two-Phase Locking ensures conflict serializability by dividing the transaction's locking behavior into two phases:
Growing Phase: the transaction acquires locks (shared or exclusive); no lock can be released during this phase.
Shrinking Phase: the transaction releases locks; no new lock can be acquired once this phase begins.

Governing Rules of 2PL:
For a transaction to follow Basic 2PL, the following must hold:
1. No conflicting locks can be held by different transactions on the same data item.
2. No unlock operation can occur before all required locks are acquired.
3. No data item is modified until all necessary locks are successfully obtained.
Once a lock is released, no further locks can be requested; the point at which a transaction acquires its last lock is called its lock point.

Basic 2PL Protocol
● Ensures conflict serializability.
● The lock point determines the equivalent serial order.
● Does not ensure recoverability or freedom from cascading aborts.
Problems:
● Irrecoverable schedules: transactions might read uncommitted changes.
● Deadlocks: due to circular waits.
● Starvation: a transaction may never acquire the locks it needs.

Strict 2PL Protocol
● Extends Basic 2PL by holding all exclusive locks until commit/abort.
● Ensures:
○ Conflict serializability
○ Strict recoverability (no cascading aborts)
● Commonly used in real-world DBMSs (e.g., Oracle, SQL Server)
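The growing/shrinking discipline can be checked mechanically. A small sketch (hypothetical `follows_2pl` helper) that scans one transaction's operations and flags any lock acquired after its first unlock:

```python
# Sketch: verify one transaction's operation list obeys two-phase
# locking, i.e. every lock acquisition precedes every lock release.
# Operations are strings like "lock(A)" or "unlock(A)".

def follows_2pl(ops: list[str]) -> bool:
    shrinking = False                  # flips at the first unlock
    for op in ops:
        if op.startswith("unlock"):
            shrinking = True           # shrinking phase begins
        elif op.startswith("lock") and shrinking:
            return False               # growing after shrinking: violation
    return True

# Growing then shrinking: OK.
print(follows_2pl(["lock(A)", "lock(B)", "unlock(A)", "unlock(B)"]))  # True
# Re-locks after an unlock: not 2PL.
print(follows_2pl(["lock(A)", "unlock(A)", "lock(B)", "unlock(B)"]))  # False
```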


Problems:
● Starvation is still possible (solution: fairness policy).
● Not deadlock-free

NOTE:
Every schedule allowed by 2PL is always conflict serializable, but not every conflict serializable schedule is allowed by 2PL.
The lock point of each transaction determines the serial order in 2PL.

Rigorous Two-Phase Locking
Rules:
● A transaction holds all its locks (Shared or Exclusive) until it commits or aborts.
● Lock acquisition follows standard 2PL (growing then shrinking phase).
● Ensures:
○ Conflict serializability
○ Strict recoverability
○ Cascadeless schedules
Still has:
● Deadlock possibility
● Starvation possibility (can be handled with wait-time policies)

Difference between Strict and Rigorous 2PL

                        Strict 2PL          Rigorous 2PL
Holds exclusive locks   Till commit         Till commit
Holds shared locks      May release early   Till commit

Schedule classes:
View Serializable ⊇ Conflict Serializable
Recoverable ⊇ Cascadeless ⊇ Strict (a schedule that is not recoverable is irrecoverable)
All conflict serializable schedules are view serializable.
All strict schedules are cascadeless and recoverable.
Rigorous 2PL ensures the strictest class: strict, conflict-serializable, and recoverable.

Rigorous 2PL (Two-Phase Locking) Protocol
Rigorous 2PL is a stricter version of 2PL:
All locks (both Shared S and Exclusive X) are held until the transaction commits or aborts.
This guarantees strict schedules, which are both conflict-serializable and cascadeless.
Strict 2PL, by contrast, holds only exclusive locks until commit.
Rigorous 2PL ⊆ Strict 2PL ⊆ Basic 2PL (in terms of restrictiveness and guarantee of serializability).

Timestamp Ordering Protocols
For each data item X, the system maintains:
WTS(X): the largest timestamp of any transaction that successfully wrote X.
RTS(X): the largest timestamp of any transaction that successfully read X.
Let TS(T) be the timestamp of transaction T.
1. If T issues R(X) (read operation):
If TS(T) < WTS(X):
Reject the read; rollback T (a newer transaction has already written X).
Else:
Allow the read.
Update RTS(X) = max(RTS(X), TS(T)).
2. If T issues W(X) (write operation):
If TS(T) < RTS(X):
Reject the write; rollback T (a younger transaction has already read X).
If TS(T) < WTS(X):


Reject the write; rollback T (a younger transaction has already written X).
Else:
Allow the write.
Set WTS(X) = TS(T).

Thomas's Write Rule (TWR)
1. If T issues R(X):
If WTS(X) > TS(T):
rollback T.
Else:
execute the read.
Set RTS(X) = max{ TS(T), RTS(X) }.
2. If T issues W(X):
If TS(T) < RTS(X):
Reject the write; rollback T.
Else if TS(T) < WTS(X):
Ignore the write (it is an obsolete write).
Else:
Perform the write.
Set WTS(X) = TS(T).

Strict Timestamp Ordering Protocol
A transaction Tj that issues a read or write (R(X) or W(X)) operation must wait until the transaction Ti that last wrote X has committed or aborted, if TS(Tj) > WTS(X).

Deadlock Prevention Using Timestamp-Based Schemes

1. Wait-Die Protocol (Non-Preemptive)
Transactions are ordered by their timestamps:
smaller timestamp → older transaction, larger timestamp → younger transaction.
If a transaction T₁ requests a lock held by T₂:
If TS(T₁) < TS(T₂) (i.e., T₁ is older):
→ T₁ waits.
If TS(T₁) > TS(T₂) (i.e., T₁ is younger):
→ T₁ is rolled back (dies) and restarted with the same timestamp.
No deadlocks occur.
Starvation is possible for younger transactions, as they may be repeatedly rolled back if older transactions continuously block them.

2. Wound-Wait Protocol (Preemptive)
Transactions are again ordered by timestamp; older transactions have higher priority over younger ones.
If a transaction T₁ requests a lock held by T₂:
If TS(T₁) < TS(T₂) (i.e., T₁ is older):
→ T₁ preempts T₂: T₂ is rolled back (wounded), and T₁ gets the lock.
If TS(T₁) > TS(T₂) (i.e., T₁ is younger):
→ T₁ waits.
No deadlocks occur.
Starvation is possible for younger transactions, as they may be forced to wait if older transactions repeatedly preempt them.
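The two schemes differ only in what happens to the requester in each timestamp case. A compact sketch (hypothetical `resolve` helper; it returns the action taken when T₁, with timestamp `ts1`, requests a lock held by T₂, with timestamp `ts2`):

```python
# Sketch of the wait-die vs. wound-wait decision.
# Smaller timestamp means older. Returns the outcome for the request.

def resolve(scheme: str, ts1: int, ts2: int) -> str:
    older = ts1 < ts2                     # is the requester T1 older?
    if scheme == "wait-die":              # non-preemptive
        return "T1 waits" if older else "T1 dies (rolled back)"
    if scheme == "wound-wait":            # preemptive
        return "T2 wounded (rolled back)" if older else "T1 waits"
    raise ValueError(scheme)

print(resolve("wait-die", 5, 10))    # older requester waits
print(resolve("wait-die", 10, 5))    # younger requester dies
print(resolve("wound-wait", 5, 10))  # older requester wounds the holder
print(resolve("wound-wait", 10, 5))  # younger requester waits
```

In both schemes only the younger transaction is ever rolled back, so no wait-for cycle (and hence no deadlock) can form.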

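Returning to the timestamp-ordering rules above, the read/write checks (with Thomas's Write Rule as an optional relaxation) can be sketched as follows. This is an illustrative sketch only; `Item`, `read`, and `write` are hypothetical names, and a real scheduler would restart a rolled-back transaction with a new timestamp:

```python
# Sketch of basic timestamp-ordering checks on one data item.
# Returns "ok", "rollback", or (under Thomas's Write Rule) "ignore".

class Item:
    def __init__(self):
        self.rts = 0   # RTS(X): largest timestamp that read the item
        self.wts = 0   # WTS(X): largest timestamp that wrote the item

def read(item: Item, ts: int) -> str:
    if ts < item.wts:                 # a newer transaction already wrote X
        return "rollback"
    item.rts = max(item.rts, ts)
    return "ok"

def write(item: Item, ts: int, thomas: bool = False) -> str:
    if ts < item.rts:                 # a younger transaction already read X
        return "rollback"
    if ts < item.wts:                 # a younger transaction already wrote X
        return "ignore" if thomas else "rollback"  # TWR skips obsolete write
    item.wts = ts
    return "ok"

x = Item()
print(write(x, 10))              # ok: sets WTS(X) = 10
print(read(x, 5))                # rollback: TS 5 < WTS 10
print(write(x, 7, thomas=True))  # ignore: obsolete write under TWR
```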