DBMS Notes

The Entity-Relationship Model (ER Model) is a conceptual framework for designing databases, representing entities, their attributes, and relationships. It employs specific symbols to illustrate these components and includes concepts like strong and weak entities, various types of attributes, and cardinality in relationships. These notes also cover Codd's rules for database management and the ACID properties that ensure data integrity and consistency in transactions.

Introduction of ER Model

The Entity-Relationship Model (ER Model) is a conceptual model for designing databases. It represents the logical structure of a database, including entities, their attributes, and the relationships between them.

• Entity: An object about which data is stored, such as Student, Course or Company.

• Attribute: A property that describes an entity, such as StudentID, CourseName, or EmployeeEmail.

• Relationship: A connection between entities such as "a Student enrolls in a Course".

Symbols Used in ER Model


The ER Model describes the logical view of a system from a data perspective using these symbols:

• Rectangles: Rectangles represent entities in the ER Model.

• Ellipses: Ellipses represent attributes in the ER Model.

• Diamond: Diamonds represent relationships among Entities.

• Lines: Lines connect attributes to entity types and link entity sets to relationship types.

• Double Ellipse: Double ellipses represent multivalued attributes, such as a student's multiple phone numbers.

• Double Rectangle: Represents weak entities, which depend on other entities for
identification.
What is an Entity?
An Entity represents a real-world object, concept or thing about which data is stored in a database. It acts as a building block of a database; tables in a relational database represent these entities.

Example of entities:

• Real-World Objects: Person, Car, Employee etc.

• Concepts: Course, Event, Reservation etc.

• Things: Product, Document, Device etc.

The entity type defines the structure of an entity, while individual instances of that type represent
specific entities.

What is an Entity Set?


An entity refers to an individual object of an entity type, and the collection of all entities of a
particular type is called an entity set. For example, E1 is an entity that belongs to the entity type
"Student," and the group of all students forms the entity set.

In an ER diagram, an entity type is represented by a labeled rectangle.

We can represent the entity sets in an ER Diagram but we can't represent individual entities because
an entity is like a row in a table, and an ER diagram shows the structure and relationships of data, not
specific data entries (like rows and columns). An ER diagram is a visual representation of the data
model, not the actual data itself.
Types of Entity

There are two main types of entities:

1. Strong Entity

A Strong Entity has a key attribute that can uniquely identify each of its instances. A strong entity does not depend on any other entity in the schema for its identification; it has a primary key that ensures its uniqueness and is represented by a rectangle in an ER diagram.

2. Weak Entity

A Weak Entity cannot be uniquely identified by its own attributes alone; it depends on an identifying (strong) entity for its identification. A weak entity is represented by a double rectangle, and the participation of a weak entity type is always total. The relationship between the weak entity type and its identifying strong entity type is called an identifying relationship and is represented by a double diamond.

Example:

A company may store information about the dependents (parents, children, spouse) of an employee. The dependents cannot exist without the employee, so Dependent is a weak entity type and Employee is its identifying (strong) entity type.
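As a rough relational sketch of this example, the weak entity's key combines the owner's key with a partial key. Table and column names here (Employee, Dependent, DependentName) are illustrative, not from the original notes:

CREATE TABLE Employee (
    EmpID INT PRIMARY KEY,
    Name  VARCHAR(100)
);

-- Weak entity: identified by the owning Employee plus a partial key,
-- and unable to outlive the employee row it depends on.
CREATE TABLE Dependent (
    EmpID         INT,
    DependentName VARCHAR(100),
    Relationship  VARCHAR(20),  -- e.g. Parent, Child, Spouse
    PRIMARY KEY (EmpID, DependentName),
    FOREIGN KEY (EmpID) REFERENCES Employee(EmpID) ON DELETE CASCADE
);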

Attributes in ER Model
Attributes are the properties that define the entity type. For example, for a Student entity Roll_No,
Name, DOB, Age, Address, and Mobile_No are the attributes that define entity type Student. In ER
diagram, the attribute is represented by an oval.

Types of Attributes

1. Key Attribute

The attribute which uniquely identifies each entity in the entity set is called the key attribute. For
example, Roll_No will be unique for each student. In ER diagram, the key attribute is represented by
an oval with an underline.

2. Composite Attribute
An attribute composed of many other attributes is called a composite attribute. For example, the
Address attribute of the student Entity type consists of Street, City, State, and Country. In ER diagram,
the composite attribute is represented by an oval comprising of ovals.

3. Multivalued Attribute

An attribute consisting of more than one value for a given entity. For example, Phone_No (can be
more than one for a given student). In ER diagram, a multivalued attribute is represented by a double
oval.

4. Derived Attribute

An attribute that can be derived from other attributes of the entity type is known as a derived
attribute. e.g.; Age (can be derived from DOB). In ER diagram, the derived attribute is represented by
a dashed oval.

The complete entity type Student can therefore be drawn with Roll_No underlined (key), Address as an oval made of ovals (composite), Phone_No as a double oval (multivalued), and Age as a dashed oval (derived).
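A minimal relational sketch of this entity type follows; the column types are assumptions, and Student_Phone is a hypothetical helper table for the multivalued attribute:

CREATE TABLE Student (
    Roll_No INT PRIMARY KEY,  -- key attribute (underlined in the diagram)
    Name    VARCHAR(100),
    DOB     DATE,             -- Age is derived from DOB, so it is not stored
    Street  VARCHAR(100),     -- the composite Address is flattened into its parts
    City    VARCHAR(50),
    State   VARCHAR(50),
    Country VARCHAR(50)
);

-- Multivalued attribute Phone_No: one row per phone number per student.
CREATE TABLE Student_Phone (
    Roll_No  INT REFERENCES Student(Roll_No),
    Phone_No VARCHAR(15),
    PRIMARY KEY (Roll_No, Phone_No)
);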

Relationship Type and Relationship Set


A Relationship Type represents an association between entity types. For example, 'Enrolled in' is a relationship type between the entity types Student and Course. In an ER diagram, a relationship type is represented by a diamond connected to the participating entities by lines.

A set of relationships of the same type is known as a relationship set. The following relationship set depicts S1 as enrolled in C2, S2 as enrolled in C1, and S3 as enrolled in C3.

Degree of a Relationship Set


The number of different entity sets participating in a relationship set is called the degree of a
relationship set.

1. Unary Relationship: When only ONE entity set participates in a relationship, it is called a unary relationship. For example, a person is married to another person from the same Person entity set.

2. Binary Relationship: When TWO entity sets participate in a relationship, it is called a binary relationship. For example, a Student is enrolled in a Course.

3. Ternary Relationship: When there are three entity sets participating in a relationship, the
relationship is called a ternary relationship.

4. N-ary Relationship: When n entity sets participate in a relationship, it is called an n-ary relationship.
Cardinality in ER Model
The maximum number of times an entity of an entity set participates in a relationship set is known
as cardinality.

Cardinality can be of different types:

1. One-to-One

When each entity in each entity set can take part only once in the relationship, the cardinality is one-
to-one. Let us assume that a male can marry one female and a female can marry one male. So the
relationship will be one-to-one.


2. One-to-Many

In a one-to-many mapping, one entity in the first set can be related to more than one entity in the second set. Let us assume that one department can accommodate many doctors. So the cardinality will be 1 to M: one department has many doctors.


3. Many-to-One

When entities in one entity set can take part only once in the relationship set while entities in the other entity set can take part more than once, the cardinality is many-to-one.

Let us assume that a student can take only one course but one course can be taken by many
students. So the cardinality will be n to 1. It means that for one course there can be n students but
for one student, there will be only one course.


4. Many-to-Many
When entities in all participating entity sets can take part more than once in the relationship, the cardinality is many-to-many. Let us assume that a student can take more than one course and one course can be taken by many students. So the relationship will be many-to-many.

For example, if student S1 is enrolled in C1 and C3, and course C3 is taken by S1, S3, and S4, the relationship is many-to-many.
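In a relational schema, a many-to-many relationship is usually realized as a junction table whose key combines the keys of both entity sets. A sketch, assuming Student and Course tables keyed by StudentID and CourseID:

CREATE TABLE Enrolled_In (
    StudentID INT REFERENCES Student(StudentID),
    CourseID  INT REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID)  -- each (student, course) pair recorded once
);

-- S1 takes C1 and C3; C3 is taken by S1, S3 and S4.
INSERT INTO Enrolled_In VALUES (1, 1), (1, 3), (3, 3), (4, 3);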

How to Draw an ER Diagram


1. Identify Entities: The very first step is to identify all the Entities. Represent these entities in a
Rectangle and label them accordingly.

2. Identify Relationships: The next step is to identify the relationship between them and represent
them accordingly using the Diamond shape. Ensure that relationships are not directly connected to
each other.

3. Add Attributes: Attach attributes to the entities by using ovals. Each entity can have multiple
attributes (such as name, age, etc.), which are connected to the respective entity.

4. Define Primary Keys: Assign primary keys to each entity. These are unique identifiers that help
distinguish each instance of the entity. Represent them with underlined attributes.

5. Remove Redundancies: Review the diagram and eliminate unnecessary or repetitive entities and
relationships.

6. Review for Clarity: Review the diagram to make sure it is clear and effectively conveys the relationships between the entities.
Codd's Rules in DBMS
Codd's rules were proposed by the computer scientist Dr. Edgar F. Codd, who also invented the relational model for database management. These rules were devised to ensure data integrity, consistency, and usability, and together they characterize the requirements of a relational database management system (RDBMS).



Rule 1: The Information Rule

All information, whether user data or metadata, must be stored in the database as values in table cells; everything within the database is organized in a table layout.

Rule 2: The Guaranteed Access Rule

Each data element is guaranteed to be accessible logically with a combination of the table name,
primary key (row value), and attribute name (column value).
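In SQL, this rule corresponds directly to naming a table, a primary-key value, and a column. A sketch using the Student example from earlier in these notes:

SELECT Name           -- attribute name (column value)
FROM Student          -- table name
WHERE Roll_No = 42;   -- primary key (row value)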

Rule 3: Systematic Treatment of NULL Values

Every Null value in a database must be given a systematic and uniform treatment.

Rule 4: Active Online Catalog Rule

The database catalog, which contains metadata about the database, must be stored and accessed
using the same relational database management system.

Rule 5: The Comprehensive Data Sublanguage Rule

A crucial component of any efficient database system is its ability to offer an easily understandable
data manipulation language (DML) that facilitates defining, querying, and modifying information
within the database.

Rule 6: The View Updating Rule

All views that are theoretically updatable must also be updatable by the system.

Rule 7: High-level Insert, Update, and Delete

A successful database system must possess the feature of facilitating high-level insertions, updates,
and deletions that can grant users the ability to conduct these operations with ease through a single
query.

Rule 8: Physical Data Independence

Application programs and activities should remain unaffected when changes are made to the
physical storage structures or methods.

Rule 9: Logical Data Independence

Application programs and activities should remain unaffected when changes are made to the logical
structure of the data, such as adding or modifying tables.

Rule 10: Integrity Independence


Integrity constraints should be specified separately from application programs and stored in the
catalog. They should be automatically enforced by the database system.

Rule 11: Distribution Independence

The distribution of data across multiple locations should be invisible to users, and the database
system should handle the distribution transparently.

Rule 12: Non-Subversion Rule

If the system provides a low-level (record-at-a-time) interface, that interface must not be able to subvert the system by bypassing security and integrity constraints.

ACID Properties in DBMS


In the world of DBMS, transactions are fundamental operations that allow us to modify and retrieve
data. However, to ensure the integrity of a database, it is important that these transactions are
executed in a way that maintains consistency, correctness, and reliability. This is where the ACID
properties come into play.

ACID stands for Atomicity, Consistency, Isolation, and Durability. These four key properties define
how a transaction should be processed in a reliable and predictable manner, ensuring that the
database remains consistent, even in cases of failures or concurrent accesses.

The Four ACID Properties

1. Atomicity

Atomicity means a transaction is all-or-nothing: either all of its operations succeed, or none are applied. If any part fails, the entire transaction is rolled back to keep the database consistent.

• Commit: If the transaction is successful, the changes are permanently applied.

• Abort/Rollback: If the transaction fails, any changes made during the transaction are
discarded.

Example: Consider a transaction T consisting of two operations, T1 and T2: transfer of $100 from account X to account Y. If the transaction fails after T1 (the debit from X) completes but before T2 (the credit to Y) completes, the database would be left in an inconsistent state. With atomicity, if any part of the transaction fails, the entire process is rolled back to its original state, and no partial changes are made.
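A minimal SQL sketch of this transfer, assuming an accounts table with account_id and balance columns (the same names used in the transaction examples later in these notes):

BEGIN;                                                               -- start transaction T
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'X'; -- T1: debit X
UPDATE accounts SET balance = balance + 100 WHERE account_id = 'Y'; -- T2: credit Y
COMMIT;                                                              -- both changes take effect together
-- If either UPDATE fails, issue ROLLBACK instead and neither change is applied.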

2. Consistency

Consistency in transactions means that the database must remain in a valid state before and after a
transaction.

• A valid state follows all defined rules, constraints, and relationships (like primary keys,
foreign keys, etc.).

• If a transaction violates any of these rules, it is rolled back to prevent corrupt or invalid data.

• If a transaction deducts money from one account but doesn't add it to another (in a
transfer), it violates consistency.

Example: Suppose the sum of all balances in a bank system should always be constant. Before a
transfer, the total balance is $700. After the transaction, the total balance should remain $700. If the
transaction fails in the middle (like updating one account but not the other), the system should
maintain its consistency by rolling back the transaction.

Total before T occurs = 500 + 200 = 700.

Total after T occurs = 400 + 300 = 700.

3. Isolation

Isolation ensures that transactions run independently without affecting each other. Changes made by
one transaction are not visible to others until they are committed.
It ensures that the result of concurrent transactions is the same as if they were run one after
another, preventing issues like:

• Dirty reads: reading uncommitted data

• Non-repeatable reads: data changes between two reads

• Phantom reads: new rows appear during a transaction

Example: Consider two transactions T and T''.

• X = 500, Y = 500

Explanation:

1. Transaction T:

• T wants to transfer $50 from X to Y.

• T reads X (value: 500), deducts $50 (new X = 450), then reads Y (value: 500) and adds $50 (new Y = 550).

2. Transaction T'':

• T'' starts and reads X (500) and Y (500), calculating the sum 500 + 500 = 1000.

• Meanwhile, T changes X and Y to 450 and 550; after T commits, the sum is still 450 + 550 = 1000.

• If T'' were allowed to read X before T's update (500) but Y after it (550), it would compute 1050, an inconsistent result.

• Isolation ensures that T'' does not read intermediate values while another transaction (T) is still in progress.

• Transactions should be independent, and T'' should access the final values only after T commits.

• This avoids inconsistent results, like the incorrect sum in the interleaved case.
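Most SQL databases let you choose how strictly isolation is enforced. A sketch of running T'' at the strictest level, which prevents dirty, non-repeatable, and phantom reads (MySQL-style placement of the SET statement; in PostgreSQL it goes just after BEGIN):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;  -- applies to the next transaction
BEGIN;
SELECT balance FROM accounts WHERE account_id IN ('X', 'Y');
-- The sum computed here is as if T and T'' had run one after the other.
COMMIT;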

4. Durability

Durability ensures that once a transaction is committed, its changes are permanently saved, even if
the system fails. The data is stored in non-volatile memory, so the database can recover to its last
committed state without losing data.
Example: After successfully transferring money from Account A to Account B, the changes are stored
on disk. Even if there is a crash immediately after the commit, the transfer details will still be intact
when the system recovers, ensuring durability.

How ACID Properties Impact DBMS Design and Operation

The ACID properties, in totality, provide a mechanism to ensure the correctness and consistency of a
database in a way such that each transaction is a group of operations that acts as a single unit,
produces consistent results, acts in isolation from other operations, and updates that it makes are
durably stored.

1. Data Integrity and Consistency

ACID properties safeguard the data integrity of a DBMS by ensuring that transactions either complete
successfully or leave no trace if interrupted. They prevent partial updates from corrupting the data
and ensure that the database transitions only between valid states.

2. Concurrency Control

ACID properties provide a solid framework for managing concurrent transactions. Isolation ensures
that transactions do not interfere with each other, preventing data anomalies such as lost updates,
temporary inconsistency, and uncommitted data.

3. Recovery and Fault Tolerance

Durability ensures that even if a system crashes, the database can recover to a consistent state.
Thanks to the Atomicity and Durability properties, if a transaction fails midway, the database remains
in a consistent state.

Property      Component responsible for maintaining it

Atomicity     Transaction Manager

Consistency   Application programmer

Isolation     Concurrency Control Manager

Durability    Recovery Manager

Advantages of ACID Properties in DBMS

• Data Consistency: ACID properties ensure that the data remains consistent and accurate
after any transaction execution.

• Data Integrity: It maintains the integrity of the data by ensuring that any changes to the
database are permanent and cannot be lost.
• Concurrency Control: ACID properties help to manage multiple transactions occurring
concurrently by preventing interference between them.

• Recovery: ACID properties ensure that in case of any failure or crash, the system can recover
the data up to the point of failure or crash.

Disadvantages of ACID Properties in DBMS

• Performance Overhead: ACID properties can introduce performance costs, especially when enforcing isolation between transactions or ensuring atomicity.

• Complexity: Maintaining ACID properties in distributed systems (like microservices or cloud environments) can be complex and may require sophisticated solutions like distributed locking or transaction coordination.

• Scalability Issues: ACID properties can pose scalability challenges, particularly in systems with high transaction volumes, where traditional relational databases may struggle under load.

Critical Use Cases for ACID in Databases

In modern applications, ensuring the reliability and consistency of data is crucial. ACID properties are
fundamental in sectors like:

• Banking: Transactions involving money transfers, deposits, or withdrawals must maintain strict consistency and durability to prevent errors and fraud.

• E-commerce: Ensuring that inventory counts, orders, and customer details are handled
correctly and consistently, even during high traffic, requires ACID compliance.

• Healthcare: Patient records, test results, and prescriptions must adhere to strict consistency,
integrity, and security standards.

Concurrency Control in DBMS


In a database management system (DBMS), allowing transactions to run concurrently has significant
advantages, such as better system resource utilization and higher throughput. However, it is crucial
that these transactions do not conflict with each other. The ultimate goal is to ensure that the
database remains consistent and accurate. For instance, if two users try to book the last available
seat on a flight at the same time, the system must ensure that only one booking
succeeds. Concurrency control is a critical mechanism in DBMS that ensures the consistency and integrity of data when multiple operations are performed at the same time.

Concurrent Execution and Related Challenges in DBMS

In a multi-user system, several users can access and work on the same database at the same time.
This is known as concurrent execution, where the database is used simultaneously by different users
for various operations. For instance, one user might be updating data while another is retrieving it.
When multiple transactions are performed on the database simultaneously, their operations are executed in an interleaved manner, and it is important that the actions of one user do not interfere with or affect the actions of another. This helps in maintaining the consistency of the database. However, managing such simultaneous operations can be challenging, and certain problems may arise if they are not handled properly. These challenges need to be addressed to ensure smooth and error-free concurrent execution.

Concurrent Execution can lead to various challenges:

• Dirty Reads: One transaction reads uncommitted data from another transaction, leading to
potential inconsistencies if the changes are later rolled back.

• Lost Updates: When two or more transactions update the same data simultaneously, one
update may overwrite the other, causing data loss.

• Inconsistent Reads: A transaction may read the same data multiple times during its
execution, and the data might change between reads due to another transaction, leading to
inconsistency.

Concurrency Control Protocols


Concurrency control protocols are the sets of rules maintained in order to solve the concurrency control problems in the database. They ensure that concurrent transactions can execute properly while maintaining database consistency; atomicity, consistency, isolation, durability, and serializability of concurrent execution are provided via these protocols.

Lock Based Concurrency Control Protocol in DBMS


A lock in DBMS controls concurrent access, allowing only one transaction to use a data item at a
time. This ensures data integrity and prevents issues like lost updates or dirty reads during
simultaneous transactions.
Lock Based Protocols in DBMS ensure that a transaction cannot read or write data until it gets the
necessary lock. Here's how they work:

• These protocols prevent concurrency issues by allowing only one transaction to access a
specific data item at a time.

• Locks help multiple transactions work together smoothly by managing access to the
database items.

• Locking is a common method used to maintain the serializability of transactions.

• A transaction must acquire a read lock or write lock on a data item before performing any
read or write operations on it.

Types of Lock

1. Shared Lock (S): Also known as a read-only lock. As the name suggests, it can be shared between transactions because a transaction holding this lock cannot update the data item. An S-lock is requested using the lock-S instruction.

2. Exclusive Lock (X): Allows the data item to be both read and written. It is exclusive: no other lock can be held simultaneously on the same data item. An X-lock is requested using the lock-X instruction.

Lock Compatibility Matrix

• A transaction can acquire a lock on a data item only if the requested lock is compatible with
existing locks held by other transactions.

• Shared Locks (S): Multiple transactions can hold shared locks on the same data item
simultaneously.

• Exclusive Lock (X): If a transaction holds an exclusive lock on a data item, no other
transaction can hold any type of lock on that item.

• If a requested lock is not compatible, the requesting transaction must wait until all
incompatible locks are released by other transactions.

• Once the incompatible locks are released, the requested lock is granted.
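Shared and exclusive row locks can be requested explicitly in SQL. A sketch in PostgreSQL/MySQL 8 syntax, against the assumed accounts table:

-- Transaction 1: shared (read) lock on X; other transactions may also read X.
BEGIN;
SELECT balance FROM accounts WHERE account_id = 'X' FOR SHARE;   -- lock-S(X)
COMMIT;  -- releases the shared lock

-- Transaction 2: exclusive (write) lock on X; waits until no other locks on X remain.
BEGIN;
SELECT balance FROM accounts WHERE account_id = 'X' FOR UPDATE;  -- lock-X(X)
UPDATE accounts SET balance = balance - 50 WHERE account_id = 'X';
COMMIT;  -- releases the exclusive lock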

Types of Lock-Based Protocols

1. Simplistic Lock Protocol

It is the simplest method for locking data during a transaction: a transaction obtains a lock on a data item before inserting, deleting, or updating it, and releases the lock once the transaction is completed.

Example: Consider a database with a single data item X = 10, and two transactions:

• T1: Wants to read and update X.

• T2: Wants to read X.

Under this protocol, T1 locks X, reads and updates it, and unlocks X when it completes; only then can T2 acquire its own lock and read the updated value.

2. Pre-Claiming Lock Protocol

The Pre-Claiming Lock Protocol avoids deadlocks by requiring a transaction to request all needed
locks before it starts. It runs only if all locks are granted; otherwise, it waits or rolls back.

Example: Consider two transactions T1 and T2 and two data items, X and Y.

Transaction T1 declares that it needs:

• A write lock on X.

• A read lock on Y.

T1 begins execution only if both locks are granted at once; otherwise it waits (or rolls back) without holding any lock, which is why deadlock is avoided.

3. Two-phase locking (2PL)

A transaction is said to follow the Two-Phase Locking protocol if locking and unlocking are done in two phases:

• Growing Phase: New locks on data items may be acquired but none can be released.

• Shrinking Phase: Existing locks may be released but no new locks can be acquired.

4. Strict Two-Phase Locking Protocol

Strict Two-Phase Locking requires that, in addition to 2PL, all exclusive (X) locks held by the transaction not be released until after the transaction commits.
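Typical SQL engines behave much like strict 2PL out of the box: locks are acquired as statements execute (growing phase) and exclusive locks are released only at commit (shrinking phase). A sketch against the assumed accounts table:

BEGIN;
SELECT balance FROM accounts WHERE account_id = 'A' FOR UPDATE;  -- growing phase: lock-X(A)
SELECT balance FROM accounts WHERE account_id = 'B' FOR UPDATE;  -- growing phase: lock-X(B)
UPDATE accounts SET balance = balance - 50 WHERE account_id = 'A';
UPDATE accounts SET balance = balance + 50 WHERE account_id = 'B';
COMMIT;  -- shrinking phase: all locks released together, only after the commit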

Timestamp based Concurrency Control


Timestamp-based concurrency control is a technique used in database management systems (DBMS)
to ensure serializability of transactions without using locks. It uses timestamps to determine the
order of transaction execution and ensures that conflicting operations follow a consistent order.

Each transaction T is assigned a unique timestamp TS(T) when it enters the system. This timestamp
determines the transaction’s place in the execution order.

Timestamp Ordering Protocol

The Timestamp Ordering Protocol enforces that older transactions (with smaller timestamps) are given higher priority. This prevents conflicts and ensures that the execution is serializable and deadlock-free.

For example:

• If Transaction T1 enters the system first, it gets a timestamp TS(T1) = 007 (assumption).

• If Transaction T2 enters after T1, it gets a timestamp TS(T2) = 009 (assumption).

This means T1 is "older" than T2 and T1 should execute before T2 to maintain consistency.

How Timestamp Ordering Works


Each data item X in the database keeps two timestamps:

• W_TS(X): Timestamp of the last transaction that wrote to X

• R_TS(X): Timestamp of the last transaction that read from X

1. Basic Timestamp Ordering


The Basic TO Protocol works by comparing the timestamp of the current transaction with the
timestamps on the data items it wants to read/write:


• Suppose an old transaction Ti has timestamp TS(Ti); a new transaction Tj is assigned a timestamp TS(Tj) such that TS(Ti) < TS(Tj).

• The protocol manages concurrent execution such that the timestamps determine the serializability order.

• The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.

• Whenever a transaction T tries to issue an R_item(X) or a W_item(X), the Basic TO algorithm compares the timestamp of T with R_TS(X) and W_TS(X) to ensure the timestamp order is not violated. Concretely: R_item(X) is rejected (and T rolled back) if TS(T) < W_TS(X), otherwise the read executes and R_TS(X) becomes max(R_TS(X), TS(T)); W_item(X) is rejected if TS(T) < R_TS(X) or TS(T) < W_TS(X), otherwise the write executes and W_TS(X) becomes TS(T).
2. Strict Timestamp Ordering Protocol
The Strict Timestamp Ordering Protocol is an enhanced version that avoids cascading rollbacks by
delaying operations until it's safe to execute them.

Rules for Read Operation R_item(X):


T can read X only if:
• W_TS(X) ≤ TS(T) and

• The transaction that last wrote X has committed

Rules for Write Operation W_item(X):


T can write X only if:
• R_TS(X) ≤ TS(T) and W_TS(X) ≤ TS(T) and

• All previous readers/writers of X have committed

If these conditions aren't met, the operation is delayed (not aborted immediately).

What is Normalization in DBMS?


Normalization is a systematic approach to organize data within a database to reduce redundancy and
eliminate undesirable characteristics such as insertion, update, and deletion anomalies. The process
involves breaking down large tables into smaller, well-structured ones and defining relationships
between them. This not only reduces the chances of storing duplicate data but also improves the
overall efficiency of the database.
Why is Normalization Important?

• Reduces Data Redundancy: Duplicate data is minimized, saving disk space and reducing inconsistency.

• Improves Data Integrity: Ensures the accuracy and consistency of data by organizing it in a
structured manner.

• Simplifies Database Design: By following a clear structure, database designs become easier
to maintain and update.

• Optimizes Performance: Reduces the chance of anomalies and increases the efficiency of
database operations.

What are Normal Forms in DBMS?

Normalization is a technique used in database design to reduce redundancy and improve data
integrity by organizing data into tables and ensuring proper relationships. Normal Forms are different
stages of normalization, and each stage imposes certain rules to improve the structure and
performance of a database. Let's break down the various normal forms step-by-step to understand
the conditions that need to be satisfied at each level:

1. First Normal Form (1NF): Eliminating Duplicate Records

A table is in 1NF if it satisfies the following conditions:

• All columns contain atomic values (i.e., indivisible values).

• Each row is unique (i.e., no duplicate rows).

• Each column has a unique name.

• The order in which data is stored does not matter.

Example of 1NF Violation: If a table has a column "Phone Numbers" that stores multiple phone
numbers in a single cell, it violates 1NF. To bring it into 1NF, you need to separate phone numbers
into individual rows.
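A sketch of that fix in SQL (table and column names are illustrative):

-- Violates 1NF: one cell holds '555-1234, 555-5678' for a student.
-- In 1NF, each phone number becomes its own atomic row:
CREATE TABLE Student_Phone_Numbers (
    StudentID INT,
    Phone_No  VARCHAR(15),
    PRIMARY KEY (StudentID, Phone_No)  -- atomic values, no duplicate rows
);

INSERT INTO Student_Phone_Numbers VALUES (1, '555-1234'), (1, '555-5678');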

2. Second Normal Form (2NF): Eliminating Partial Dependency


A relation is in 2NF if it satisfies the conditions of 1NF and, additionally, no partial dependency exists, meaning every non-prime attribute (non-key attribute) must depend on the entire primary key, not just a part of it.

Example: For a composite key (StudentID, CourseID), if the StudentName depends only
on StudentID and not on the entire key, it violates 2NF. To normalize, move StudentName into a
separate table where it depends only on StudentID.
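A sketch of that decomposition (names are illustrative; the original wide table would have been Enrollment(StudentID, CourseID, StudentName)):

CREATE TABLE Student (
    StudentID   INT PRIMARY KEY,
    StudentName VARCHAR(100)           -- now depends on the whole (single-column) key
);

CREATE TABLE Enrollment (
    StudentID INT REFERENCES Student(StudentID),
    CourseID  INT,
    PRIMARY KEY (StudentID, CourseID)  -- remaining attributes must depend on the full key
);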

3. Third Normal Form (3NF): Eliminating Transitive Dependency

A relation is in 3NF if it satisfies 2NF and additionally, there are no transitive dependencies. In simpler
terms, non-prime attributes should not depend on other non-prime attributes.

Example: Consider a table with (StudentID, CourseID, Instructor). If Instructor depends on CourseID,
and CourseID depends on StudentID, then Instructor indirectly depends on StudentID, which
violates 3NF. To resolve this, place Instructor in a separate table linked by CourseID.
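The corresponding decomposition might look like this (names are illustrative):

-- Instructor now depends only on the key CourseID, breaking the
-- transitive chain StudentID -> CourseID -> Instructor.
CREATE TABLE Course_Instructor (
    CourseID   INT PRIMARY KEY,
    Instructor VARCHAR(100)
);

CREATE TABLE Student_Course (
    StudentID INT PRIMARY KEY,
    CourseID  INT REFERENCES Course_Instructor(CourseID)
);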

4. Boyce-Codd Normal Form (BCNF): The Strongest Form of 3NF

BCNF is a stricter version of 3NF where for every non-trivial functional dependency (X → Y), X must
be a superkey (a unique identifier for a record in the table).

Example: If a table has a dependency (StudentID, CourseID) → Instructor, but neither StudentID nor
CourseID is a superkey, then it violates BCNF. To bring it into BCNF, decompose the table so that each
determinant is a candidate key.

5. Fourth Normal Form (4NF): Removing Multi-Valued Dependencies

A table is in 4NF if it is in BCNF and has no multi-valued dependencies. A multi-valued dependency occurs when one attribute determines another, and both attributes are independent of all other attributes in the table.

Example: Consider a table where (StudentID, Language, Hobby) are attributes. If a student can have
multiple hobbies and languages, a multi-valued dependency exists. To resolve this, split the table
into separate tables for Languages and Hobbies.
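A sketch of that split (names are illustrative):

-- StudentID ->> Language and StudentID ->> Hobby are recorded independently,
-- so no spurious (Language, Hobby) pairings are stored.
CREATE TABLE Student_Language (
    StudentID INT,
    Language  VARCHAR(50),
    PRIMARY KEY (StudentID, Language)
);

CREATE TABLE Student_Hobby (
    StudentID INT,
    Hobby     VARCHAR(50),
    PRIMARY KEY (StudentID, Hobby)
);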

6. Fifth Normal Form (5NF): Eliminating Join Dependency

5NF is achieved when a table is in 4NF and all join dependencies are removed. This form ensures that
every table is fully decomposed into smaller tables that are logically connected without losing
information.

Example: If a table contains (StudentID, Course, Instructor) and there is a dependency where all
combinations of these columns are needed for a specific relationship, you would split them into
smaller tables to remove redundancy.

Advantages of Normal Form

1. Reduced data redundancy: Normalization helps to eliminate duplicate data in tables, reducing the
amount of storage space needed and improving database efficiency.

2. Improved data consistency: Normalization ensures that data is stored in a consistent and
organized manner, reducing the risk of data inconsistencies and errors.
3. Simplified database design: Normalization provides guidelines for organizing tables and data
relationships, making it easier to design and maintain a database.

4. Improved query performance: Normalized tables are typically easier to search and retrieve data
from, resulting in faster query performance.

5. Easier database maintenance: Normalization reduces the complexity of a database by breaking it down into smaller, more manageable tables, making it easier to add, modify, and delete data.

Common Challenges of Over-Normalization

While normalization is a powerful tool for optimizing databases, it's important not to over-normalize your data. Excessive normalization can lead to:

• Complex Queries: Too many tables may result in multiple joins, making queries slow and
difficult to manage.

• Performance Overhead: Additional processing required for joins in overly normalized databases may hurt performance, especially in large-scale systems.

In many cases, denormalization (combining tables to reduce the need for complex joins) is used for
performance optimization in specific applications, such as reporting systems.

When to Use Normalization and Denormalization

• Normalization is best suited for transactional systems where data integrity is paramount,
such as banking systems and enterprise applications.

• Denormalization is ideal for read-heavy applications like data warehousing and reporting
systems where performance and query speed are more critical than data integrity.

Applications of Normal Forms in DBMS

• Ensures Data Consistency: Prevents data anomalies by ensuring each piece of data is stored in one place, reducing inconsistencies.

• Reduces Data Redundancy: Minimizes repetitive data, saving storage space and avoiding
errors in data updates or deletions.

• Improves Query Performance: Simplifies queries by breaking large tables into smaller, more
manageable ones, leading to faster data retrieval.

• Enhances Data Integrity: Ensures that data is accurate and reliable by adhering to defined
relationships and constraints between tables.

• Easier Database Maintenance: Simplifies updates, deletions, and modifications by ensuring that changes only need to be made in one place, reducing the risk of errors.

• Facilitates Scalability: Makes it easier to modify, expand, or scale the database structure as
business requirements grow.

• Supports Better Data Modeling: Helps in designing databases that are logically structured,
with clear relationships between tables, making it easier to understand and manage.

• Reduces Update Anomalies: Prevents issues like insertion, deletion, or modification anomalies that can arise from redundant data.
• Improves Data Integrity and Security: By reducing unnecessary data duplication, normal
forms help ensure sensitive information is securely and correctly maintained.

• Optimizes Storage Efficiency: By organizing data into smaller tables, storage is used more efficiently, reducing the overhead for large databases.

DBMS Architecture: 1-Tier, 2-Tier, 3-Tier


A DBMS architecture defines how users interact with the database to read, write, or update
information. A well-designed architecture and schema (a blueprint detailing tables, fields and
relationships) ensure data consistency, improve performance and keep data secure.

Types of DBMS Architecture

There are several types of DBMS Architecture that we use according to the usage requirements.

• 1-Tier Architecture

• 2-Tier Architecture

• 3-Tier Architecture

1-Tier Architecture

In 1-Tier Architecture, the user works directly with the database on the same system. This means the
client, server and database are all in one application. The user can open the application, interact with
the data and perform tasks without needing a separate server or network connection.

DBMS 1-Tier Architecture

• A common example is Microsoft Excel. Everything from the user interface to the logic and
data storage happens on the same device. The user enters data, performs calculations and
saves files directly on their computer.

• This setup is simple and easy to use, making it ideal for personal or standalone applications. It does not require a network or complex setup, which is why it's often used in small-scale or individual use cases.

Advantages of 1-Tier Architecture

Below are the advantages of 1-Tier Architecture.

• Simple Architecture: 1-Tier Architecture is the simplest to set up, as only a single machine is required to maintain it.

• Cost-Effective: No additional hardware is required for implementing 1-Tier Architecture, which makes it cost-effective.

• Easy to Implement: 1-Tier Architecture can be easily deployed, and hence it is mostly used in small projects.

Disadvantages of 1-Tier Architecture

• Limited to Single User: Only one person can use the application at a time. It’s not designed
for multiple users or teamwork.

• Poor Security: Since everything is on the same machine, if someone gets access to the
system, they can access both the data and the application easily.

• No Centralized Control: Data is stored locally, so there's no central database. This makes it
hard to manage or back up data across multiple devices.

• Hard to Share Data: Sharing data between users is difficult because everything is stored on
one computer.

2-Tier Architecture

The 2-tier architecture is similar to a basic client-server model. The application at the client end
directly communicates with the database on the server side. APIs like ODBC and JDBC are used for
this interaction. The server side is responsible for providing query processing and transaction
management functionalities.

DBMS 2-Tier Architecture

• On the client side, the user interfaces and application programs are run. The application on the client side establishes a connection with the server side to communicate with the DBMS.

For example, a Library Management System used in schools or small organizations is a classic example of two-tier architecture:

• Client Layer (Tier 1): This is the user interface that library staff or users interact with. For
example they might use a desktop application to search for books, issue them, or check due
dates.
• Database Layer (Tier 2): The database server stores all the library records such as book
details, user information and transaction logs.

• The client layer sends a request (like searching for a book) to the database layer which
processes it and sends back the result. This separation allows the client to focus on the user
interface, while the server handles data storage and retrieval.

Advantages of 2-Tier Architecture

• Easy to Access: 2-Tier Architecture provides easy access to the database, which enables fast retrieval.

• Scalable: We can scale the database easily, by adding clients or upgrading hardware.

• Low Cost: 2-Tier Architecture is cheaper than 3-Tier Architecture and Multi-Tier Architecture.

• Easy Deployment: 2-Tier Architecture is easier to deploy than 3-Tier Architecture.

• Simple: 2-Tier Architecture is easily understandable as well as simple because of only two
components.

Disadvantages of 2-Tier Architecture

• Limited Scalability: As the number of users increases, the system performance can slow
down because the server gets overloaded with too many requests.

• Security Issues: Clients connect directly to the database, which can make the system more
vulnerable to attacks or data leaks.

• Tight Coupling: The client and the server are closely linked. If the database changes, the
client application often needs to be updated too.

• Difficult Maintenance: Managing updates, fixing bugs, or adding features becomes harder
when the number of users or systems increases.

3-Tier Architecture

In 3-Tier Architecture, there is another layer between the client and the server. The client does not
directly communicate with the server. Instead, it interacts with an application server which further
communicates with the database system and then the query processing and transaction
management takes place. This intermediate layer acts as a medium for the exchange of partially
processed data between the server and the client. This type of architecture is used in the case of
large web applications.
DBMS 3-Tier Architecture

Example: E-commerce Store

• User: You visit an online store, search for a product and add it to your cart.

• Processing: The system checks if the product is in stock, calculates the total price and applies
any discounts.

• Database: The product details, your cart and order history are stored in the database for
future reference.

Advantages of 3-Tier Architecture

• Enhanced Scalability: Scalability is enhanced due to the distributed deployment of application servers; individual connections need not be made between the client and server.

• Data Integrity: 3-Tier Architecture maintains Data Integrity. Since there is a middle layer
between the client and the server, data corruption can be avoided/removed.

• Security: 3-Tier Architecture Improves Security. This type of model prevents direct
interaction of the client with the server thereby reducing access to unauthorized data.

Disadvantages of 3-Tier Architecture

• More Complex: 3-Tier Architecture is more complex in comparison to 2-Tier Architecture, and communication points are also doubled.

• Difficult to Interact: It becomes difficult for this sort of interaction to take place due to the
presence of middle layers.

• Slower Response Time: Since the request passes through an extra layer (application server),
it may take more time to get a response compared to 2-Tier systems.

• Higher Cost: Setting up and maintaining three separate layers (client, server and database)
requires more hardware, software and skilled people. This makes it more expensive.
Transaction in DBMS
A transaction refers to a sequence of one or more operations (such as read, write, update, or delete)
performed on the database as a single logical unit of work.

• A transaction ensures that either all the operations are successfully executed (committed) or
none of them take effect (rolled back).

• Transactions are designed to maintain the integrity, consistency and reliability of the
database, even in the case of system failures or concurrent access.

All database access operations between the begin-transaction and end-transaction statements are considered a single logical transaction. During the transaction, the database may be temporarily inconsistent; only once the transaction is committed does it move from one consistent state to another.

Example: Let’s consider an online banking application:

Transaction: When a user performs a money transfer, several operations occur, such as:

• Reading the account balance of the sender.

• Writing the reduced balance to the sender's account.

• Writing the added amount to the recipient’s account.

In a transaction, all these steps should either complete successfully or, if any error occurs, the
database should rollback to its previous state, ensuring no partial data is written to the system.
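As a sketch, the whole transfer becomes one SQL transaction (the accounts, account_id and balance names follow the snippets below; the account ids are made up):

BEGIN;
SELECT balance FROM accounts WHERE account_id = 'S01';                -- read sender's balance
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'S01'; -- debit the sender
UPDATE accounts SET balance = balance + 100 WHERE account_id = 'R02'; -- credit the recipient
COMMIT;  -- all steps take effect together
-- On any error (e.g. insufficient funds), issue ROLLBACK to undo every step.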

Facts about Database Transactions

• A transaction is a program unit whose execution may or may not change the contents of a
database.

• The transaction is executed as a single unit.

• If the database operations do not update the database but only retrieve data, this type of
transaction is called a read-only transaction.
• A successful transaction can change the database from one CONSISTENT STATE to another.

• DBMS transactions must be atomic, consistent, isolated and durable.

• If the database were in an inconsistent state before a transaction, it would remain in the
inconsistent state after the transaction.

Operations of Transaction

A user can make different types of requests to access and modify the contents of a database. So, we
have different types of operations relating to a transaction. They are discussed as follows:

1) Read(X)

A read operation reads the value of a particular database element X and stores it in a temporary buffer in main memory for further actions, such as displaying that value.

Example: For a banking system, when a user checks their balance, a Read operation is performed on
their account balance:

SELECT balance FROM accounts WHERE account_id = 'A123';

This reads the current balance of the user's account.

2) Write(X)

A write operation stores updated data from main memory back to the database. It usually follows a
read, where data is fetched, modified (e.g., arithmetic changes), and then written back to save the
updated value.

Example: For the banking system, if a user withdraws money, a Write operation is performed after
the balance is updated:

UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A123';

This updates the balance of the user’s account after withdrawal.

3) Commit

The commit operation is used to maintain integrity in the database. Due to some failure of power, hardware, or software, a transaction might get interrupted before all of its operations are completed, which could leave the database inconsistent; committing marks the point at which all of a transaction's changes become permanent.

Example: After a successful money transfer in a banking system, a Commit operation finalizes the
transaction:

COMMIT;

Once the transaction is committed, the changes to the database are permanent, and the transaction
is considered successful.

4) Rollback

A rollback undoes all changes made by a transaction if an error occurs, restoring the database to its
last consistent state. It helps prevent data inconsistency and ensures safety.
Example: Suppose during the money transfer process, the system encounters an issue, like
insufficient funds in the sender’s account. In that case, the transaction is rolled back:

ROLLBACK;

This will undo all the operations performed so far and ensure that the database remains consistent.

ACID Properties of Transaction

Transactions in DBMS must ensure data is accurate and reliable. They follow four key ACID
properties:

1. Atomicity: A transaction is all or nothing. If any part fails, the entire transaction is rolled
back. Example: While transferring money, both debit and credit must succeed. If one fails,
nothing should change.

2. Consistency: A transaction must keep the database in a valid state, moving it from one
consistent state to another. Example: If balance is ₹1000 and ₹200 is withdrawn, the new
balance should be ₹800.

3. Isolation: Transactions run independently. One transaction’s operations should not affect
another’s intermediate steps. Example: Two users withdrawing from the same account must
not interfere with each other’s balance updates.

4. Durability: Once a transaction is committed, its changes stay even if the system crashes.
Example: After a successful transfer, the updated balance remains safe despite a power
failure.

A transaction can include the following basic database access operations:

• Read/Access data (R): Reading the database item from disk (where the database stores data) into a main-memory variable.

• Write/Change data (W): Writing the data item from the memory variable back to disk.

• Commit: A transaction control statement used to permanently save the changes made in a transaction.

Example: Transfer of 50₹ from Account A to Account B. Initially A= 500₹, B= 800₹. This data is
brought to RAM from Hard Disk.

R(A) -- 500 // A is accessed from RAM.
A = A - 50 // Deducting 50₹ from A.
W(A) -- 450 // Updated in RAM.
R(B) -- 800 // B is accessed from RAM.
B = B + 50 // 50₹ is added to B's account.
W(B) -- 850 // Updated in RAM.
commit // The data in RAM is written back to Hard Disk.

Note: The updated value of Account A = 450₹ and Account B = 850₹.

All instructions before committing come under a partially committed state and are stored in RAM.
When the commit is read the data is fully accepted and is stored on a Hard Disk.
If the transaction fails anywhere before committing, we have to go back and start from the beginning; we can't continue from the same state. This is known as rollback.

Desirable Properties of Transaction (ACID Properties)

Transaction management in a Database Management System (DBMS) ensures that database transactions are executed reliably and follow ACID properties: Atomicity, Consistency, Isolation, and Durability. These principles help maintain data integrity, even during failures or concurrent user interactions, ensuring that all transactions are either fully completed or rolled back if errors occur.

For a transaction to be performed in DBMS, it must possess several properties often called ACID
properties.

• A - Atomicity

• C - Consistency

• I - Isolation

• D - Durability

Transaction States

Transactions can be implemented using SQL queries and servers, and a transaction moves through a series of states between its start and its completion.

Properties of a Transaction

The transaction has four properties, used to maintain consistency in a database before and after the transaction:

• Atomicity

• Consistency

• Isolation

• Durability

Atomicity

• States that all operations of the transaction take place as a single unit; if not, the transaction is aborted.

• There is no midway, i.e., the transaction cannot occur partially. Each transaction is treated as one unit and either runs to completion or is not executed at all.

• Atomicity involves the following two operations:

• Abort: If a transaction stops or fails, none of the changes it made will be saved or visible.

• Commit: If a transaction completes successfully, all the changes it made will be saved and
visible.
Consistency

• The rules (integrity constraint) that keep the database accurate and consistent are followed
before and after a transaction.

• When a transaction is completed, it leaves the database either as it was before or in a new
stable state.

• This property means every transaction works with a reliable and consistent version of the
database.

• A transaction transforms the database from one consistent state to another consistent state.

Isolation

• Isolation means that the data being used during the execution of one transaction cannot be used by a second transaction until the first one is completed.

• If transaction T1 is being executed and is using data item X, then X can't be accessed by any other transaction T2 until T1 ends.

• The concurrency control subsystem of the DBMS enforces the isolation property.

Durability

• The durability property guarantees the permanence of the database's consistent state: once a transaction completes, the changes it made are permanent.

• They cannot be lost by the erroneous operation of a faulty transaction or by system failure. When a transaction is completed, the database reaches a consistent state, and that state cannot be lost, even in the event of a system failure.

• The recovery subsystem of the DBMS is responsible for the durability property.
