Advance Database Management(MSBTE) (Parallel & Distributed Database System) ….Page No.
(2-1)
Unit II
CHAPTER 2
Parallel and Distributed Database System
K-SCHEME SYLLABUS
2.1 Introduction to Parallel Systems: Parallel database system architecture, Measures of Performance: Throughput,
Response time, Scaleup and Speedup
2.2 Introduction to distributed database, Types of Distributed Database Systems, Benefits of distributed database
system, Advantages and Disadvantages of Distributed Database
2.3 Transaction Processing in Parallel and Distributed Database Systems.
(MSBTE-K-SCHEME-Sem 5-Comp/IT)-Academic Year 2025-2026) Tech-Neo Publications
2.1 INTRODUCTION TO PARALLEL SYSTEMS IN DBMS
1. What is a Parallel System?
Definition:
A parallel system in DBMS involves multiple processors and storage working together to process database operations simultaneously.
This design increases performance, efficiency, and scalability.
Why Use Parallel Systems?
Traditional single-processor databases can become slow when handling large amounts of data.
Parallel databases distribute the workload across multiple processors, increasing speed and efficiency.
Example: A banking system that processes thousands of transactions per second can use a parallel database system to handle multiple transactions simultaneously.
2.2 PARALLEL DATABASE SYSTEM ARCHITECTURE
1. What is a Parallel Database System?
A Parallel Database System is a database system that uses multiple processors and multiple disks to execute queries and transactions in parallel. It enhances performance, scalability, and fault tolerance by distributing tasks across multiple computing resources.
Why Use Parallel Database Systems?
Improves query execution speed by dividing tasks.
Handles large-scale data processing efficiently.
Supports high concurrent user transactions.
Reduces the risk of system failures through distributed processing.
Example: A banking system where thousands of transactions are processed at the same time using multiple processors instead of one.
2. Working of Parallel Database Systems
How Does Parallel Processing Work?
Parallel databases process queries using two types of parallelism:
1. Inter-Query Parallelism – Multiple queries run simultaneously.
Example : A banking system processing withdrawals and deposits at the same time.
2. Intra-Query Parallelism – A single query is divided into smaller tasks and executed in parallel.
Example : A SELECT query with a JOIN operation being split across multiple processors.
2.3 TYPES
There are four architectural models of parallel databases:
1. Shared memory
2. Shared disk
3. Shared nothing
4. Hierarchical
2.3.1 Shared Memory Architecture
Concept
All processors share a common memory pool and disk storage.
Each processor can directly access the same memory and database.
Communication between processors happens through shared memory.
In this parallel database architecture, a common memory is shared among multiple processors. The processors are connected to the main memory and disk through an interconnection network.
The connection used in this architecture is usually a high-speed interconnect, which makes data sharing easy.
A shared memory architecture has a large amount of cache memory at each processor.
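The shared memory idea described in this section, where processors communicate through a common memory, can be illustrated with a minimal sketch. It assumes Python threads standing in for processors and a hypothetical helper named parallel_total; Python threads do not give true CPU parallelism, so the sketch only shows the communication pattern, not a performance gain.

```python
import threading

def parallel_total(values, num_workers=4):
    """Sum values with several workers that write to one shared total."""
    shared = {"total": 0}      # memory visible to every worker
    lock = threading.Lock()    # serializes updates to the shared total

    def worker(chunk):
        partial = sum(chunk)   # local work, no coordination needed
        with lock:             # contention point: one writer at a time
            shared["total"] += partial

    # Split the input into one interleaved chunk per worker.
    chunks = [values[i::num_workers] for i in range(num_workers)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["total"]

total = parallel_total(list(range(1, 101)))
```

The lock is the point where workers contend for the shared resource, which is the same effect that limits shared memory systems as more processors are added.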
Fig. 2.3.1 : Shared Memory Architecture
Advantages
1. Fast Communication : Processors access shared memory quickly.
2. Simple Data Management : Since all processors access the same data, no partitioning is needed.
3. Efficient for Small Systems : Works well for databases with limited data and few users.
Disadvantages
1. Scalability Issues : Adding more processors increases memory contention.
2. Memory Bottleneck : Too many processors accessing shared memory can slow down performance.
3. Single Point of Failure : If memory or disk fails, the entire system goes down.
2.3.2 Shared Disk Architecture
Concept
Each processor has its own private memory, but all processors share a common disk storage.
Processors communicate using a high-speed interconnection network.
In the shared disk architecture of a parallel database system, a single disk is shared among all the processors.
Each of these processors also has its own private memory, which avoids the memory contention of the shared memory design.
Fig. 2.3.2 : Shared Disk Architecture
Advantages
1. Better Scalability than Shared Memory : Additional processors can be added with independent memory.
2. Improved Fault Tolerance : If one processor fails, others can still access the disk.
3. Efficient for Medium-Scale Applications : Useful when storage needs to be centralized.
Disadvantages
1. Disk Contention : Multiple processors accessing the same disk can create bottlenecks.
2. Expensive Interconnection : Requires high-speed network connections.
3. Data Synchronization Complexity : Maintaining consistency among processors can be difficult.
2.3.3 Shared Nothing Architecture
Concept
Each processor has its own private memory and disk storage.
There is no shared storage or memory – each processor manages its own data independently.
Data is partitioned across different nodes, and processors coordinate query execution.
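In a shared nothing design, data is partitioned across nodes and each node works only on its own partition. A minimal sketch of that idea follows, assuming hash partitioning on a numeric key; the function names, the accounts data, and the choice of four nodes are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def partition_rows(rows, key, num_nodes):
    """Assign each row to a node using the key modulo the node count."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[row[key] % num_nodes].append(row)
    return dict(nodes)

def parallel_sum(nodes, column):
    """Each node sums its own partition; a coordinator adds the partial sums."""
    partial = {node_id: sum(r[column] for r in part)
               for node_id, part in nodes.items()}
    return partial, sum(partial.values())

# Hypothetical data: eight accounts with balances 100, 200, ..., 800.
accounts = [{"account_id": i, "balance": 100 * i} for i in range(1, 9)]
nodes = partition_rows(accounts, "account_id", num_nodes=4)
partial, total = parallel_sum(nodes, "balance")
```

Only the small partial sums travel to the coordinator, which is why this design scales well but depends on the communication channel between nodes.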
Fig. 2.3.3 : Shared Nothing Architecture
The shared nothing architecture is a distributed computing architecture in which each node is independent. More specifically, none of the nodes share memory or disk storage.
In this architecture each processor has its own local memory and local disk.
An intercommunication channel is used by the processors to communicate.
Each processor can independently act as a server for the data on its local disk.
Advantages of shared nothing architecture
1. This architecture is scalable with respect to the number of processors. Adding processors is easy and flexible.
2. When nodes are added, transmission capacity increases.
3. It is well suited to read-only databases and decision support applications.
4. Failure is local: the failure of one node does not affect the other nodes.
Disadvantages of shared nothing architecture
1. The cost of communication is higher than in the shared memory and shared disk architectures.
2. Sending data involves software interaction.
3. More coordination is required.
2.3.4 Hierarchical Architecture
Concept
A combination of shared memory, shared disk, and shared nothing architectures.
Organized in multiple levels:
o At the top level, processors share memory.
o At the middle level, processors share disk storage.
o At the bottom level, processors may have their own private memory and disks.
The hierarchical architecture is also known as Non-Uniform Memory Architecture (NUMA). It combines the shared disk, shared memory, and shared nothing architectures.
Consider two groups of processors, A and B. All processors in both groups have local memory, but processors in one group can also access the memory associated with the other group in a coherent manner.
Fig. 2.3.4 : Hierarchical Architecture
Advantages of Hierarchical Architecture
1. More memory is available in this architecture.
2. The system is also more scalable.
Disadvantage of Hierarchical Architecture
1. This architecture is costly compared to the other architectures.
2.4 MEASURES OF PERFORMANCE IN PARALLEL DATABASE SYSTEMS
The performance of a Parallel Database System is evaluated using four key metrics:
1. Throughput
2. Response Time
3. Scaleup
4. Speedup
Each of these metrics helps in understanding how
well a parallel database system handles query
processing, transaction execution, and system
scalability.
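As a preview of how these four metrics are computed, the sketch below uses figures from the chapter's worked examples (500 transactions in 10 seconds, and 10 seconds on one processor versus 2.5 seconds on four). The function names are hypothetical, and the scaleup ratio is an assumption based on the common textbook definition: time for the small workload on the small system divided by time for the proportionally larger workload on the larger system.

```python
import math

def throughput(transactions, seconds):
    """Transactions (or queries) completed per second."""
    return transactions / seconds

def response_time(submitted_at, received_at):
    """Elapsed time from submission to result delivery."""
    return received_at - submitted_at

def speedup(time_one_processor, time_n_processors):
    """How many times faster the same task runs with more processors."""
    return time_one_processor / time_n_processors

def scaleup(time_small, time_large):
    """Assumed definition: 1.0 means the larger system keeps pace with the larger workload."""
    return time_small / time_large

def classify_speedup(value, processors):
    """Compare a measured speedup with the number of processors used."""
    if math.isclose(value, processors):
        return "linear"
    return "superlinear" if value > processors else "sublinear"

tps = throughput(500, 10)                    # 500 transactions in 10 seconds
s = speedup(10, 2.5)                         # 1 processor vs 4 processors
kind = classify_speedup(s, processors=4)
db_scaleup = scaleup(5, 5)                   # 5 s on 100 GB, 5 s on 1 TB with a larger system
```

With these inputs the throughput is 50 TPS, the speedup is 4 on four processors (linear), and the scaleup is 1.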
2.4.1 Throughput
Definition
Throughput refers to the number of transactions or queries processed per unit of time.
It is usually measured in transactions per second (TPS) or queries per second (QPS).
Formula
Throughput = Number of transactions completed / Total time taken
Example
A system processing 500 transactions in 10 seconds has a throughput of 50 TPS.
Importance
Higher throughput means better system efficiency.
Useful for comparing different parallel architectures.
Factors Affecting Throughput
Number of processors: More CPUs generally improve throughput.
Disk speed: Faster disks can handle more read/write operations.
Concurrency control: Efficient locking and scheduling improve throughput.
2.4.2 Response Time
Definition
Response time is the total time taken to process a query or transaction, from submission to result delivery.
It includes processing time, waiting time, and communication delays.
Formula
Response Time = Time when result is received − Time when query was submitted
Example
A query submitted at 10:00:00 AM with results received at 10:00:02 AM has a response time of 2 seconds.
Importance
Lower response time improves user experience.
Critical for real-time applications like online banking.
Factors Affecting Response Time
Query complexity: Simple queries execute faster than complex joins.
System load: More users can slow down response times.
Indexing and optimization: Well-indexed databases return results faster.
2.4.3 Scaleup
Definition
Scaleup measures how well a system maintains performance when the database size increases or when more users are added.
A system with good scaleup handles increased workloads efficiently.
Types of Scaleup
i) Transaction Scaleup
Measures if a system can handle more concurrent users without degrading performance.
Example : A system processing 1000 transactions per second should still process 1000 transactions per second even if the number of users doubles.
ii) Database Scaleup
Measures if a system can process the same query in the same time, even when the database size increases.
Example : If a query takes 5 seconds on 100 GB data, it should take the same 5 seconds on 1 TB data with a larger system.
Formula
Scaleup = Time for the small workload on the small system / Time for the proportionally larger workload on the larger system
A scaleup of 1 is ideal (linear scaleup).
Importance
Essential for big data applications.
Helps in designing future-proof database systems.
Factors Affecting Scaleup
Parallelism efficiency: More processors should contribute to performance improvements.
Disk I/O capacity: Faster storage access is necessary for handling large data.
2.4.4 Speedup
Definition
Speedup measures how much faster a system runs when more processors or resources are added.
Ideally, doubling the number of processors should double the speed.
Formula
Speedup = Execution time on one processor / Execution time on N processors
Example
A query takes 10 seconds on 1 processor and 2.5 seconds on 4 processors.
Speedup = 10 / 2.5 = 4 (Perfect speedup).
Types of Speedup
i) Linear Speedup
If adding more processors reduces execution time proportionally, the system has ideal speedup.
Example:
1 CPU → 10 sec
2 CPUs → 5 sec
4 CPUs → 2.5 sec
ii) Sublinear Speedup
If speedup is less than the number of processors, then overhead (communication, data transfer, or contention) reduces efficiency.
Example:
1 CPU → 10 sec
2 CPUs → 6 sec
4 CPUs → 4 sec
iii) Superlinear Speedup
In rare cases, speedup is greater than the number of processors, often due to better memory caching.
Example
1 CPU → 10 sec
2 CPUs → 4 sec
4 CPUs → 1 sec
Importance
Helps evaluate hardware upgrades in a database system.
Ensures efficient resource utilization in parallel processing.
Factors Affecting Speedup
Parallelization overhead: Communication and synchronization costs can limit speedup.
I/O bottlenecks: If data is stored on a slow disk, additional CPUs won’t help.
2.4.5 Comparison of Performance Metrics
Metric        | Definition                         | Ideal Performance      | Affected by
Throughput    | Number of transactions per second  | Higher is better       | CPU, disk speed, concurrency
Response Time | Time taken to process a query      | Lower is better        | Query complexity, system load
Scaleup       | Ability to handle more users/data  | Should remain constant | Parallelism efficiency
Speedup       | Improvement with more processors   | Should be linear       | Communication overhead
2.6 INTRODUCTION TO DISTRIBUTED DATABASE
2.6.1 What is a Distributed Database?
A Distributed Database (DDB) is a collection of interconnected databases spread across multiple locations, but they appear as a single database to
users. These databases are stored on different servers, often in different geographic locations, and communicate through a network.
Example : A banking system where customer accounts are stored in different cities, but users can access their accounts from any branch.
Fig. 2.6.1 : Distributed Database System
2.6.2 Types of Distributed Database Systems
A Distributed Database System (DDBS) consists of databases stored across multiple physical locations, but they appear as a single database to users. There are different types of distributed databases based on how data is stored, managed, and accessed.
2.6.2(I) Classification Based on System Architecture
1) Homogeneous Distributed Database System
All databases use the same DBMS software and data structure.
The system operates as if it were a single database with seamless data integration.
Easier to manage because all sites follow the same rules.
Fig. 2.6.2 : Homogeneous Distributed Database System
Example
A retail chain where all store databases run MySQL and have the same schema.
Advantages
1. Simple data management since all databases follow the same schema.
2. Easier query execution since all sites use the same query language.
3. Easier data replication and consistency maintenance.
Disadvantages
1. Limited flexibility; difficult to integrate different databases.
2. A failure in one site may affect other sites due to dependency.
2) Heterogeneous Distributed Database System
Different databases use different DBMS software or data structures.
Requires data translation and integration for seamless access.
A special component called middleware, or a global schema, manages communication between the different systems.
Fig. 2.6.3 : Heterogeneous Distributed Database System
Example
A global e-commerce platform where orders are stored in Oracle, customer data in MySQL, and product details in MongoDB.
Advantages
1. Can integrate legacy systems and modern databases.
2. Greater flexibility since different systems can coexist.
3. Can optimize different databases for specific tasks (e.g., NoSQL for unstructured data).
Disadvantages
1. Requires complex data transformation and synchronization.
2. Query execution is slower due to differences in DBMS architectures.
2.6.2(II) Classification Based on Data Distribution
3. Distributed Database with Data Replication
In this model, multiple copies of the same data are stored at different locations. Any update to one copy must be synchronized with other copies to maintain data consistency.
Example
A social media platform where user profile data is replicated across multiple servers worldwide to ensure faster access and availability.
Types of Replication:
1. Full Replication : The entire database is copied at each site.
2. Partial Replication : Only a subset of data is replicated at different locations.
Advantages
1. Improved availability : Data remains accessible even if some servers fail.
2. Faster query response : Users access data from the nearest replica.
3. Fault tolerance : If one server crashes, others still provide access.
Disadvantages
1. High storage cost : Requires additional space for multiple copies.
2. Synchronization overhead : Keeping all copies up to date consumes network bandwidth.
4. Distributed Database with Data Fragmentation
Instead of storing full copies, the database is split into smaller logical parts (fragments) that are distributed across different locations. Each fragment contains only a subset of the entire database.
Types of Fragmentation
1. Horizontal Fragmentation : Different rows (tuples) of a table are stored at different locations.
2. Vertical Fragmentation : Different columns (attributes) of a table are stored at different locations.
3. Hybrid Fragmentation : A combination of horizontal and vertical fragmentation.
Example
A banking system where:
Customer data for Branch A is stored in City A.
Customer data for Branch B is stored in City B.
Common customer details (e.g., name, account number) are stored centrally.
Advantages:
1. Efficient query execution : Only relevant data is retrieved from nearby locations.
2. Reduced network traffic : Only necessary fragments are transmitted.
3. Better security : Sensitive data can be stored in secure locations.
Disadvantages
1. Complex query processing : Requires data reconstruction from multiple fragments.
2. Data inconsistency risks if synchronization is not properly managed.
2.6.2 (III) Classification Based on Communication Model
5. Client-Server Distributed Database
A central database server manages all data and processes requests from multiple clients.
Clients do not store data but send queries to the server for execution.
Fig. 2.6.4 : Client-Server Distributed Database System
Example
A university student portal where students access course materials stored in a central database server.
Advantages
1. Easier management : All data is stored centrally.
2. Better security : Access control is enforced by the server.
Disadvantages
1. Single point of failure : If the server crashes, all clients lose access.
2. Performance bottleneck : Too many client requests can slow down the system.
6. Peer-to-Peer (P2P) Distributed Database
No central authority : Each node acts as both a client and a server.
Data is shared among multiple nodes, which communicate directly.
Fig. 2.6.5 : Peer-to-Peer Distributed Database
Example
A blockchain-based system where each participant maintains a copy of the ledger, and transactions are validated by peers.
Advantages
1. No single point of failure – The system remains operational even if some nodes fail.
2. Highly scalable – More nodes can be added easily.
Disadvantages
1. Data consistency challenges due to decentralized control.
2. Synchronization complexity among nodes.
7. Federated (Multi-Database) System
Independent databases work together but maintain their autonomy.
A federated schema integrates different databases into a unified view.
Fig. 2.6.6 : Federated (Multi-Database) System
Example
A travel booking system where:
Airline reservations are managed by an airline database.
Hotel bookings are handled by a hotel database.
Car rentals are stored in a rental database.
Advantages
1. Allows integration of different databases without modifying them.
2. Flexibility : Each database operates independently while still being accessible.
Disadvantages
1. Complex query processing – Requires middleware to unify different databases.
2. Data security and access control challenges due to different policies across systems.
2.7 BENEFITS OF A DISTRIBUTED DATABASE SYSTEM (DDBS)
UQ. Write any two benefits of distributed database system.
(Q. 1(B), W-19, 2 Marks)
A Distributed Database System (DDBS) provides numerous advantages over traditional centralized databases. By distributing data across multiple locations, a DDBS improves performance, scalability, reliability, and availability while maintaining data consistency and security.
Below are the key benefits of using a Distributed Database System :
1. Improved Performance
In a distributed system, data is stored closer to the users who need it, which reduces network delays and improves response time.
How It Improves Performance
Local query processing : Queries are executed on local databases rather than relying on a central database, reducing processing time.
Reduced network congestion : Since data is processed locally, the amount of data transmitted over the network is minimized.
Parallel processing : Multiple sites can process queries simultaneously, leading to faster execution.
Example
A global e-commerce website stores customer data in regional databases, allowing faster product searches and checkout processes.
2. High Availability and Reliability
Distributed databases provide higher availability by ensuring that data is accessible even if some servers fail.
How It Ensures High Availability
1. Data replication : Multiple copies of the same data exist on different servers, ensuring access even if one server is down.
2. Fault tolerance : If one site fails, another site can handle requests without disrupting operations.
3. Load balancing : Workloads are distributed across multiple servers, preventing system overload.
Example
A banking system ensures that customer transactions remain accessible even if one data center experiences a power outage.
3. Scalability
A distributed database can easily scale as the number of users and data volume increases.
How It Supports Scalability
1. Horizontal scaling : New database nodes can be added without affecting existing operations.
2. Flexible expansion : Organizations can store growing data across multiple locations instead of overloading a central database.
3. Optimized resource usage : Different sites can handle different workloads efficiently.
Example : A social media platform can add new database servers in different regions as user numbers grow.
4. Reduced Network Load
A distributed database reduces network traffic by allowing local queries instead of sending all requests to a central server.
How It Reduces Network Load
Fragmentation – Data is stored in smaller parts at different locations, minimizing data transfer over the network.
Local query execution – Users can access and process data locally, reducing the need for global communication.
Efficient data replication – Only necessary updates are transmitted, rather than full database copies.
Example
A supply chain management system stores inventory details at regional warehouses, reducing the need for central data requests.
5. Data Integrity and Consistency
A well-designed distributed database ensures data consistency across all sites, even when updates occur at multiple locations.
How It Ensures Data Integrity
1. Concurrency control protocols – Prevents conflicts when multiple users update the same data simultaneously.
2. Transaction management – Ensures that distributed transactions are processed correctly.
3. Synchronized replication – Updates are propagated to all copies, ensuring uniform data.
Example
A hospital management system maintains accurate patient records across different hospitals without inconsistencies.
6. Enhanced Data Security and Privacy
A distributed database allows organizations to implement better security policies by controlling access at different locations.
How It Improves Security
Decentralized data storage – Sensitive information can be stored in specific locations for better control.
Access control mechanisms – Users can only access data relevant to their roles and location.
Data encryption – Protects data from unauthorized access during transmission.
Example
A government database stores citizen records with restricted access based on roles (e.g., tax data is accessible only by the tax department).
7. Cost-Effectiveness
Distributed databases reduce operational costs by using multiple smaller servers instead of one expensive central system.
How It Reduces Costs
Uses low-cost hardware – Instead of relying on a high-performance central server, multiple cost-effective servers can be used.
Efficient resource utilization – Different sites handle different workloads, preventing the need for expensive upgrades.
Reduces data transmission costs – By storing data locally, organizations save on network bandwidth costs.
Example
A multinational corporation uses distributed databases in each country, avoiding the need for a high-cost global data center.
8. Support for Distributed Applications
Modern applications require real-time data access from multiple locations. A distributed database enables seamless integration with such applications.
How It Supports Distributed Applications
Multi-location access : Users from different locations can access relevant data without delay.
Improved collaboration : Teams can work on shared data simultaneously without conflicts.
Remote work support : Employees can access company data securely from different offices.
Example : A cloud-based CRM system allows sales teams from different countries to access and update customer information in real time.
2.9 DISADVANTAGES OF DISTRIBUTED DATABASE SYSTEMS
1. Complexity in Database Design and Management
Designing a distributed database is more complex than a centralized one due to data fragmentation, replication, and synchronization.
Query processing is difficult because data is spread across multiple sites.
Example : A university student database must manage student records across multiple campuses while ensuring consistency.
2. Higher Costs for Setup and Maintenance
Initial setup costs are higher due to the need for multiple database servers and network infrastructure.
Ongoing maintenance requires specialized administrators, increasing operational costs.
Example : A telecom company must invest in multiple database servers and a high-speed network to maintain a distributed billing system.
3. Data Synchronization and Consistency Issues
Maintaining data consistency across multiple sites is challenging, especially with replication and concurrent updates.
Conflicts may arise when two or more users update the same data from different locations.
Example : A global airline reservation system must ensure that a booked seat is not assigned to multiple customers due to synchronization delays.
4. Increased Security Risks
Data transmission between sites increases vulnerability to cyber threats, such as hacking and data breaches.
More locations mean more entry points for security threats.
Example : A financial institution must secure transactions across multiple data centers to prevent unauthorized access.
5. Communication Overhead
Frequent communication between database sites can increase network latency and affect performance.
Distributed transactions require extra coordination, leading to higher processing time.
Example : An online multiplayer game requires real-time data synchronization across multiple servers, leading to potential delays.
6. Risk of Network Failures
If the network fails, distributed sites cannot communicate, leading to possible data unavailability.
Network partitioning can create issues where certain data is temporarily inaccessible.
Example : A cloud storage service may face temporary outages if a major network failure disrupts data synchronization between data centers.
7. Difficulty in Backup and Recovery
Backups are more complex since they must capture distributed data across multiple sites.
Data recovery is challenging, especially if different sites experience failures at different times.
Example : A hospital system may lose part of a patient’s medical history due to incomplete backups in different locations.
2.10 TRANSACTION PROCESSING IN PARALLEL AND DISTRIBUTED DATABASE SYSTEMS
Transaction processing plays a crucial role in Parallel and Distributed Database Systems to ensure data consistency, reliability, and efficiency across multiple nodes. Although both systems aim for improved performance, they handle transactions differently based on their architecture and objectives.
2.10.1 Transaction Processing in Parallel Database Systems
A Parallel Database System uses multiple processors and storage units to execute transactions simultaneously. The goal is to improve performance, speed, and efficiency by distributing the workload across multiple processing units.
Characteristics of Parallel Transaction Processing:
1. Parallel Execution : Transactions are divided into smaller sub-tasks executed simultaneously.
2. Data Partitioning : Large databases are split across multiple storage units to speed up transactions.
3. Concurrency Control : Ensures multiple transactions can execute in parallel without conflicts.
4. Load Balancing : Distributes transactions across processors to prevent bottlenecks.
2.10.2 Types of Parallel Transaction Processing
1. Inter-Query Parallelism : Different queries execute simultaneously on different processors.
Example: Multiple users running different search queries on an online shopping platform.
2. Intra-Query Parallelism : A single query is divided into smaller tasks that execute in parallel.
Example : A complex analytics query runs on multiple nodes to speed up processing.
3. Intra-Transaction Parallelism : A single transaction is split into multiple operations that execute in parallel.
Example : A bank transfer updates multiple accounts across different branches simultaneously.
2.10.3 Concurrency Control in Parallel Databases
Since multiple transactions execute at the same time, proper mechanisms are required to maintain data integrity.
Concurrency Control Methods in Parallel Systems:
Lock-Based Protocols : Multiple transactions acquire locks on data to prevent conflicts.
Timestamp-Based Protocols : Transactions execute based on assigned timestamps to avoid conflicts.
Optimistic Concurrency Control (OCC) : Transactions execute without locks but are validated before committing.
Advantages of Parallel Transaction Processing
1. High Performance & Speed : Transactions complete faster due to parallel execution.
2. Efficient Resource Utilization : Distributes workload across multiple processors.
3. Supports Large Databases : Handles high-volume transactions efficiently.
Challenges
1. Data Synchronization Issues : Ensuring all parallel processes work with consistent data.
2. Concurrency Conflicts : Handling simultaneous updates to the same data.
3. High Hardware Cost : Requires specialized parallel hardware.
2.10.4 Transaction Processing in Distributed Database Systems
A Distributed Database System (DDBS) processes transactions across multiple geographically distributed
databases. The main challenge is ensuring atomicity, consistency, isolation, and durability (ACID) while handling transactions across different sites.
Characteristics of Distributed Transaction Processing:
1. Global Transaction Management : Transactions span multiple sites and require coordination.
2. Commit Protocols : Ensure that distributed transactions either commit or abort across all sites.
3. Data Replication : Copies of the same data exist across multiple sites for reliability.
4. Failure Recovery Mechanisms : Handle partial failures in a distributed environment.
Types of Transactions in Distributed Databases:
1. Local Transactions : Execute at a single site without needing coordination.
Example : A university database updates a student’s records at one campus.
2. Global Transactions : Involve multiple sites and require coordination.
Example : A bank transaction transferring money between accounts in different countries.
2.10.5 Concurrency Control in Distributed Systems
Concurrency control ensures that transactions execute correctly and consistently across different sites.
Methods Used in Distributed Transactions
Distributed Two-Phase Locking (2PL) : Uses locks across multiple sites to prevent conflicts.
Distributed Timestamp Ordering : Ensures transactions execute in the correct order based on timestamps.
Optimistic Concurrency Control (OCC) : Transactions execute without restrictions and validate before committing.
2.10.6 Commit Protocols in Distributed Transactions
A distributed transaction must either commit across all sites or abort completely to ensure consistency.
1. Two-Phase Commit (2PC) Protocol
Phase 1 (Prepare Phase): Each site prepares the transaction and votes on whether it can commit.
Phase 2 (Commit Phase): If all sites agree, the transaction commits; otherwise, it aborts.
Example: An e-commerce system processes an order by checking inventory (Site A), payments (Site B), and shipping (Site C) before committing.
2. Three-Phase Commit (3PC) Protocol
An improved version of 2PC that prevents indefinite blocking by adding a pre-commit phase before finalizing.
Example: A flight booking system ensures seat reservation, payment, and ticket generation occur consistently across multiple servers.
Advantages of Distributed Transaction Processing
1. Higher Availability : Transactions continue even if some sites fail.
2. Load Distribution : Transactions are distributed across multiple databases, reducing overload.
3. Fault Tolerance : Ensures reliability through replication and failure recovery.
Challenges of Distributed Transaction Processing:
1. Network Latency – Increased communication time between sites can slow down transactions.
2. Deadlocks – Multiple transactions competing for the same data may lead to deadlocks.
3. Security Risks – Data transmitted across different sites is vulnerable to cyber threats.
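The two-phase commit protocol described in this section can be sketched in a few lines. This is a minimal illustration, not a production implementation: the Site class, its can_commit flag, and the site names are hypothetical, and the sketch leaves out logging, timeouts, and coordinator failure, which a real system must handle.

```python
class Site:
    """A hypothetical participant in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(sites):
    """Coordinator logic: commit everywhere or abort everywhere."""
    # Phase 1 (prepare): collect a vote from every site.
    votes = [site.prepare() for site in sites]
    # Phase 2 (commit): commit only if every vote was yes.
    if all(votes):
        for site in sites:
            site.commit()
        return "committed"
    for site in sites:
        site.abort()
    return "aborted"


# The e-commerce order from the example: inventory, payments, shipping.
order_sites = [Site("inventory"), Site("payments"), Site("shipping")]
result_ok = two_phase_commit(order_sites)

# Same order, but the payments site cannot commit.
failing_sites = [Site("inventory"), Site("payments", can_commit=False), Site("shipping")]
result_fail = two_phase_commit(failing_sites)
```

In the second run a single "no" vote from the payments site is enough to abort the transaction at every site, which is the all-or-nothing behaviour the protocol exists to guarantee.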
2.10.7 Comparison of Parallel vs. Distributed Transaction Processing
UQ. Differentiate between parallel & distributed databases. (Q-3,D , W-19, 4 Marks)
Feature         | Parallel Database Transactions           | Distributed Database Transactions
Execution Scope | Single system with multiple processors   | Multiple geographically distributed databases
Concurrency     | High due to shared memory                | Moderate due to network delays
Commit Protocol | Local commit only                        | Uses Two-Phase or Three-Phase Commit
Fault Tolerance | Limited (depends on hardware redundancy) | High (data replication across sites)
Performance     | Faster execution due to parallelism      | Slower due to network communication
Complexity      | Moderate                                 | High (requires coordination across multiple sites)
Chapter Ends…