Parallel Database Systems and Their Architecture

Uploaded by

vinashreemeshram


• A parallel database system is designed to
distribute data and workload across multiple
processors or nodes in order to improve
performance, scalability, and fault tolerance.
The goal is to handle large datasets and
complex queries efficiently by executing
operations in parallel across multiple
computing resources.
Types of Parallelism in Database Systems

• Parallelism in database systems can be
categorized into two main types:
• Data Parallelism: Data is split across multiple
processors, with each processor handling a
portion of the data.
• Task Parallelism: Different tasks (queries or
operations) are executed in parallel, allowing
multiple queries to run simultaneously.
Key Concepts
• Shared-Nothing Architecture: Each node in the system has its
own CPU, memory, and disk storage. No resources are shared,
and nodes communicate only by passing messages over a
network. Because nodes do not contend for shared hardware,
this is generally the most scalable of the three approaches.
• Shared-Disk Architecture: All nodes have access to a common
disk storage, but each has its own processor and memory.
Data does not have to be partitioned or replicated across
nodes, since every node can read any block. Scalability is
limited by contention on the shared storage interconnect and
by the coordination needed to keep the nodes' caches
consistent.
• Shared-Memory Architecture: All processors share a single
physical memory (and usually the disks as well). Communication
is fast because processors read and write the same memory,
but the memory bus or interconnect becomes a bottleneck as
processors are added. This architecture is less scalable than
shared-nothing but can be simpler to implement.
Parallel Database Architectures
• Parallel databases can be implemented using various
architectures, depending on the level of parallelism and the
specific requirements of the application:
• 1. Parallel Database Architectures Based on Data Partitioning
• Horizontal Partitioning: The rows of a table are divided
among nodes. A common strategy is range partitioning, where
rows are assigned by ranges of a key value; hash and
round-robin partitioning are alternatives. For example, a table
containing customer data can be partitioned by customer ID,
with each range handled by a different node.
• Vertical Partitioning: The columns of the database are
distributed across different nodes. This is useful when only a
subset of columns is frequently accessed for specific queries.
• Hybrid Partitioning: A combination of horizontal and vertical
partitioning, allowing for more flexible and optimized data
distribution.
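The partitioning schemes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the range boundaries, the node count, the column groups, and the function names are all hypothetical, and hash partitioning is included as a common alternative to ranges rather than something the slides prescribe.

```python
from bisect import bisect_right
from zlib import crc32

# Hypothetical range boundaries on customer_id: node 0 owns IDs below 1000,
# node 1 owns 1000..1999, node 2 owns everything from 2000 upward.
RANGE_BOUNDS = [1000, 2000]
NUM_NODES = 3


def range_partition(customer_id: int) -> int:
    """Return the node that owns this customer_id under range partitioning."""
    return bisect_right(RANGE_BOUNDS, customer_id)


def hash_partition(customer_id: int, num_nodes: int = NUM_NODES) -> int:
    """Return the node for this customer_id under hash partitioning."""
    # crc32 is used only because it gives the same value on every run.
    return crc32(str(customer_id).encode()) % num_nodes


def vertical_partition(row: dict, column_groups: dict) -> dict:
    """Split one row into column groups, repeating the key in each group."""
    return {
        name: {col: row[col] for col in ["customer_id", *cols]}
        for name, cols in column_groups.items()
    }


row = {"customer_id": 1500, "name": "Ada", "email": "ada@example.com", "balance": 10}
groups = {"contact": ["name", "email"], "billing": ["balance"]}
pieces = vertical_partition(row, groups)
```

Applying `range_partition` or `hash_partition` first and `vertical_partition` second gives a simple form of hybrid partitioning: the row is routed to a node by its key, and its columns are then split into groups.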
Distributed Database
• A distributed database is a collection of data
that is spread across multiple locations, which
could be on different computers, servers, or
geographical regions. Despite its distribution,
the system appears to users as a single unified
database.
Key Features of Distributed Databases:

• Data Distribution:
– Data is stored across multiple physical locations.
– The distribution can be homogeneous (same software and
schema) or heterogeneous (different software or schema).
• Transparency:
– Location Transparency: Users don’t need to know where
data is stored.
– Replication Transparency: Users are unaware of data
replication across nodes.
– Fragmentation Transparency: Users don’t need to worry
about how data is divided across sites.
• Scalability:
– Can handle growing data and user load by adding
more nodes to the system.
• Reliability and Availability:
– If one site fails, the system can continue operating
using other sites.
– Replication ensures data availability.
• Concurrency Control:
– Ensures consistency when multiple users access
the database simultaneously.
• Fault Tolerance:
– Uses mechanisms like replication and recovery
protocols to handle node or system failures.
Types of Distributed Databases:

• Homogeneous Distributed Databases:

– All sites use the same DBMS.
– Simplifies system design but offers less flexibility.
• Heterogeneous Distributed Databases:
– Sites may use different DBMS products and schemas.
– Requires middleware to reconcile the differences.
• Distributed Data Storage:
– Fragmentation: Dividing data into smaller pieces (fragments).
– Replication: Storing copies of data on multiple sites.
– Combination: Mixing both techniques.
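Fragmentation and replication can be combined as in the following minimal sketch. The site names, the round-robin fragmentation rule, and the "consecutive sites" replica placement are assumptions chosen for illustration; real systems use a variety of placement policies.

```python
from collections import defaultdict


def place_fragments(rows, num_fragments, sites, replicas):
    """Fragment rows round-robin, then store each fragment on `replicas` sites."""
    if replicas > len(sites):
        raise ValueError("cannot place more replicas than there are sites")

    # Fragmentation: row i goes to fragment i mod num_fragments.
    fragments = defaultdict(list)
    for i, row in enumerate(rows):
        fragments[i % num_fragments].append(row)

    # Replication: fragment f is copied to `replicas` consecutive sites.
    placement = defaultdict(dict)
    for frag_id, frag_rows in fragments.items():
        for k in range(replicas):
            site = sites[(frag_id + k) % len(sites)]
            placement[site][frag_id] = list(frag_rows)
    return dict(placement)


def read_fragment(placement, frag_id, failed_sites=()):
    """Read a fragment from any site that holds it and has not failed."""
    for site, stored in placement.items():
        if site not in failed_sites and frag_id in stored:
            return stored[frag_id]
    raise LookupError(f"no live replica of fragment {frag_id}")


placement = place_fragments(list(range(6)), 3, ["A", "B", "C"], replicas=2)
# Fragment 0 is stored on sites A and B, so it survives the loss of site A.
rows_after_failure = read_fragment(placement, 0, failed_sites={"A"})
```

With two replicas per fragment, any single site can fail and every fragment remains readable, which is the availability benefit replication is meant to provide.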
Advantages:

• Improved reliability and availability.

• Faster query responses due to localized data.
• Scalability for larger data volumes.
• Supports collaboration across geographically
dispersed teams.
Distributed Database Architecture
• Distributed database architecture refers to
how data is distributed and managed across
multiple sites or nodes. The architecture
determines how the distributed database
appears to users, how the data is stored, and
how operations such as queries and
transactions are handled across different
locations.
Parallel Query Processing

• Query execution in parallel databases is optimized using several
techniques:
• Data Parallelism: A large query is broken into smaller subqueries,
each executed on a different processor against its own partition of
the data. For instance, an aggregate such as SUM or COUNT can be
computed per partition in parallel, and the partial results are then
combined.
• Pipeline Parallelism: A query is broken into stages that run
concurrently on different processors, with each stage consuming the
output of the previous one as it is produced. For example, one
processor might perform a SELECT, while another performs a JOIN on
the selected rows, and another computes an aggregate over the join
output.
• Operator Parallelism: A single query operator such as JOIN, GROUP
BY, or SORT is itself executed in parallel across multiple processors
or nodes.
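The SUM/COUNT case can be sketched as follows. This is a minimal illustration, not how any particular engine works: the function names and sample partitions are hypothetical, and threads stand in for processors or nodes (in standard CPython, threads will not speed up CPU-bound work; a real system would use separate processes or machines).

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(partition):
    """Compute (sum, count) for one partition, independently of the others."""
    return sum(partition), len(partition)


def parallel_sum_count(partitions, max_workers=4):
    """Combine per-partition partial results into a global SUM and COUNT."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(partial_aggregate, partitions))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total, count


# Three hypothetical partitions of one column, as if stored on three nodes.
partitions = [[10, 20, 30], [5, 15], [100]]
total, count = parallel_sum_count(partitions)
# AVG is derived from the global sum and count, not by averaging per-partition averages.
average = total / count
```

SUM and COUNT parallelize cleanly because their partial results can simply be added. An average needs both partials carried through to the final step, since partitions may differ in size.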
Load Balancing and Query Scheduling

• Efficient load balancing is crucial for parallel databases
to ensure that all processors are working at optimal
capacity. The system may use:
• Workload Distribution: Queries are divided into smaller
tasks, and these tasks are distributed across available
processors in such a way that each processor performs
roughly the same amount of work.
• Dynamic Scheduling: The database management
system may dynamically adjust how tasks are assigned
to nodes based on the current workload or resource
availability.
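One simple way to approximate "roughly the same amount of work per processor" is a greedy heuristic: take tasks from most to least expensive and always give the next one to the processor with the smallest current load. The sketch below assumes task costs are known in advance, which a real scheduler can only estimate; the task names and costs are made up for illustration.

```python
import heapq


def assign_tasks(task_costs, num_processors):
    """Greedily give each task (largest first) to the least-loaded processor."""
    # Heap entries are (current_load, processor_id).
    heap = [(0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_processors)}

    by_cost = sorted(task_costs.items(), key=lambda kv: kv[1], reverse=True)
    for task, cost in by_cost:
        load, proc = heapq.heappop(heap)
        assignment[proc].append(task)
        heapq.heappush(heap, (load + cost, proc))

    loads = {proc: load for load, proc in heap}
    return assignment, loads


# Hypothetical cost estimates for the tasks of one query.
task_costs = {"scan_a": 8, "scan_b": 7, "join": 6, "sort": 5, "agg": 4}
assignment, loads = assign_tasks(task_costs, num_processors=2)
```

Dynamic scheduling follows the same idea but decides at run time: instead of using estimated costs up front, a task is handed to whichever node becomes idle or reports the lowest current load.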
Examples of Parallel Database Systems

• Google Bigtable: A distributed storage system designed to manage large
amounts of data across many machines. It allows parallel access to data by
partitioning each table into row ranges (tablets) that are served by
different machines.
• Amazon Redshift: A fully managed, petabyte-scale data warehouse
solution that uses columnar storage and parallel processing to enable fast
query performance on large datasets.
• Apache Hadoop and HBase: Hadoop's distributed file system (HDFS) and
HBase provide a distributed, parallel environment for processing large
datasets, with HBase being a NoSQL database, modeled on Bigtable, that is
designed for scalable, real-time data access.
• Teradata: Teradata uses a massively parallel processing (MPP) architecture
where each node has its own storage and computes queries in parallel
across many nodes, optimizing performance for large-scale data analytics.
Architecture Components of a Parallel Database System

• Processing Nodes:
– These are the CPU resources that execute the queries in
parallel. Depending on the architecture, each node may be
assigned a portion of data or a specific task.
• Storage System:
– The storage system can be implemented using shared-nothing
storage, shared-disk storage, or a distributed file system. In
shared-nothing architectures, each node typically has its
own local disk, while in shared-disk systems, the disk storage
is shared by all nodes.
• Interconnect Network:
– The network is crucial for communication between nodes. In
shared-nothing architectures, it carries intermediate results
and final query results between nodes.
• Query Processor:
– The query processor is responsible for parsing, optimizing, and
executing SQL queries. In parallel databases, it may also include a
parallel execution engine that breaks down queries into tasks that
can be distributed across the nodes.
• Transaction Manager:
– Manages transactions in a parallel database, ensuring that data
consistency and isolation properties are maintained even when data
is spread across multiple nodes.
• Scheduler:
– In parallel databases, the scheduler decides which tasks are
assigned to which processing nodes, and it manages the parallel
execution of queries and data processing.
Advantages of Parallel Database Systems
• Performance: Parallelism allows for the simultaneous
execution of queries and data processing, significantly
reducing query response times, especially for large
datasets.
• Scalability: Parallel systems can handle growing data
volumes by adding more processing nodes, which
increases both data storage and computational power.
• Fault Tolerance: By distributing and replicating data
across multiple nodes, the system can tolerate the failure
of individual nodes without losing data or interrupting
service.
• Improved Throughput: Multiple queries can be executed
concurrently, leading to higher system throughput,
especially in multi-user environments.
