Advanced Database Chapter 6 and 7

Distributed Database Systems (DDBMS) integrate data across multiple sites, allowing for decentralized storage while appearing centralized to users. They utilize various data allocation strategies, such as centralized, partitioned, and replicated, and involve components like local and distributed DBMS, global system catalog, and data communication. DDBMS face challenges in query processing, transaction management, and security, but offer advantages like data sharing, reliability, and scalability.

Uploaded by

alemunuruhak9

Distributed Database Systems

Advanced Database Systems

Distributed Database Systems
 Database development facilitates the integration of data available in an
organization from a number of applications and enforces security on
data access at a single local site.
 However, organizational data does not always reside at one
central site.
 This demands that databases at different sites be integrated and
synchronized with all the facilities of the database approach.
 This is made possible by computer networks and data
communication, supported by the internet, mobile and wireless computing,
and intelligent devices.
 This leads to Distributed Database Systems.
Distributed Database Systems
 Distributed Database is not a centralized database.
Distributed Database Systems
 A Distributed DB stores logically related shared data and metadata at
several physically independent sites connected via a network.
 A Distributed DBMS is the software system that permits the
management of a Distributed DB.
 Data allocation is the process of deciding where to store
a particular data item.
Data Allocation
There are 3 data allocation strategies:
1. Centralized: the entire DB is located at a single site and other computers access it through the network.
This is known as distributed processing.
2. Partitioned: the DB is split into several disjoint parts (called partitions, segments or fragments)
that are stored at several sites.
3. Replicated: copies of one or more partitions are stored at several sites.
 In a distributed database system, the database is logically a single database
but is physically fragmented across several computers.
 The computers in a distributed system communicate with each other through
various communication media, such as high-speed buses or telephone lines.
Distributed Database Systems
 A distributed database system has the following components.
1. Local DBMS
2. Distributed DBMS
3. Global System Catalog (GSC)
4. Data communication (DC)
 A distributed database system consists of a collection of sites, each of
which maintains a local database system (Local DBMS). Each local
DBMS also participates in at least one global transaction, in which
different databases are integrated.
 Local Transaction: a transaction that accesses data only at the site where it was initiated.
 Global Transaction: a transaction that accesses data at several sites.
Distributed Database Systems
 Three architectures for parallel DDBMS:
 Shared memory: fast data access for a limited number of processors.
 Shared disk: for applications that are inherently centralized.
 Shared nothing: for massively parallel systems.
 What makes DDBMS different is that
 The various sites are aware of each other
 Each site provides a facility for executing both local and global transactions.
 The different sites can be connected physically in different topologies.
 Tree Network,
 Star Network and
 Ring Network
Distributed Database Systems
 The distribution of the database sites could be:
 Large Geographical Area: Long-Haul Network
 relatively slow
 less reliable
 uses telephone line, satellite
 Small Geographical Area: Local Area Network
 higher speed
 lower rate of error
 coaxial, fiber optics
Distributed Database Systems
 Even though integration of data implies centralized storage and
control, in distributed database systems the intention is different.
 Data is stored in different database systems in a decentralized manner,
but computer networks allow these systems to act as if they were
centralized.
 A distributed database system consists of loosely coupled sites that
share no physical components; the database systems that run on each
site are independent of each other.
 Systems that share physical components are known as Parallel DBMS.
 Transactions may access data at one or more sites.
Functions of DDBMS
 A DDBMS provides the following functionality:
 Extended communication services: provide access to remote sites.
 Distributed query processing: optimization of queries that access remote data.
 Extended security: access control over distributed data.
 Extended concurrency control: maintains consistency of replicated data.
 Extended recovery services: handle failures of individual sites and of the
communication lines.
Issues in DDBMS
 How is data stored in a DDBMS?
There are several ways of storing a single relation in distributed database
systems.
Replication
 The system maintains multiple identical copies of the data.
 The copies are stored at different sites for faster retrieval and fault tolerance.
 Duplicate copies of the tables can be kept on each system (replicated). With this option, updates to
the tables can become involved, because every copy must be changed (although the copies can be made read-only).
 Advantages: availability and increased parallelism (for reads).
 Disadvantage: increased overhead on updates.
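The trade-off between cheap reads and costly updates can be sketched as follows. This is a minimal illustration, assuming a simple "read one copy, write all copies" policy; the class name `ReplicatedItem` and the site "Adama" are hypothetical and not part of the original notes.

```python
class ReplicatedItem:
    """One logical data item with an identical copy at every listed site."""

    def __init__(self, sites, value):
        # Every site starts with the same copy of the value.
        self.copies = {site: value for site in sites}

    def read(self, site):
        # A read is served by a single copy, so it stays cheap.
        return self.copies[site]

    def write(self, value):
        # A write must reach every copy; return how many copies were touched.
        for site in self.copies:
            self.copies[site] = value
        return len(self.copies)


item = ReplicatedItem(["Addis", "Awassa", "Adama"], value=100)
touched = item.write(250)
print(touched, item.read("Awassa"))  # prints: 3 250
```

The update cost grows with the number of copies, while any single copy can answer a read.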
Fragmentation
 Relation is partitioned into several fragments stored in distinct sites
 The partitioning could be vertical, horizontal or both.
Horizontal Fragmentation
 Systems can share the responsibility of storing information from a
single table with individual systems storing groups of rows.
 Performed by the Selection Operation
 The whole content of the relation is reconstructed using the UNION
operation
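As a rough sketch of the idea above, the fragments can be produced with a selection and put back together with a union. The `emp` rows and the `deptno` predicate are made-up illustration data, not taken from the notes.

```python
# Hypothetical EMP relation: (empno, ename, deptno)
emp = [
    (1, "Abebe", 10),
    (2, "Sara", 20),
    (3, "Kebede", 10),
    (4, "Hana", 30),
]


def select(relation, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Each fragment is defined by a selection predicate on deptno.
frag_site1 = select(emp, lambda r: r[2] == 10)   # rows stored at site 1
frag_site2 = select(emp, lambda r: r[2] != 10)   # rows stored at site 2

# The UNION of the fragments reconstructs the original relation.
reconstructed = set(frag_site1) | set(frag_site2)
print(reconstructed == set(emp))  # prints: True
```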
Vertical Fragmentation
 Each fragment stores a subset of the attributes; the tuple identifier (the primary key value) must be repeated in every fragment.
 Performed by the Projection operation.
 The whole content of the relation is reconstructed using the Natural JOIN operation
on the repeated attribute (the primary key values).
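A small sketch of this, assuming a hypothetical `emp` relation whose primary key is `empno`: projection builds the fragments and a join on the repeated key rebuilds the relation. The helper names `project` and `natural_join` are illustrative only.

```python
# Hypothetical EMP relation; empno is the primary key.
emp = [
    {"empno": 1, "ename": "Abebe", "salary": 5000},
    {"empno": 2, "ename": "Sara", "salary": 7000},
]


def project(relation, attrs):
    """Relational projection: keep only the named attributes of each row."""
    return [{a: row[a] for a in attrs} for row in relation]


def natural_join(left, right, key):
    """Join two fragments on a shared key that is unique within each fragment."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# The primary key is repeated in both fragments.
frag_names = project(emp, ["empno", "ename"])
frag_pay = project(emp, ["empno", "salary"])

rebuilt = natural_join(frag_names, frag_pay, "empno")
print(rebuilt == emp)  # prints: True
```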
Both (hybrid fragmentation)
 A system can share the responsibility of storing particular attributes of a
subset of records in a given relation.
 Performed by projection then selection or selection then projection
relational algebra operators.
 Reconstruction is made by combined effect of Union and natural join
operators.
Issues in DDBMS
Fragmentation is correct if it fulfils the following:
 Completeness: a data item must appear in at least one of the fragments
R1, R2, …, Rn of a given relation R.
 Reconstruction: it must be possible to reconstruct the relation from its
fragments.
 Disjointness: a data item should be found in only a single fragment, except
in vertical fragmentation (where the primary key is repeated for reconstruction).
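For horizontal fragments, the three rules can be checked mechanically. The function below is a sketch under the assumption that rows are hashable tuples and that reconstruction is done by union; its name and the sample rows are hypothetical.

```python
def check_horizontal_fragmentation(relation, fragments):
    """Return (complete, reconstructible, disjoint) for horizontal fragments."""
    original = set(relation)
    union = set().union(*fragments)
    complete = original <= union           # every row appears in some fragment
    reconstructible = union == original    # UNION gives back exactly R
    total_rows = sum(len(set(f)) for f in fragments)
    disjoint = total_rows == len(union)    # no row is stored in two fragments
    return complete, reconstructible, disjoint


rows = [(1, 10), (2, 20), (3, 10)]
good = [[(1, 10), (3, 10)], [(2, 20)]]
overlapping = [[(1, 10), (3, 10)], [(2, 20), (3, 10)]]

print(check_horizontal_fragmentation(rows, good))         # (True, True, True)
print(check_horizontal_fragmentation(rows, overlapping))  # (True, True, False)
```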
Data Transparency
The degree to which a system user may remain unaware of the details of how
and where the data items are stored in a distributed system.
 Distribution transparency: even though there are many systems, they appear as one,
seen as a single logical entity.
 Replication transparency: the many copies of the data appear as just
one copy to developers and users.
 Fragmentation transparency: a table that is actually stored in parts across
sites appears as a single table in a single
 Location transparency: the user doesn't need to know where a data item is physically
located.
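One way to picture fragmentation and location transparency is a catalog lookup that hides the sites from the caller. This is only a sketch: the `catalog` dictionary stands in for a global system catalog, and its contents and the function `query_table` are assumptions made for illustration.

```python
# Hypothetical global system catalog: logical table -> list of (site, rows).
catalog = {
    "dept": [
        ("Addis", [(10, "Sales")]),
        ("Awassa", [(20, "Research")]),
    ],
}


def query_table(name):
    """Return all rows of a logical table without exposing sites to the caller."""
    rows = []
    for _site, fragment in catalog[name]:
        rows.extend(fragment)   # gather every fragment, wherever it is stored
    return sorted(rows)


# The caller names only the logical table, never a site or a fragment.
print(query_table("dept"))  # prints: [(10, 'Sales'), (20, 'Research')]
```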
How does it work?
Distributed computing can be difficult to implement, particularly for
replicated data that can be updated from many systems.
In order to operate, a distributed database system has to take care of:
 Distributed Query Processing
 Distributed Transaction Management
 Replicated Data Management: if copies of data are kept on many
machines, how often is a copy updated when the data is changed in another system? Who is
in charge of propagating the update?
 Distributed Database Recovery: if one machine goes down, how does that affect the
others?
 Security: just like any computer network, a distributed system needs a common
way to validate users entering from any computer in the network of servers.
Homogeneous and Heterogeneous DDBMS
In a homogeneous distributed database
 All sites have identical software (DBMS).
 Sites are aware of each other and agree to cooperate in processing user requests.
 Each site surrenders part of its autonomy in terms of the right to change schemas or software.
 The system appears to the user as a single system.
In a heterogeneous distributed database
 Different sites may use different schemas and software (DBMS).
 Differences in schema are a major problem for query processing.
 Differences in software are a major problem for transaction processing.
 Sites may not be aware of each other and may provide only limited facilities for
cooperation in transaction processing.
 Sites may need gateways to interface with one another.
Why DDBMS? / Advantages of DDBMS
Many existing systems:
 An organization may already have many existing systems, possibly of different kinds
(Oracle, Informix, …), that need to be used together.
Data sharing and distributed control:
 A user at one site may be able to access data that is available at another site.
 There are local as well as global database administrators.
Reliability and availability of data: if one site fails, the rest can continue operating.
Speedup of query processing: a query can be sent to the least heavily loaded sites.
Expansion (Scalability)
Disadvantages of DDBMS
 Software Development Cost: a DDBMS is difficult to install and is therefore costly.
 Greater Potential for Bugs: parallel processing may endanger the correctness
of algorithms.
 Increased Processing Overhead: messages must be exchanged between sites, with
high communication latency.
 Increased Complexity and Data Inconsistency Problems: clients
can read and modify closely related data stored in different database
instances concurrently.
 Security Problems: network security and the security of replicated data.
Query Processing in DDBMS
We have to consider the following in distributed query processing:
 The cost of data transmission over the network.
 The gain from parallel processing of a single query.
 With replicated data allocation, parallel processing can
increase read performance, but updates have a great impact since every site containing the
data item must be updated.
 With fragmentation, updates work more like in a centralized database, but
reconstructing the whole relation requires accessing data from all sites containing
part of the relation.
 There are different ways of executing a query.
 One can then select the strategy that reduces the data transfer cost for the specific
query.
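Choosing a strategy by data transfer cost can be worked through with a small calculation. All sizes below are invented for illustration: EMP is assumed to be stored at site 1, DEPT at site 2, and the join result is assumed to be needed at site 3.

```python
def transfer_cost(rows, row_bytes):
    """Bytes sent over the network when shipping a relation of this size."""
    return rows * row_bytes


# Hypothetical sizes (not from the notes).
emp_rows, emp_row_bytes = 10_000, 100
dept_rows, dept_row_bytes = 100, 35
result_rows, result_row_bytes = 10_000, 40

strategies = {
    "ship both relations to site 3":
        transfer_cost(emp_rows, emp_row_bytes)
        + transfer_cost(dept_rows, dept_row_bytes),
    "ship EMP to site 2, join there, ship result to site 3":
        transfer_cost(emp_rows, emp_row_bytes)
        + transfer_cost(result_rows, result_row_bytes),
    "ship DEPT to site 1, join there, ship result to site 3":
        transfer_cost(dept_rows, dept_row_bytes)
        + transfer_cost(result_rows, result_row_bytes),
}

# Pick the strategy with the smallest number of bytes transferred.
best = min(strategies, key=strategies.get)
print(best, strategies[best])
```

With these numbers, moving the small DEPT relation to the site holding EMP transfers 403,500 bytes, against 1,003,500 and 1,400,000 bytes for the other two strategies.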
Transaction Management in DDBMS
 A Distributed Transaction is a transaction that includes one or more statements that,
individually or as a group, update data on two or more distinct nodes of a distributed
database.
 There are two types of transaction in a DDBMS that access data at other sites:
 Remote Transaction: contains only statements that access a single remote node. Likewise, a remote query is a
query that selects information from one or more remote tables, all of which reside at the same remote node or site.
 For example, the following query accesses data from the dept table in the Addis schema of the remote
sales database:
SELECT * FROM Addis.dept@sales.midroc.telecom.et;
 Distributed Transaction: contains statements that access more than one node.
 For example, the following query accesses data from the local database as well as the remote sales database:
SELECT ename, dname FROM Awassa.emp AW, Addis.dept@sales.midroc.telecom.et AD WHERE AW.deptno = AD.deptno;

 If all statements of a transaction reference only a single remote node, the transaction is
remote, not distributed.
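The distinction can be expressed as a small classification rule over the set of nodes a transaction touches. This is a sketch only; the function name and the use of plain site names as node identifiers are assumptions.

```python
def classify_transaction(local_site, sites_accessed):
    """Classify a transaction by the set of nodes its statements access."""
    nodes = set(sites_accessed)
    if nodes <= {local_site}:
        return "local"          # touches only the site where it started
    if len(nodes) == 1:
        return "remote"         # every statement goes to one remote node
    return "distributed"        # statements access more than one node


print(classify_transaction("Awassa", ["Awassa"]))           # local
print(classify_transaction("Awassa", ["Addis", "Addis"]))   # remote
print(classify_transaction("Awassa", ["Awassa", "Addis"]))  # distributed
```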
Database Security and Authorization
 Privacy – Ethical and legal rights that individuals have with regard to control over the
dissemination and use of their personal information.
 Database security – Protection of information contained in the database against
unauthorized access, modification or destruction.
 Database integrity – Mechanism that is applied to ensure that the data in the database is
correct and consistent.
 A good database security management system has the following characteristics:
 Privacy signifies that an unauthorized user cannot disclose data.
 Integrity ensures that an unauthorized user cannot modify data.
 Availability ensures that data is made available to authorized users unfailingly.
 Copyright ensures the native rights of individuals as creators of information.
 Validity ensures that activities are accountable by law.
Database Security and Authorization
 Database Security - the mechanisms that protect the database against intentional or
accidental threats. Database security encompasses hardware, software, people and data.
 Database security and integrity are about protecting the database from becoming inconsistent
and from being disrupted. Such threats can be called database misuse.
 Database misuse can be intentional or accidental; accidental misuse is easier to
cope with than intentional misuse.
 Accidental inconsistency could occur due to:
 System crash during transaction processing
 Anomalies due to concurrent access
 Anomalies due to redundancy
 Logical errors
 Intentional misuse could be:
 Unauthorized reading of data
 Unauthorized modification of data or
 Unauthorized destruction of data
Levels of Security Measures
Security measures can be implemented at several levels and for different components of the
system. These levels are:
 Physical Level: concerned with securing the site containing the computer system. The
site or sites containing the computer systems must be physically secured against armed or
surreptitious entry by intruders.
 Human Level: concerned with authorizing database users to access content at
different levels and with different privileges.
 Operating System Level: concerned with the weaknesses and strengths of the operating system's
security for data files.
 Database System Level: concerned with the data access limits enforced by the database system.
 Network Level: software-level security within the network software is as important as physical security,
both on the Internet and in networks private to an enterprise.
Authentication
 All users of the database have different access levels and permissions for different
data objects. Authentication is the process of checking whether the user is the one
who holds the privileges for that access level.
 It is the process of checking that users are who they say they are.
 Each user is given a unique identifier, which is used by the operating system to determine
who they are.
 Thus the system checks whether the user with a specific username and password is
the one trying to use the resource.
 Associated with each identifier is a password, chosen by the user and known to the
operating system, which must be supplied to enable the operating system to authenticate
who the user claims to be.
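The identifier-plus-password check can be sketched as below. Storing a salted hash instead of the password itself is an assumption added here (the notes only say the password is known to the system); the function names and the in-memory `users` table are hypothetical.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a salted hash so the password itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(users, username, password):
    """Record a unique identifier together with its salted password hash."""
    salt = os.urandom(16)
    users[username] = (salt, hash_password(password, salt))


def authenticate(users, username, password):
    """Check that the supplied password matches the one stored for the identifier."""
    if username not in users:
        return False
    salt, stored = users[username]
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(stored, hash_password(password, salt))


users = {}
register(users, "abebe", "s3cret")
print(authenticate(users, "abebe", "s3cret"))  # prints: True
print(authenticate(users, "abebe", "wrong"))   # prints: False
```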
Authorization/Privilege
 Authorization refers to the process that determines the mode in which a particular
(previously authenticated) client is allowed to access a specific resource controlled by a
server.
Forms of user authorization on the data
 Read Authorization: the user with this privilege is allowed only to read the content of
the data object.
 Insert Authorization: the user with this privilege is allowed only to insert new records
or items to the data object.
 Update Authorization: users with this privilege are allowed to modify content of
attributes but are not authorized to delete the records.
 Delete Authorization: users with this privilege are only allowed to delete a record and
not anything else.
Forms of user authorization on the database schema
 Index Authorization: deals with permission to create as well as delete an index for a
relation.
 Resource Authorization: deals with permission to create a new relation in the
database.
 Alteration Authorization: deals with permission to add as well as delete attributes.
 Drop Authorization: deals with permission to delete an existing relation.
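The data-level and schema-level forms of authorization can be modelled as named privileges granted per user and object. This is a minimal sketch, not how any particular DBMS stores grants; the class `AccessControl`, the user "sara" and the privilege spellings are assumptions.

```python
DATA_PRIVILEGES = {"read", "insert", "update", "delete"}
SCHEMA_PRIVILEGES = {"index", "resource", "alteration", "drop"}


class AccessControl:
    """Tracks which privileges each user holds on each object."""

    def __init__(self):
        self.grants = {}   # (user, object) -> set of privilege names

    def grant(self, user, obj, privilege):
        if privilege not in DATA_PRIVILEGES | SCHEMA_PRIVILEGES:
            raise ValueError(f"unknown privilege: {privilege}")
        self.grants.setdefault((user, obj), set()).add(privilege)

    def revoke(self, user, obj, privilege):
        self.grants.get((user, obj), set()).discard(privilege)

    def is_allowed(self, user, obj, privilege):
        # Anything that was not explicitly granted is denied.
        return privilege in self.grants.get((user, obj), set())


acl = AccessControl()
acl.grant("sara", "emp", "read")
acl.grant("sara", "emp", "update")
print(acl.is_allowed("sara", "emp", "update"))  # prints: True
print(acl.is_allowed("sara", "emp", "delete"))  # prints: False
```

An update privilege does not imply a delete privilege, which matches the separation of forms listed above.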
Reading Assignments
 Discretionary Access Control Based on Granting/Revoking of Privileges
 Mandatory Access Control for Multilevel Security
 Statistical DB Security
