
Kca045 DDBS

The document covers distributed database systems (DDBS), addressing topics such as architecture, design, query processing, transaction management and reliability. It also explores design strategies such as replication and fragmentation, as well as the technical problems of managing distributed data. Finally, it highlights the importance of security, resource management and the challenges of data integrity in a distributed environment.

Uploaded by

anshultyagi42261
SYLLABUS

Introduction: Distributed Data Processing, Distributed Database System, Promises of DDBSs, Problem areas.
Distributed DBMS Architecture: Architectural Models for Distributed DBMS, DDBMS Architecture.
Distributed Database Design: Alternative Design Strategies, Distribution Design issues, Fragmentation, Allocation.
Query Processing and Decomposition: Query processing objectives, characterization of query processors, layers of query processing, query decomposition, localization of distributed data.
Distributed Query Optimization: Query optimization, centralized query optimization, distributed query optimization algorithms.
Transaction Management: Definition, properties of transaction, types of transactions, distributed concurrency control: serializability, concurrency control mechanisms & algorithms, time-stamped & optimistic concurrency control algorithms, deadlock management.
Distributed DBMS Reliability: Reliability concepts and measures, fault tolerance in distributed systems, failures in Distributed DBMS, local & distributed reliability protocols, site failures and network partitioning.
Parallel Database Systems: Parallel database system architectures, parallel data placement, parallel query processing, load balancing, database clusters.
Distributed Object Database Management Systems: Fundamental object concepts and models, object distributed design, architectural issues, object management, distributed object storage, object query processing.
Object Oriented Data Model: Inheritance, object identity, persistent programming

INTRODUCTION

Features of the relational data model. (2022-23)
The Relational Model is a database model that represents data in the form of tables or relations. Each table consists of rows and columns, where each column represents an attribute of the entity, and each row represents a record.
Features of the Relational Model:
(1) Data is represented in rows and columns called relations.
(2) Data is stored in tables having relationships between them.
(3) Each column has a distinct name and represents an attribute.
(4) Data is normalized to eliminate redundancy.
(5) Data is accessed using SQL (Structured Query Language).
(6) Data is represented in a tabular form.
(7) The Relational Model is widely used in modern databases because of its simplicity, flexibility, and ease of use. It is also easy to maintain and modify, making it a popular choice for many applications.

What are the different strategies for designing a distributed database? (2022-23)
Write and explain design strategies.
The strategies can be broadly divided into replication and fragmentation.

Data Replication: Data replication is the process of storing separate copies of the database at two or more sites. It is a popular fault tolerance technique of distributed databases.
Advantages of Data Replication:
(1) Reliability: In case of failure of any site, the database system continues to work since a copy is available at another site(s).
(2) Reduction in Network Load: Since local copies of data are available, query processing can be done with reduced network usage, particularly during prime hours.
(3) Quicker Response: Availability of local copies of data ensures quick query processing and consequently quick response time.
(4) Simpler Transactions: Transactions require fewer joins of tables located at different sites and thus, they become simpler.
Disadvantages of Data Replication:
(1) Increased Storage Requirements: Maintaining multiple copies of data is associated with increased storage costs. The storage space required is in multiples of the storage required for a centralized system.
(2) Increased Cost and Complexity of Data Updating: Each time a data item is updated, the update needs to be reflected in all the copies of the data at the different sites. This requires complex synchronization techniques and protocols.
(3) Undesirable Application-Database Coupling: If complex update mechanisms are not used, removing data inconsistency requires complex co-ordination at application level. This results in undesirable application-database coupling.

Fragmentation: Fragmentation is the task of dividing a table into a set of smaller tables. The subsets of the table are called fragments.
Advantages of Fragmentation:
(1) Since data is stored close to the site of usage, efficiency of the database system is increased.
(2) Local query optimization techniques are sufficient for most queries since data is locally available.
(3) Since irrelevant data is not available at the sites, security and privacy of the database system can be maintained.
Disadvantages of Fragmentation:
(1) When data from different fragments are required, access may be very slow.
(2) Lack of back-up copies of data in different sites may render the database ineffective in case of failure of a site.

Discuss the distributed database system? Also explain the promises of DDBS. (2022-23)
A Distributed Database Management System (DDBMS) is a database management system that manages a database that is spread across multiple computers or sites. In a DDBMS, each site has its own database, and the databases are connected to each other to form a single, integrated system. The main advantage of a DDBMS is that it can provide higher availability and reliability than a centralized database system. Here are some of the promises of DDBS:
(1) Transparent management of distributed, fragmented, and replicated data.
(2) Improved reliability/availability through distributed transactions.
(3) Improved performance.
(4) Easier and more economical system expansion.
Distributed databases are used in many applications, including corporate management information systems, e-commerce, and online banking. They offer several advantages over centralized databases, including improved scalability, availability, performance, flexibility, fault tolerance, and security.

Discuss various technical problems that need to be resolved to realize the full potential of DDBMS. (2022-23)
What are the problem areas of Distributed Database?
Following are the problem areas of distributed databases:
(1) Distributed Concurrency Control: Distributed concurrency control specifies the synchronization of access to the distributed database such that the integrity of the database is maintained.
(2) Deadlock Management: When users request resources from the database, the database grants them if they are available; if they are not available, the user has to wait until the resources are released by another user. When waiting users block one another in this way, the situation is known as deadlock. Distributed deadlock is managed by different algorithms and techniques, such as avoidance and detection algorithms.
(3) Replication Control: Replication is a technique that only applies to distributed systems. A database is said to be replicated if the entire database or a portion of it (a table, some tables, one or more fragments, etc.) is copied and the copies are stored at different sites. The issue with having more than one copy of a database is maintaining the mutual consistency of the copies, that is, ensuring that all copies have identical schema and data content.
(4) Operating Environment: A distributed database environment requires an operating system chosen according to organizational needs. The operating system plays an important role in managing the distributed database. Sometimes the operating system does not support the distributed database.
(5) Transparent Management: Transparent management of data is one of the major problem areas in distributed databases. In a distributed database, data is situated in multiple locations and a number of users use that database.
To maintain the integrity of the database, transparent management of data is important.
(6) Security and Privacy: How to apply security policies to the interdependent system is a great issue in distributed systems. Since distributed systems deal with sensitive data and information, the system must have strong security and privacy measures. Protection of distributed system assets, including base resources (storage, communications and user-interface I/O) as well as higher-level composites of these resources (processes, files, messages, display windows and more complex objects), is an important issue in a distributed system.
(7) Resource Management: In distributed systems, objects consisting of resources are located in different places. Routing is an issue at the network layer of the distributed system and at the application layer. Resource management in a distributed system will interact with its heterogeneous nature.

Give a brief account of architectural models for distributed DBMS. (2022-23)
Explain architectures of Distributed DBMS.
Explain Architecture of Distributed system.
The basic types of distributed DBMS are as follows:
(1) Client-Server Architecture:
(a) A client-server architecture has a number of clients and a few servers connected in a network.
(b) A client sends a query to one of the servers. The earliest available server solves it and replies.
(c) A client-server architecture is simple to implement and execute due to the centralized server system.
(Figure: Client-Server Architecture)
(2) Collaborating Server Architecture:
(a) Collaborating server architecture is designed to run a single query on multiple servers.
(b) Servers break a single query into multiple small queries and the result is sent to the client.
(c) Collaborating server architecture has a collection of database servers. Each server is capable of executing the current transactions across the databases.
(Figure: Collaborating Server Architecture)
(3) Middleware Architecture:
(a) Middleware architectures are designed in such a way that a single query is executed on multiple servers.
(b) This system needs only one server which is capable of managing queries and transactions from multiple servers.
(c) Middleware architecture uses local servers to handle local queries and transactions.
(d) The software used for execution of queries and transactions across one or more independent database servers is called middleware.

6. Why distributed databases are essential? (2021-22)
What are distributed databases?
In distributed systems, data is distributed across the different database systems of an organization. These database systems are connected via communication links. Such links help the end-users to access the data easily. Examples of distributed databases are Apache Cassandra, HBase, Ignite, etc.
(Figure: Distributed Database System)
The above diagram is a typical example of a distributed database system, in which a communication channel is used to communicate with the different locations and every system has its own memory and database.

7. Explain briefly about Fragmentation with examples.
Fragmentation is a process of dividing the whole database into various sub-tables or sub-relations so that data can be stored in different systems. The small pieces of sub-relations are called fragments. These fragments are called logical data units and are stored at various sites. It must be made sure that the fragments are such that they can be used to reconstruct the original relation (i.e., there isn't any loss of data). The original table can be restored by the use of UNION or JOIN operations on the fragments. This process is called data fragmentation. The fragments are independent of one another. The users needn't be logically concerned about fragmentation, which means they should not be concerned that the data is fragmented; this is called fragmentation independence, or fragmentation transparency.
Advantages:
(1) As the data is stored close to the usage site, the efficiency of the database system will increase.
(2) Local query optimization methods are sufficient for some queries as the data is available locally.
(3) In order to maintain the security and privacy of the database system, fragmentation is advantageous.
Disadvantages:
(1) Access times may be very high if data from different fragments are needed.
(2) If we are using recursive fragmentation, then it will be very expensive.

We have three methods for data fragmenting of a table:
(1) Horizontal fragmentation
(2) Vertical fragmentation
(3) Mixed or Hybrid fragmentation
Let's discuss them one by one.

Horizontal Fragmentation: Horizontal fragmentation refers to the process of dividing a table horizontally by assigning each row (or a group of rows) of a relation to one or more fragments. These fragments are then assigned to different sites in the distributed system. Some of the rows or tuples of the table are placed in one system and the rest are placed in other systems. The rows that belong to the horizontal fragments are specified by a condition on one or more attributes of the relation. In relational algebra, horizontal fragmentation on table T can be represented as:
$\sigma_{p}(T)$
where $\sigma$ is the relational algebra operator for selection and $p$ is the condition satisfied by a horizontal fragment.

For example, consider an EMPLOYEE table (T):
Eno | Ename | Design | Salary | Dep
101 | A | abc | 3000 | 1
102 | B | abc | 4000 | 1
103 | C | abc | 5500 | 2
104 | D | abc | 5000 | 2
105 | E | abc | 2000 | 2

This EMPLOYEE table can be divided into different fragments like:
EMP 1 = $\sigma_{Dep=1}(EMPLOYEE)$
EMP 2 = $\sigma_{Dep=2}(EMPLOYEE)$
These two fragments are:
T1 fragment of Dep = 1:
Eno | Ename | Design | Salary | Dep
101 | A | abc | 3000 | 1
102 | B | abc | 4000 | 1
Similarly, the T2 fragment on the basis of Dep = 2 will be:
Eno | Ename | Design | Salary | Dep
103 | C | abc | 5500 | 2
104 | D | abc | 5000 | 2
105 | E | abc | 2000 | 2
Now, here it is possible to get back T as:
$T = T1 \cup T2 \cup \ldots \cup TN$

Vertical Fragmentation: Vertical fragmentation refers to the process of decomposing a table vertically by attributes or columns. In this fragmentation, some of the attributes are stored in one system and the rest are stored in other systems. This is because each site may not need all columns of a table. In order to take care of restoration, each fragment must contain the primary key field(s) of the table. The fragmentation should be done in such a manner that we can rebuild the table from the fragments by taking the natural JOIN operation, and to make this possible we need to include a special attribute called Tuple-id in the schema. For this purpose, a user can use any super key. By this, the tuples or rows can be linked together. The projection is as follows:
$\pi_{a1, a2, \ldots, an}(T)$
where $\pi$ is the relational algebra operator for projection, $a1, \ldots, an$ are the attributes of T, and T is the table (relation).
For example, the EMPLOYEE table can be split vertically into a fragment T1 and a fragment T2, each holding some of the columns. To get back to the original T, we join the fragments T1 and T2 as:
$\pi_{EMPLOYEE}(T1 \bowtie T2)$

Mixed Fragmentation: For defining this type of fragmentation we use the SELECT and the PROJECT operations of relational algebra. In some situations, the horizontal and the vertical fragmentation isn't enough to distribute the data, and in that case we need mixed fragmentation. Mixed fragmentation can be done in two different ways:
(1) The first method is to first create a set or group of horizontal fragments and then create vertical fragments from one or more of the horizontal fragments.
(2) The second method is to first create a set or group of vertical fragments and then create horizontal fragments from one or more of the vertical fragments.

Disadvantages of distributed databases:
(1) Complexity: DBAs may have to do extra work to ensure that the distributed nature of the system is transparent. Extra work must also be done to maintain multiple disparate systems, instead of one big one. Extra database design work must also be done to account for the disconnected nature of the database; for example, joins become prohibitively expensive when performed across multiple systems.
(2) Economics: Increased complexity and a more extensive infrastructure mean extra labour costs.
(3) Security: Remote database fragments must be secured, and they are not centralized, so the remote sites must be secured as well. The infrastructure must also be secured (for example, by encrypting the network links between remote sites).
(4) Difficult to Maintain Integrity: In a distributed database, enforcing integrity over a network may require too much of the network's resources to be feasible.
(5) Inexperience: Distributed databases are difficult to work with, and in such a young field there is not much readily available experience in "proper" practice.
(6) Lack of Standards: There are no tools or methodologies yet to help users convert a centralized DBMS into a distributed DBMS.

What is Distributed data processing (DDP)?
Distributed data processing is a method in which multiple computers across different locations share processing capability. In DDP, specific jobs are performed by computers that may be far removed from the user.

Write some approaches for distributed data storage.
(1) Homogeneous DDB: Those database systems which execute on the same operating system, use the same application process and carry the same hardware devices.
(Figure: Homogeneous Distributed System)
(2) Heterogeneous DDB: Those database systems which execute on different operating systems under different application procedures, and carry different hardware devices.
(Figure: Heterogeneous Distributed System)

13. Write the advantages of distributed database.
The advantages of a distributed database are as follows:
(1) Hardware, operating system, network and location independence.
(2) It provides continuous operation.

Some general features of distributed databases are:
Location Independency: Data is physically stored at multiple sites and managed by the DDBMS.
Distributed Query Processing: Distributed databases answer queries in a distributed environment that manages data at multiple sites. High-level queries are transformed into a query execution plan for simpler management.
Distributed Transaction Management: Provides a consistent distributed database through commit protocols, distributed concurrency control techniques, and distributed recovery methods in case of many transactions and failures.
Seamless Integration: Databases in a collection usually represent a single logical database, and they are interconnected.
Network Linking: All databases in a collection are linked by a network and communicate with each other.
Transaction Processing: Distributed databases incorporate transaction processing, where a transaction is a program including a collection of one or more database operations. Transaction processing is an atomic process that is either entirely executed or not executed at all.

Explain the function of DDBMS.
(1) Receive an application's request.
(2) Validate, analyze and decompose the request.
(3) Map the request's logical data components to physical data components.
(4) Search for, locate, read and validate the data.
(5) Ensure database consistency and security.

What are the components of DDBMS?
Computer Workstations (sites): These form the network system. The DDBMS must be independent of the computer system hardware.
Network Hardware & Software Components: These are present in each workstation. The network components allow all sites to interact and exchange data.
Because components such as computers, operating systems and network hardware are likely to be supplied by different vendors, the DDBMS functions must be able to run on multiple platforms.
Communication Media: These carry the data from one workstation to another. The DDBMS must be communications-media independent, that is, it must be able to support several types of communication media.
Transaction Processor (TP): It is a software component found in each computer that requests data. The TP receives and processes the application's data requests. The TP is also known as the application processor (AP) or the transaction manager (TM).
Data Processor (DP): It is a software component found in each computer that stores and retrieves data located at the site. The DP is also known as the data manager (DM).

What is Horizontal Fragmentation?
Horizontal fragmentation refers to the process of dividing a table horizontally by assigning each row (or a group of rows) of a relation to one or more fragments. These fragments are then assigned to different sites in the distributed system.

What is Vertical Fragmentation?
Vertical fragmentation refers to the process of decomposing a table vertically by attributes or columns. In this fragmentation, some of the attributes are stored in one system and the rest are stored in other systems. This is because each site may not need all columns of a table.

18. Write the parameters on which DDBMS architectures depend.
DDBMS architectures generally depend on three parameters:
(1) Distribution: It states the physical distribution of data across the different sites.
(2) Autonomy: It indicates the distribution of control of the database system and the degree to which each constituent DBMS can operate independently.
(3) Heterogeneity: It refers to the uniformity or dissimilarity of the data models, system components and databases.
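The horizontal and vertical fragmentation definitions above can be illustrated with a small sketch. It is not a DBMS implementation: relations are modelled as plain Python lists of dictionaries, the helper names (select, project, union, natural_join) are hypothetical, the Ename value for employee 105 is assumed, and the choice of which columns go into each vertical fragment is an assumption made for illustration. Eno is used as the shared key in place of a separate Tuple-id.

```python
# Illustrative sketch only: relations are lists of dicts, not a real DBMS.
# The rows follow the EMPLOYEE example; the Ename for Eno 105 is assumed.
EMPLOYEE = [
    {"Eno": 101, "Ename": "A", "Design": "abc", "Salary": 3000, "Dep": 1},
    {"Eno": 102, "Ename": "B", "Design": "abc", "Salary": 4000, "Dep": 1},
    {"Eno": 103, "Ename": "C", "Design": "abc", "Salary": 5500, "Dep": 2},
    {"Eno": 104, "Ename": "D", "Design": "abc", "Salary": 5000, "Dep": 2},
    {"Eno": 105, "Ename": "E", "Design": "abc", "Salary": 2000, "Dep": 2},
]


def select(relation, predicate):
    """Horizontal fragment: keep the rows that satisfy the predicate."""
    return [dict(row) for row in relation if predicate(row)]


def project(relation, attributes):
    """Vertical fragment: keep only the named columns of every row."""
    return [{a: row[a] for a in attributes} for row in relation]


def union(*fragments):
    """Rebuild a relation from horizontal fragments, ordered by Eno."""
    rows = [row for fragment in fragments for row in fragment]
    return sorted(rows, key=lambda row: row["Eno"])


def natural_join(left, right, key):
    """Rebuild a relation from two vertical fragments sharing a key column."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Horizontal fragments: selection on Dep, reconstructed with a union.
emp1 = select(EMPLOYEE, lambda row: row["Dep"] == 1)
emp2 = select(EMPLOYEE, lambda row: row["Dep"] == 2)

# Vertical fragments: each keeps the key Eno (column split is an assumption).
v1 = project(EMPLOYEE, ["Eno", "Ename", "Design"])
v2 = project(EMPLOYEE, ["Eno", "Salary", "Dep"])

rebuilt_horizontal = union(emp1, emp2)
rebuilt_vertical = natural_join(v1, v2, "Eno")
```

Both reconstructions return the original five rows, which is the "no loss of data" requirement: a union restores horizontally fragmented data, and a join on the shared key restores vertically fragmented data.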
Explain the Multi-DBMS Architectures.
This is an integrated database system formed by a collection of two or more autonomous database systems. Multi-DBMS can be expressed through six levels of schemas:
(1) Multi-database View Level: Depicts multiple user views comprising subsets of the integrated distributed database.
(2) Multi-database Conceptual Level: Depicts the integrated multi-database, which comprises global logical multi-database structure definitions.
(3) Multi-database Internal Level: Depicts the data distribution across different sites and the multi-database to local data mapping.
(4) Local Database View Level: Depicts the public view of local data.
(5) Local Database Conceptual Level: Depicts local data organization at each site.
(6) Local Database Internal Level: Depicts physical data organization at each site.
There are two design alternatives for multi-DBMS:
(1) Model with multi-database conceptual level.
(2) Model without multi-database conceptual level.
(Figure: Model with Multi-Database Conceptual Level)
(Figure: Model without Multi-Database Conceptual Level)

What is Hybrid Fragmentation?
Hybrid fragmentation combines horizontal and vertical fragmentation. Reconstruction of the original table is often an expensive task. Hybrid fragmentation can be done in two ways:
(1) At first, generate a set of horizontal fragments; then generate vertical fragments from one or more of the horizontal fragments.
(2) At first, generate a set of vertical fragments; then generate horizontal fragments from one or more of the vertical fragments.

The following factors encourage moving to a DDBMS:
(1) Distributed Nature of Organizational Units: Most organizations in the current times are subdivided into multiple units that are physically distributed over the globe. Each unit requires its own set of local data. Thus, the overall database of the organization becomes distributed.
(2) Need for Sharing of Data: The multiple organizational units often need to communicate with each other and share their data and resources in a synchronized manner.
(3) Support for Both OLTP and OLAP: Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) work upon diversified systems which may have common data. Distributed database systems aid both kinds of processing by providing synchronized data.
(4) Database Recovery: One of the common techniques used in DDBMS is replication of data across different sites. Replication of data automatically helps in data recovery if the database in any site is damaged. Users can access data from other sites while the damaged site is being reconstructed. Thus, database failure may become almost inconspicuous to users.
(5) Support for Multiple Application Software: Most organizations use a variety of application software, each with its specific database support. DDBMS provides a uniform functionality for using the same data among different platforms.

Explain the issues in designing Distributed Systems.
(1) Heterogeneity: Heterogeneity applies to the network, computer hardware, operating system and implementations by different developers. A key component of the heterogeneous distributed system client-server environment is middleware. Middleware is a set of services that enables applications and end-users to interact with each other across a heterogeneous distributed system.
(2) Openness: The openness of the distributed system is determined primarily by the degree to which new resource-sharing services can be made available to the users. Open systems are characterized by the fact that their key interfaces are published. An open system is based on a uniform communication mechanism and published interfaces for access to shared resources. It can be constructed from heterogeneous hardware and software.
(3) Scalability: The system should remain efficient even with a significant increase in the number of users and resources connected.
(4) Security: Security of an information system has three components: confidentiality, integrity and availability.
Encryption protects shared resources and keeps sensitive information secret when transmitted.
(5) Failure Handling: When faults occur in hardware or in the software program, it may produce incorrect results or stop before completing the intended computation, so corrective measures are needed.
(6) Transparency: The user should be unaware of where the services are located, and the transfer from a local machine to a remote one should be transparent.

QUERY PROCESSING AND DECOMPOSITION

Explain query processing. Also explain the layers of query processing.
What is meant by query processing?
Query processing is the activity performed in extracting data from the database. It takes various steps to fetch the data from the database.
There are two types of query processors:
(1) Single-Phase Commit Query Processor: Accesses and joins information from multiple data sources and performs updates to a single data source.
(2) Two-Phase Commit Query Processor: Accesses and joins information from multiple data sources and performs updates to multiple data sources.
There are four phases in typical query processing.
Parsing and Translation: After scanning the SQL query, the query is parsed to identify syntactical errors and to check the data types. Once this step is passed, the ...

Differentiate between homogeneous and heterogeneous DDBMS. (2022-23)
Feature | Homogeneous DDBMS | Heterogeneous DDBMS
DBMS | Same DBMS software is used at each node | Different DBMS software can be used at each node
Schema | Identical schema structure is used at each node | Different schema structures can be used at each node
Data Consistency | Changes made to the database on one node are automatically propagated to other nodes, ensuring data consistency | Data consistency is not guaranteed, as different DBMS software may use different data models
Complexity | Homogeneous databases are relatively easier to manage as the same DBMS software is employed throughout the system | Heterogeneous databases are more complex to manage as different DBMS software is employed throughout the system
Flexibility | Homogeneous databases offer less flexibility as all nodes must use the same DBMS software and schema structure | Heterogeneous databases offer more flexibility as nodes can use different DBMS software

Briefly explain the objectives of distributed query processing. (2022-23)
Discuss the objectives of distributed query processing in detail. (2021-22)
What are the objectives of query processing?
What is query processing in a relational database? Explain in detail with an example. How does it differ from distributed query processing? Explain the various phases of query processing.
The main objective of query processing in a distributed environment is to transform a high-level query on a distributed database, which is seen as a single database by the users, into an efficient execution strategy expressed in a low-level language on local databases.

Query Processing and its Phases: Query processing is the activity performed in extracting data from the database. It takes various steps to fetch the data from the database. The steps involved are:
(1) Parsing and translation
(2) Optimization
(3) Evaluation

Parsing and Translation: Query processing includes certain activities for data retrieval. Initially, the user query, given in a high-level database language such as SQL, is translated into expressions that can be used at the physical level of the file system. After this, the actual evaluation of the query and a variety of query-optimizing transformations take place. SQL (Structured Query Language) is the most suitable choice for humans, but it is not well suited to the internal representation of the query within the system. Thus, before processing a query, the system needs to translate it from this human-readable language into an internal form. Relational algebra is well suited for the internal representation of a query. The translation process in query processing is similar to the work of a parser. When a user executes a query, the parser checks the syntax of the query and verifies the names of the relations, the tuples and finally the required attribute values in order to generate the internal form of the query. The parser creates a tree of the query, known as the parse tree, and then translates it into relational algebra. In doing so, it also replaces every use of a view in the query.
(Figure: Steps in Query Processing)

Suppose a user executes a query. As we have learned, there are various methods of extracting the data from the database. Suppose a user wants to fetch the names of the employees whose salary is greater than 10000. For doing this, the following SQL query is undertaken:
select emp_name from Employee where salary>10000;
To make the system understand the user query, it needs to be translated into relational algebra. We can bring this query into relational algebra form as:
(1) $\pi_{emp\_name}(\sigma_{salary>10000}(Employee))$
(2) $\pi_{emp\_name}(\sigma_{salary>10000}(\pi_{emp\_name,\,salary}(Employee)))$
After translating the given query, we can execute each relational algebra operation by using different algorithms. In this way, query processing begins its work.

Evaluation: In addition to the relational algebra translation, the translated relational algebra expression must be annotated with instructions that specify how to evaluate each operation. Thus, after translating the user query, the system executes a query evaluation plan.

Optimization:
(1) The cost of query evaluation can vary for different types of queries. Since the system is responsible for constructing the evaluation plan, the user need not write the query efficiently.
(2) Usually, a database system generates an efficient query evaluation plan, which minimizes its cost. This task is performed by the database system and is known as query optimization.
(3) For optimizing a query, the query optimizer should have an estimated cost analysis of each operation. This is because the overall operation cost depends on the memory allocated to the operations, execution costs, and so on.
Finally, after selecting an evaluation plan, the system evaluates the query and produces the output of the query.

Discuss the query optimization? Explain distributed cost model with an example. (2022-23)
What is query optimisation? List distributed query optimization algorithms and explain any one of them.
Query optimization is the process of selecting the most efficient query-evaluation plan from among the many strategies usually possible for a given query. It does not expect users to write their queries so that they can be processed efficiently. Rather, it expects the system to construct a query evaluation plan that minimizes the cost of query evaluation.
Distributed Query Optimization Algorithms:
(1) Semijoin Algorithm
(2) INGRES Algorithm
(3) System R Algorithm
(4) System R* Algorithm
(5) Hill Climbing Algorithm
(6) SDD-1 Algorithm
Hill Climbing Algorithm:
(1) Refinements of an initial feasible solution are recursively computed until no more cost improvements can be made.
can boo using tho following formul ‘Total Cont’ Data ‘ranafor Cost + Processing, b+ Coors wsfor cost ia the cost of transferring data teny USO) wo sites, In this case, the data transfor ‘y global execution schedule thi coat is caleulntod us follow Gucos all intorsite communication Data Transfor Cost = Size of EMPLOYEE table * (ii) Determine the candidate result sites, jf Coat por byte + Sizo of DEPARTMENT table * Cont whore a relation referenced in the quer per byto Sl a (8) ‘Tho processing cost is the cost of processing the query Gii) Compute the cost of transferring all the at aaah lia play ye case, the processing cost is other eons relations to ‘each | Processing Cast = Cost por CPU cycle * Number ww 80s candigate site with mininmam cost of CPU cycles required, to process the query at each ix) ESO= candidate site w ’ ce iH) Split 90 into ewo stratepinn + RSE fellowes hy (4) The coordination cost is the cost of coordinating the we results of the query across all sites. In this case, the (@_ ESI: send one of the relations involved in ‘coordination cost is calculated as follows: the join to the other relation’s site — ' Coordination Cost = Size of Result Set * Cost per Gi) ES2: send the join result to the ‘final result byte site By estimating the data transfer cost, processing cost, {c) Replace ESO with the split schedule which gives and, coordination cost, we can estimate the total cost of (@) Recursively apply steps 2 and 3 on ES1 and ES2 executing the query on the distributed database system. until no more benefit can be gained j (e) Check for redundant transmissions in the final What are homogenous and heterogeneous database. E Give the architecture of heterogeneous database PEE (rains along with some query processing issues. (2021-22) In a Distributed Database System (DDBMS), a distributed cost model is used to estimate the cost of executing a query on a distributed database. 
The cost model takes into account the cost of transferring data between sites, the cost of processing the query at each site, z and the cost of coordinating the results of the query across [Patines Dass Ervonnen all sites. Here is an example of a distributed cost model in a DDBMS: ‘ Consider # distributed database system with three sites: Site 1, Site 2, and Site 8. Suppose we want to execute 4 query that involves joining two tables, EMPLOYEE and DEPARTMENT, which ate stored at Site 1 and Site 2, respectively. The query requires transferring data betw: : the two sites and processing the query at cachalign er. Distributed databases can be broadly classified into homogeneous and heterogeneous distributed database } environments, each with further sub-divisions, as shown in the following illustration, emopenaove], [Retereganarza] fisrhaeeonon] [Ferd] [meses DisteiouteD DATABASE SYSTEMS (89) CTT A Query processing in a distributed database management aye en the transmission of data between the computers in a strategy for a query in the ordering of data transmissions and local data processing in a database syatem. Generally, her sites and cooperal a query in Distributed DBMS requires data from multiple user requests. sites, and this need for data from different sites is rough a single interfa called the transmiasion’of data that causes communication aaa costs, = PL ee 4 Query processing in DBMS is different from query Es errogencous distributed database, differs processing in centralized DBMS duc to this communication sent operating systems, DBMS products ai cost of data transfer over the network. The transmission m yals Tee propartiee ar f cost is low when sites are connected through high-speed milar schemas and software/@MMlMl Networks and is quite significant in other networks ites use dist composed of a variety of DBM 6, Explain the architeclure of Distributed Query Processing. 
H In a ‘distributed database system, processing a schemas query comprises of optimization at both the global and the G4) Transaction props local evel. ‘The query enters’ the database system at the cliont or controlling site. Here, the user is validated, the query. ie. checked, translated, and optimized at a global level The architecture can be represented as : software. 4 A site may not be aware of other sites and so thore i limited co-operation in processing user r at Global Level Distributed Execution Manager (Figure : Heterogeneous Distributed System) t ea esanin Heterogeneous distributed database system i network of two or more data bases with diffe 5, DBMS software, which can be stored on one or mOq machines. In this system data can be accessible to seve ses in the network with the help of genefi tivity (ODBC and JDBC), Local Query Optimization ‘Local Query {Local Query Optimization (Figur) 12,_What ts Query optimization In centralized Systema? coas path ix determine oe | alternative accoss paths are dorived for the a algebra exprossion, ‘This chapter focus o ‘ optimization in contralized system, 8 : pio iat? Proconting for a contralized aystem is done to achiove ! (1) ‘The response time of a query is minimized. mization is the process of selecting (2). The system throughput is maximized jan for evaluating the query. Aft (3) Tho memory and storage used for processing is parsed query is passed to quel reduced. different execution plans Parallelism is increased, \d select the plan with’ lea after the inal on query 8 Query oP efficient executio parsing of the que~7 optimizer, which gener evaluate parsed query an 13. What is Query Parsing and Translation? ii F After scanning the SQL query, the query is parsed for 8. _What is meant by query Decomposition? 
identifying the syntactical errurs and the data types are ‘The query decomposition is the first phase of quer corrected.’ Onco ‘this step is passed, the query is .ssing whose aims are to transform, a high-level, query’ decomposed into”'several smaller blocks of query. into a relational algebra query and to check whether that WME Each of the block is translated into sclation algebra query is syntactically and semantically correct. Thus, a expression, query decomposition phase starts with a high-level query fale - —— - nd transforms into a query graph of low-level operations/qqll\14. What is Query Optimization Issues in DDBMS? (algebraic expressions), which satisfies the query. ; In DDBMS, query. optimization is a crucial task. 10. What is Data Shipping? i ‘The complexity is high since number of alternative strategies may. increase exponentially due to the following In data shipping, the data fragments are transfe1 factors: to the database server, where the operations are executed} (1) The presence of a number of fragments, This is used in operations where the operands. a 2) Distribution of the fragments or tables across various distributed at different sites. This is also appropriate in} sites.” systems where the communication costs are low, and lo (3). The speed of communication links. processors are much slower than the client server. *(4). Disparity in local processing capabilities. Hence, in a distributed system, the target is often to find a good execution strategy for query processing rather than the best one. The time to execute a query is the sum. » of the following | (1)' ‘Time to communicate queries ta databases. (2) Time to execute local query fragments. (3) Time to assemble data from different sites. ’ results to thy application. Data localization takes as input’ the decompo: query on global relations and applies data distributigh information to the query in order to localize its data. 
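The localization step can be illustrated with a small sketch. It assumes, purely for illustration, that EMPLOYEE is horizontally fragmented by salary range across three sites; the fragment names, sites and ranges below are hypothetical and do not come from the notes. Given the salary range a query asks for, the function keeps only the fragments that can contain matching tuples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A horizontal fragment of EMPLOYEE holding salaries in [low, high)."""
    name: str
    site: str
    low: float
    high: float


def localize(fragments, low, high):
    """Return the fragments whose salary range overlaps the query range [low, high)."""
    return [f for f in fragments if f.low < high and low < f.high]


# Hypothetical fragmentation schema (an assumption for this example).
FRAGMENTS = [
    Fragment("EMP1", "Site 1", 0, 10_000),
    Fragment("EMP2", "Site 2", 10_000, 50_000),
    Fragment("EMP3", "Site 3", 50_000, float("inf")),
]

# Query: employees with salary >= 20000. EMP1 cannot contribute, so it is pruned.
relevant = localize(FRAGMENTS, 20_000, float("inf"))
print([(f.name, f.site) for f in relevant])
```

Only the surviving fragments need to be contacted, which is what turns the query on the global relation into a fragment query.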
Data localization determines which fragments are involved in the query and thereby transforms the distributed query into a fragment query.

In what ways can the execution of a single query be parallelized?
Execution of a single query can be parallelized in two ways :
(1) Intraoperation Parallelism : Each individual operation of the query (such as a selection, sort or join) is executed in parallel on multiple processors and disks.
(2) Interoperation Parallelism : The different operations of the query expression are executed in parallel with one another.
By contrast, in interquery parallelism, different queries or transactions execute in parallel with one another. This form of parallelism can increase transaction throughput.

Write a short note on distributed query optimization.
How is a distributed query optimized?
Distributed query optimization requires the evaluation of a large number of query trees, each of which produces the required results of the query. This is primarily due to the presence of large amounts of replicated and fragmented data. Hence, the target is to find a good solution at acceptable cost rather than the single best solution.
The main issues for distributed query optimization are :
(1) Optimal utilization of resources in the distributed system
(2) Query trading
(3) Reduction of the solution space of the query

Optimal Utilization of Resources in the Distributed System : A distributed system has a number of database servers at the various sites to perform the operations pertaining to a query. The following are the approaches for optimal resource utilization :
Operation Shipping : The operation is run at the site where the data is stored, and the results are then transferred to the client site. This is appropriate for operations whose operands are available at the same site, for example Select and Project operations.
Data Shipping : The data fragments are transferred to the database server, where the operations are executed. This is used in operations where the operands are distributed at different sites. It is also appropriate in systems where the communication costs are low and local processors are much slower than the client server.
Hybrid Shipping : This is a combination of data and operation shipping. Here, data fragments are transferred to the high-speed processors, where the operation runs. The results are then sent to the client site.
(Figure : Data shipping, operation shipping and hybrid shipping)

Query Trading : In the query trading algorithm for distributed database systems, the controlling/client site for a distributed query is called the buyer, and the sites where the local queries execute are called sellers. The buyer formulates a number of alternatives for choosing sellers and for reconstructing the global results. The target of the buyer is to achieve the optimal cost.
The algorithm starts with the buyer assigning sub-queries to the seller sites. The optimal plan is created from the locally optimized query plans proposed by the sellers, combined with the communication cost of reconstructing the final result. Once the global optimal plan is formulated, the query is executed.

Reduction of Solution Space of the Query : An optimal solution generally involves reducing the solution space so that the cost of query and data transfer is reduced. This can be achieved through a set of heuristic rules, just as heuristics are used in centralized systems.

Steps of query optimization in a centralized system :
Step 1, Query Tree Generation : A query tree represents a relational algebra expression. During execution, an internal node is executed whenever its operand tables are available. The node is then replaced by the result table. This process continues until the root node is executed and replaced by the result table.
For instance, consider the following schemas :
EMPLOYEE (EmpID, EName, Salary, ...)
DEPARTMENT (DNo, ..., Location)
Example 1 : The query considered is as follows :
$$\pi_{EmpID}(\sigma_{EName = \text{"ArunKumar"}}(EMPLOYEE))$$
The query tree appears as follows :

(Figure : Query tree with EMPLOYEE as the leaf, the selection on EName above it, and the projection on EmpID at the root)

Step 2, Query Plan Generation : After the query tree is generated, a query plan is made for the sequence of relational algebra operations. The plan states how the intermediate tables should be passed from one operator to the next and how the operations should be performed.
Step 3, Code Generation : This is the final step, in which the executable form of the query is produced.

Projection : The projection operation displays a subset of the fields of a table. This gives a vertical partition of the table.
Syntax in Relational Algebra :
$$\pi_{<AttributeList>}(<TableName>)$$
For example, consider a STUDENT table. If we want to display the names and courses of all students, we use the following relational algebra expression :
$$\pi_{Name, Course}(STUDENT)$$

Selection : The selection operation displays a subset of the tuples of a table that satisfies certain conditions.
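As a rough illustration of projection and selection, the sketch below applies both operations to a table held as a list of Python dictionaries. The three STUDENT rows are invented sample data (the original table is not reproduced in these notes), and `project` and `select` are hypothetical helper names.

```python
def project(rows, attrs):
    """Projection: keep only the listed attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:          # relations are sets, so duplicates are removed
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


def select(rows, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


# Invented sample rows, for illustration only.
STUDENT = [
    {"Roll_No": 1, "Name": "Asha", "Course": "BCA", "Semester": 1},
    {"Roll_No": 2, "Name": "Ravi", "Course": "MCA", "Semester": 3},
    {"Roll_No": 3, "Name": "Meena", "Course": "BCA", "Semester": 2},
]

names_and_courses = project(STUDENT, ["Name", "Course"])   # vertical partition
bca_students = select(STUDENT, lambda r: r["Course"] == "BCA")  # horizontal partition
print(names_and_courses)
print(bca_students)
```

Projection narrows the columns while selection narrows the rows, so the two compose freely, for example `project(select(STUDENT, ...), ["Name"])`.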
Selection gives a horizontal partition of the table.
Syntax in Relational Algebra :
$$\sigma_{<Conditions>}(<TableName>)$$
For example, in the STUDENT table, if we want to display the details of all students who have opted for the BCA course, we use the following relational algebra expression :
$$\sigma_{Course = \text{"BCA"}}(STUDENT)$$

Combination of Projection and Selection Operations : For most queries, we need a combination of projection and selection operations. There are two ways to write these expressions :
(1) Using a sequence of operations.
(2) Using the rename operation to generate intermediate results.
For example, to display the names of all female students of the BCA course :
(i) Relational algebra expression using a sequence of projection and selection operations :
$$\pi_{Name}(\sigma_{Gender = \text{"Female"} \land Course = \text{"BCA"}}(STUDENT))$$
(ii) Relational algebra expression using the rename operation to generate intermediate results :
$$FemaleBCAStudent \leftarrow \sigma_{Gender = \text{"Female"} \land Course = \text{"BCA"}}(STUDENT)$$
$$Result \leftarrow \pi_{Name}(FemaleBCAStudent)$$

Union : If P is the result of an operation and Q is the result of another operation, the union of P and Q ($$P \cup Q$$) is the set of all tuples that are in P, in Q, or in both, without duplicates.
For example, to display all students who are either in Semester 1 or in the BCA course :
$$Sem1Student \leftarrow \sigma_{Semester = 1}(STUDENT)$$
$$BCAStudent \leftarrow \sigma_{Course = \text{"BCA"}}(STUDENT)$$
$$Result \leftarrow Sem1Student \cup BCAStudent$$

Intersection : If P is the result of an operation and Q is the result of another operation, the intersection of P and Q ($$P \cap Q$$) is the set of all tuples that are in both P and Q.
For example, consider two schemas : EMPLOYEE, whose attributes include City and Department, and PROJECT (PId, City, Department, Status).
To display the names of all cities where a project is located and an employee also resides :
$$CityEmp \leftarrow \pi_{City}(EMPLOYEE)$$
$$CityProject \leftarrow \pi_{City}(PROJECT)$$
$$Result \leftarrow CityEmp \cap CityProject$$

Minus : If P is the result of an operation and Q is the result of another operation, P - Q is the set of all tuples that are in P and not in Q.
For example, to list all the departments which do not have an ongoing project (projects with Status = ongoing) :
$$AllDept \leftarrow \pi_{Department}(EMPLOYEE)$$
$$ProjectDept \leftarrow \pi_{Department}(\sigma_{Status = \text{"ongoing"}}(PROJECT))$$
$$Result \leftarrow AllDept - ProjectDept$$

Join : The join operation combines related tuples of two different tables (results of queries) into a single table.
For example, consider two schemas, CUSTOMER (CustID, AccNo, ..., BranchID, ...) and BRANCH (BranchID, BranchName, IFSCcode, ...).
To list the customer details along with the branch details :
$$Result \leftarrow CUSTOMER \bowtie_{Customer.BranchID = Branch.BranchID} BRANCH$$

What are Transparencies in a distributed database? (2022-23)
Transparency in a DDBMS means that the system hides the details of how the data is distributed and how the system is implemented from the user. For example, in a normal DBMS, data independence is a form of transparency that hides changes in the definition and organization of the data from the user. All forms of transparency have the same overall target : to make the use of the distributed database the same as the use of a centralized database.
Types of Transparency in a Distributed Database Management System : There are four types of transparency, which are as follows :
(1) Transaction Transparency : This transparency makes sure that all distributed transactions preserve the integrity and consistency of the distributed database. A distributed transaction accesses data stored at multiple locations.
Another thing to notice is that the DDBMS is responsible for maintaining the atomicity of every sub-transaction (by this, we mean that either the whole transaction takes place or none of it does). This is very complex due to the fragmentation, allocation, and replication structure of the DBMS.
(2) Performance Transparency : This transparency requires a DDBMS to perform as if it were a centralized database management system. The system should not suffer any loss of performance because its architecture is distributed. Likewise, a DDBMS must have a distributed query processor which can map a data request into an ordered sequence of operations on the local databases. This adds another complexity, which is again the fragmentation, replication, and allocation structure of the DBMS.
(3) DBMS Transparency : This transparency applies only to heterogeneous DDBMSs (sites with different operating systems, DBMS products and data models). It hides the fact that the local DBMS may be different. It is one of the most complicated transparencies to provide as a generalization.
(4) Distribution Transparency : This transparency lets the user see the database as a single thing or logical entity. If a DDBMS displays distribution transparency, the user does not need to know that the data is fragmented. Its types include :
    (a) Access transparency
    (b) Location transparency
    (c) Concurrency transparency

Define the Transaction. (2022-23)
Define Transaction?
A transaction can be defined as a group of tasks. A single task is the minimum processing unit, which cannot be divided further.
Let's take an example of a simple transaction. Suppose a bank employee transfers ₹ 500 from A's account to B's account. This very simple and small transaction involves several low-level tasks.
A's Account :
    Open_Account(A)
    Old_Balance = A.balance
    New_Balance = Old_Balance - 500
    A.balance = New_Balance
    Close_Account(A)
B's Account :
    Open_Account(B)
    Old_Balance = B.balance
    New_Balance = Old_Balance + 500
    B.balance = New_Balance
    Close_Account(B)

2. Differentiate between Shared lock & Exclusive lock. (2022-23)

Shared Lock | Exclusive Lock
Used for read-only operations. | Used for read and write operations.
Multiple transactions can hold a shared lock on the same data item. | Only one transaction can hold an exclusive lock on a data item.
Shared locks allow multiple transactions to read a resource, but none can modify it. | Exclusive locks allow only one transaction to access a resource for reading or writing.
Shared locks have a shorter wait time than exclusive locks. | Exclusive locks have a longer wait time than shared locks.

Draw transaction state diagram. (2022-23)
Explain Transaction States with diagram.
A transaction may go through a subset of five states : active, partially committed, committed, failed and aborted.
(1) Active : The initial state the transaction enters is the active state. The transaction remains in this state while it is executing read, write or other operations.
(2) Partially Committed : The transaction enters this state after the last statement of the transaction has been executed.
(3) Committed : The transaction enters this state after successful completion of the transaction, once the system checks have issued the commit signal.
(4) Failed : The transaction goes from the partially committed state or the active state to the failed state when it is discovered that normal execution can no longer proceed or the system checks fail.
(5) Aborted : This is the state after the transaction has been rolled back following a failure and the database has been restored to the state it was in before the transaction began.
The following state transition diagram depicts the states of a transaction and the low-level transaction operations that cause changes of state.

(Figure : Transaction state diagram)

Explain ACID properties with an example. (2022-23)
Define ACID properties with suitable examples.
(1) A - Atomicity
(2) C - Consistency
(3) I - Isolation
(4) D - Durability

(1) Atomicity : Atomicity means that the data remains atomic : if any operation is performed on the data, it should either be executed completely or not be executed at all. The operation should not break off in between or execute partially. When the operations of a transaction are executed, each operation should be executed completely, not partially.
Example : A transfer from Alice's account to Bob's account consists of deducting the amount from Alice's account and adding the amount to Bob's account. If the amount cannot be added to Bob's account, the deduction from Alice's account must be reverted.
(2) Consistency : Every attribute in the database has rules that ensure the stability of the database. The constraints placed on the data values should hold before and after the execution of the transaction. If the system fails because of invalid data while performing an operation, it should revert to its previous state.
Example : The total amount in Alice's and Bob's accounts should be the same before and after the transaction. If the sum of the money in the two accounts is unchanged, the transaction preserves consistency.
(3) Isolation : If multiple transactions are performed on a single database, an operation from one transaction should not interfere with the operations of other transactions. The execution of each transaction should be isolated from the other transactions.
Example : If another transaction (between Mac and Alice) is in progress, it should not have any effect on the transaction between Alice and Bob. Both transactions should be isolated.
(4) Durability : The three properties above must be satisfied while the transaction is in progress, but durability issues can arise even after the transaction has completed. Durability is therefore the ACID property that applies after completion of the transaction.
Example : A system may crash after all the operations have completed. When the system restarts, it should preserve the stable state : the amounts in Alice's and Bob's accounts should be the same before and after the restart.
The changes made during the transaction should exist after completion of the transaction.
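The Alice and Bob example can be sketched in a few lines. This is only an in-memory illustration of atomicity and consistency, not how a DBMS implements them : `transfer` is a hypothetical helper, the balances are made-up numbers, and rollback is modelled by restoring a snapshot taken before the transfer starts.

```python
class InsufficientFunds(Exception):
    """Raised when the source account cannot cover the transfer."""


def transfer(accounts, src, dst, amount):
    """Move `amount` from src to dst; on any failure restore the previous state."""
    snapshot = dict(accounts)            # state to roll back to
    try:
        if accounts[src] < amount:
            raise InsufficientFunds(src)
        accounts[src] -= amount          # deduct from the source account
        accounts[dst] += amount          # raises KeyError if dst does not exist
    except Exception:
        accounts.clear()
        accounts.update(snapshot)        # atomicity: undo the partial update
        raise


accounts = {"Alice": 100, "Bob": 100}    # made-up opening balances
total_before = sum(accounts.values())

transfer(accounts, "Alice", "Bob", 30)
print(accounts, sum(accounts.values()) == total_before)

try:
    transfer(accounts, "Alice", "Carol", 10)   # Carol has no account, so this fails
except KeyError:
    print("transfer rolled back:", accounts)
```

The failed transfer leaves both balances exactly as they were, and the total across the two accounts is the same before and after the successful transfer, which is the consistency condition stated in the example.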
Sometimes all the operations of a transaction complete but the system fails immediately afterwards. In that case, the changes made by the transaction should persist, and the system should return to its last stable state.

What are the functions of the Transaction Manager?

Discuss serializability.
Explain view serializability. (2021-22)
What is Distributed serializability?
Non-Serial Schedule : A schedule in which the operations of the transactions T1 and T2 overlap (are interleaved).
Serializability : A non-serial schedule is serializable if its effect is equivalent to that of some serial schedule of the same transactions.
Types of Serializability : There are two types of serializability, conflict serializability and view serializability.

Timestamp Ordering Protocol : Transactions receive timestamps in the order in which they are submitted to the system. The main idea of this protocol is to order the transactions based on their timestamps. A schedule in which the transactions participate is then serializable, and the only equivalent serial schedule permitted has the transactions in the order of their timestamp values. Stated simply, the schedule is equivalent to the particular serial order corresponding to the order of the transaction timestamps. The algorithm must ensure that, for each item accessed by conflicting operations in the schedule, the order in which the item is accessed does not violate this ordering. To ensure this, two timestamp values are kept for each database item X :
(1) W_TS(X) is the largest timestamp of any transaction that executed write(X) successfully.
(2) R_TS(X) is the largest timestamp of any transaction that executed read(X) successfully.

Explain deadlock prevention and deadlock avoidance. (2022-23)
Compare Distributed Deadlock Prevention with Distributed Deadlock Avoidance. Explain the scheme of Distributed Deadlock Detection and Recovery. (2021-22)
What is deadlock? Write the three classical approaches for deadlock handling.

Deadlock is a state of a database system having two or more transactions, in which each transaction is waiting for a data item that is locked by some other transaction. A deadlock is indicated by a cycle in the wait-for graph. This is a directed graph in which the vertices denote transactions and the edges denote waits for data items.
For example, in the following wait-for graph, transaction T1 is waiting for data item X, which is locked by T3. T3 is waiting for Y, which is locked by T2, and T2 is waiting for Z, which is locked by T1. Hence, a waiting cycle is formed, and none of the transactions can proceed.

(Figure : Wait-for graph with the cycle T1, T3, T2)

There are three classical approaches for deadlock handling, namely :
(1) Deadlock prevention
(2) Deadlock avoidance
(3) Deadlock detection and removal
All three approaches can be incorporated in both a centralized and a distributed database system.

Deadlock Prevention : The deadlock prevention approach does not allow any transaction to acquire locks that will lead to deadlocks. The convention is that when more than one transaction requests a lock on the same data item, only one of them is granted the lock.
One of the most popular deadlock prevention methods is pre-acquisition of all the locks. In this method, a transaction acquires all its locks before starting to execute and retains them for the entire duration of the transaction. If another transaction needs any of the already acquired locks, it has to wait until all the locks it needs are available. Using this approach, the system is prevented from being deadlocked, since none of the waiting transactions are holding any lock.

Deadlock Avoidance : The deadlock avoidance approach handles deadlocks before they occur. It analyzes the transactions and the locks to determine whether or not waiting leads to a deadlock.
When a transaction requests a lock on a data item, the lock manager checks whether the lock is available. If it is available, the lock manager allocates the data item and the transaction acquires the lock. However, if the item is locked by some other transaction in an incompatible mode, the lock manager runs an algorithm to test whether keeping the transaction in the waiting state will cause a deadlock or not. Accordingly, the algorithm decides whether the transaction can wait or one of the transactions should be aborted.
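The wait-for graph used above to define deadlock can be checked for a cycle with a depth-first search. The sketch below is one minimal way to do it, assuming the graph is given as a dictionary that maps each transaction to the transactions it is waiting for; `has_deadlock` is a hypothetical helper name, and real lock managers may use other detection schemes.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2         # unvisited, on the current path, finished
    colour = {}

    def visit(txn):
        colour[txn] = GREY
        for holder in wait_for.get(txn, ()):
            state = colour.get(holder, WHITE)
            if state == GREY:            # back edge: a waiting cycle exists
                return True
            if state == WHITE and visit(holder):
                return True
        colour[txn] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(wait_for))


# The example from the text: T1 waits for T3, T3 waits for T2, T2 waits for T1.
cyclic = {"T1": ["T3"], "T3": ["T2"], "T2": ["T1"]}
# Same graph without the last wait, so T2 can finish and release its lock.
acyclic = {"T1": ["T3"], "T3": ["T2"], "T2": []}

print(has_deadlock(cyclic), has_deadlock(acyclic))
```

A lock manager could run such a check either periodically or before letting a transaction wait, by testing the graph with the new wait edge added.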
Two algorithms are used for deadlock avoidance, namely wait-die and wound-wait. Let us assume that there are two transactions, T1 and T2, where T1 tries to lock a data item which is already locked by T2. The algorithms are as follows :
(1) Wait-Die : If T1 is older than T2, T1 is allowed to wait. Otherwise, if T1 is younger than T2, T1 is aborted and later restarted.
(2) Wound-Wait : If T1 is older than T2, T2 is aborted and later restarted. Otherwise, if T1 is younger than T2, T1 is allowed to wait.

Deadlock Detection and Removal : The deadlock detection and removal approach runs a deadlock detection algorithm periodically and removes the deadlock in case there is one. It does not check for deadlock when a transaction places a request for a lock. When a transaction requests a lock, the lock manager checks whether it is available. If it is available, the transaction is allowed to lock the data item; otherwise the transaction is allowed to wait.
Since there are no precautions while granting lock requests, some of the transactions may be deadlocked. To detect deadlocks, the lock manager periodically checks whether the wait-for graph has cycles. If the system is deadlocked, the lock manager chooses a victim transaction from each cycle. The victim is aborted and rolled back, and then restarted later. Some of the methods used for victim selection are :
(1) Choose the youngest transaction.
(2) Choose the transaction with the fewest data items.
(3) Choose the transaction that has performed the least number of updates.
(4) Choose the transaction with the least restart overhead.
(5) Choose the transaction which is common to two or more cycles.
This approach is primarily suited for systems in which the number of transactions is low and a fast response to lock requests is needed.

Deadlock Handling in Distributed Systems : Transaction processing in a distributed database system is also distributed, i.e. the same transaction may be processing at more than one site. The two main deadlock handling concerns in a distributed database system that are not present in a centralized system are transaction location and transaction control.

Two-Phase Commit Protocol :
Phase- 1st :
(1) The coordinator (Ci) places a <prepare T> record on the log at its site.
(2) The coordinator (Ci) sends a prepare T message to all the sites where the transaction T executed.
(3) On receiving this message, the transaction manager at each site decides whether to commit or abort its portion of T. A site can delay if its portion has not yet completed, but it must eventually send a response.
(4) If the site does not want to commit, it writes a <no T> record to its log, and the local transaction manager sends an abort T message to Ci.
(5) If the site wants to commit, it writes a <ready T> record to its log, and the local transaction manager sends a ready T message to Ci. Once the ready T message has been sent, nothing can prevent the site from committing its portion of transaction T except the coordinator.

(Figure : Messaging in Phase- 1st)

Phase- 2nd : The second phase starts when the coordinator (Ci) has received the responses, abort T or ready T, from all the sites that are collaboratively executing the transaction T. However, it is possible that some site fails to respond : it may be down, or it may have been disconnected from the network. In that case, after a suitable timeout period, the coordinator treats the site as if it had sent abort T. The fate of the transaction depends upon the following points :
(1) If the coordinator receives ready T from all the participating sites of T, it decides to commit T. The coordinator then writes a <commit T> record to its site log and sends a commit T message to all the sites involved in T.
(2) If a site receives a commit T message, it commits its component of T at that site and writes <commit T> to its log.
(3) However, if the coordinator has received abort T from one or more sites, it logs <abort T> at its site and then sends abort T messages to all sites involved in transaction T.
(4) If a site receives an abort T message, it aborts its component of T and writes <abort T> to its log.

Disadvantages :
(1) The major disadvantage of the two-phase commit protocol is that failure of the coordinator site may result in blocking, so a decision either to commit or to abort transaction T may have to be postponed until the coordinator recovers.
(2) Blocking Problem : Consider a scenario in which transaction T holds locks on data items at active sites, the coordinator fails during execution, and the active sites hold no log record other than <ready T>, that is, neither <abort T> nor <commit T>. It then becomes impossible to determine what decision has been made (whether to commit T or to abort T). In that case, the final decision is delayed until the coordinator is restored or fixed.
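The two phases described above can be sketched as a small single-process simulation. This is an illustrative model only : messages are plain method calls, a site that never answers is modelled by returning `None` from `prepare`, and the class and function names (`Participant`, `two_phase_commit`) are hypothetical.

```python
from enum import Enum


class Vote(Enum):
    READY = "ready"
    ABORT = "abort"


class Participant:
    """A site that executes one portion of transaction T."""

    def __init__(self, name, will_commit=True, responds=True):
        self.name = name
        self.will_commit = will_commit
        self.responds = responds
        self.log = []

    def prepare(self, txn):
        """Phase 1: answer the prepare message; None models a timeout."""
        if not self.responds:
            return None
        if self.will_commit:
            self.log.append(f"<ready {txn}>")
            return Vote.READY
        self.log.append(f"<no {txn}>")
        return Vote.ABORT

    def finish(self, txn, decision):
        """Phase 2: apply the coordinator's decision and log it."""
        self.log.append(f"<{decision} {txn}>")


def two_phase_commit(txn, participants):
    """Run both phases and return (decision, coordinator log)."""
    coordinator_log = [f"<prepare {txn}>"]
    votes = [p.prepare(txn) for p in participants]
    # A missing reply is treated as if the site had sent abort T.
    decision = "commit" if all(v is Vote.READY for v in votes) else "abort"
    coordinator_log.append(f"<{decision} {txn}>")
    for p in participants:
        if p.responds:                   # unreachable sites learn the outcome later
            p.finish(txn, decision)
    return decision, coordinator_log


sites = [Participant("S1"), Participant("S2")]
decision, coordinator_log = two_phase_commit("T", sites)
print(decision, coordinator_log, sites[0].log)
```

The sketch deliberately leaves out coordinator failure. A participant that has logged <ready T> but never receives the decision is exactly the blocked site described under Disadvantages.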
In some cases, it may take a day or many hours to restore, and during this period the locked data items remain inaccessible to other transactions (Ti). This problem is known as the Blocking Problem.

Backward Recovery : Moving the system from an incorrect state back into a formerly accurate one is backward recovery.

Forward Recovery : Rather than returning the system to a previous, checkpointed state when it has entered an incorrect state, an effort is made to place the system in a correct new state from which it can continue to operate. A limitation of forward error recovery techniques is that potential errors must be anticipated in advance. Only then is it feasible to correct those mistakes and move to a new state.

Recoverable Schedule : [...] A schedule in which no read or write operation [...] of a transaction [...] is called a cascadeless schedule.

12. Define Moss Concurrency protocol?

This protocol is used to control concurrency in a distributed database environment. It is mainly used for handling nested (hierarchical) transactions, in which locks are passed on by inheritance.

Consider a transaction (T) that acquires a lock on a data item (S) in some mode (M). T holds the lock in mode M until it terminates. When a subtransaction (T1) of T commits, its parent transaction inherits that lock and retains it until the parent itself terminates. If a transaction holds a lock on a data item (X), it has the right to access the locked data item X in the corresponding mode. However, this right does not apply when the transaction merely retains a lock inherited from one of its subtransactions (descendants). A retained lock is only a placeholder: it indicates that transactions outside the retainer's hierarchy cannot acquire the lock, but descendants of the retainer can. As soon as a transaction becomes a retainer of a lock, it remains a retainer of that lock until it terminates.

Why is recovery in a distributed DBMS more complicated than in a centralized system? (2021-22)

To recover from database failure, database management systems resort to a number of recovery management techniques. The typical strategies for database recovery are :
(1) In case of soft failures that leave the database inconsistent, the recovery strategy includes transaction undo or rollback. Sometimes, however, transaction redo may also be adopted to bring the transaction back to a consistent state.
(2) In case of hard failures that cause extensive damage to the database, recovery strategies encompass restoring a past copy of the database from archival backup. A more current state of the database is obtained by redoing the operations of committed transactions from the transaction log.

Recovery from Power Failure : Power failure causes loss of information in non-persistent memory. When power is restored, the operating system and the database management system restart. The recovery manager initiates recovery from the transaction logs. In case of immediate update mode, the recovery manager takes the following actions :
(1) Transactions in the active list and failed list are undone and written onto the abort list.
(2) Transactions in the before-commit list are redone.
(3) No action is taken for transactions in the commit or abort lists.

Recovery from Disk Failure : A disk failure or hard crash causes a total database loss. To recover from this hard crash, a new disk is prepared, then the operating system is restored, and finally the database is recovered using the database backup and the transaction log. The recovery method is the same for both immediate and deferred update modes. The recovery manager takes the following actions :
(1) The transactions in the commit list and before-commit list are redone and written onto the commit list in the transaction log.
(2) The transactions in the active list and failed list are undone and written onto the abort list in the transaction log.

Checkpointing : A checkpoint is a point in time at which a record is written onto the database from the buffers. As a consequence, in case of a system crash, the recovery manager does not have to redo the transactions that were committed before the checkpoint. Periodic checkpointing shortens the recovery process. The two types of checkpointing techniques are :
(1) Consistent checkpointing
(2) Fuzzy checkpointing

Consistent Checkpointing : Consistent checkpointing creates a consistent image of the database at the checkpoint. During recovery, only those transactions that are on the right side of the last checkpoint are undone or redone. The transactions to the left side of the last consistent checkpoint are already committed and needn't be processed again. The actions taken for checkpointing are [...]

Fuzzy Checkpointing : In fuzzy checkpointing, at the time of the checkpoint, all the active transactions are written in the log. In case of power failure, the recovery manager processes only those transactions that were active during the checkpoint and later. The transactions that were committed before the checkpoint are already written to the disk and hence need not be redone.

Transaction Recovery Using UNDO / REDO : Transaction recovery is done to eliminate the adverse effects of faulty transactions rather than to recover from a failure. Faulty transactions include all transactions that have changed the database into an undesired state and the transactions that have used values written by the faulty transactions. Transaction recovery in these cases is a two-step process :
(1) UNDO all faulty transactions and transactions that may be affected by the faulty transactions.
(2) REDO all transactions that are not faulty but have been undone due to the faulty transactions.

15. Discuss the issues to achieve atomicity commit in a distributed transaction management system. (2021-22)

A distributed transaction is a transaction in which multiple servers are involved. In a simple distributed transaction, a client calls multiple servers, whereas in a nested transaction a server calls another server. A transaction that executes at many sites must either be committed at all sites or aborted at all sites; it must never be committed at one site and aborted at another. Distributed systems use a distributed commitment rule to ensure atomicity across sites. Atomic commitment arises from the need for cooperation across a variety of systems.
Atomic Commit : An atomic commit must uphold the following transaction properties :
(1) Atomicity : All data changes are treated as if they were a single operation. That is, either all of the modifications are made, or none of them are made. For example, in an application that transfers money from one account to another, the atomicity feature assures that if a debit is successfully made from one account, the matching credit is made to the other account.
(2) Consistency : This property implies that when a transaction begins and ends, the state of the data is consistent. For example, in an application that transfers funds from one account to another, it ensures that the value remains consistent at the start and end of a transaction.
(3) Isolation : Concurrently running transactions appear to be serialized. For example, in an application that transfers funds between two accounts, the isolation property assures that another transaction sees the transferred funds in one account or the other, but not in both, nor in neither.
(4) Durability : Changes to data persist when a transaction completes successfully and are not undone, even if the system fails. For example, in an application that transfers money from one account to another, it assures that the modifications made to each account will not be reversed.

Coordination in Distributed Transactions : During coordination of a distributed transaction, one of the servers becomes the coordinator, and the rest become workers.
(1) In a simple transaction, the first server acts as the Coordinator.
(2) In a nested transaction, the top-level server acts as the Coordinator.
(3) Role of Coordinator : The coordinator keeps track of the participating servers, gathers results from the workers, and makes a decision to ensure transaction consistency.
(4) Role of Workers : Workers are aware of the coordinator's existence, communicate their outcome to it, and follow the coordinator's decision.

Distributed One-Phase Commit : The one-phase commitment protocol involves [...]
(1) [...] all participants [...] conclusion.
(2) If any participant decides to commit, then all other participants must have voted yes.

Distributed Two-Phase Commit : There are two phases.
Phase 1 :
(1) The coordinator sends a "prepare" message to each worker, asking whether it is ready to commit.
(2) The coordinator must wait until a response, whether "ready" or "no", is received from each worker, or a timeout occurs.
(3) Workers must wait until the coordinator sends the "prepare" message.
(4) If a transaction is ready to commit, then a "ready" message is sent to the coordinator.
(5) If a transaction is not ready to commit, then a "no" message is sent to the coordinator, resulting in the transaction being aborted.
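As a rough illustration of the voting steps above, the following Python sketch models a coordinator that collects "ready" / "no" replies from its workers. The class names `Worker` and `Coordinator`, the fields `can_commit` and `responds`, and the use of plain method calls in place of network messages are all assumptions made for illustration. The second phase shown here (broadcasting the final decision) follows the usual two-phase commit pattern and is not taken from these notes, which describe only the first phase.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Worker:
    """Hypothetical participant; `can_commit` and `responds` are illustrative."""
    name: str
    can_commit: bool = True
    responds: bool = True      # False simulates a worker that never replies
    state: str = "active"

    def prepare(self) -> Optional[str]:
        """Reply to the coordinator's "prepare" message."""
        if not self.responds:
            return None        # no reply: the coordinator treats this as a timeout
        if self.can_commit:
            self.state = "ready"
            return "ready"
        self.state = "aborted"
        return "no"

    def finish(self, decision: str) -> None:
        """Apply the coordinator's global decision."""
        self.state = "committed" if decision == "commit" else "aborted"


@dataclass
class Coordinator:
    workers: List[Worker] = field(default_factory=list)

    def run(self) -> str:
        # Phase 1: send "prepare" and collect one reply per worker.
        votes = [w.prepare() for w in self.workers]
        # Commit only if every worker answered "ready";
        # a single "no" or a missing reply (timeout) forces an abort.
        decision = "commit" if all(v == "ready" for v in votes) else "abort"
        # Phase 2 (assumed standard behaviour): broadcast the decision.
        for w in self.workers:
            w.finish(decision)
        return decision
```

For example, `Coordinator([Worker("A"), Worker("B", can_commit=False)]).run()` returns `"abort"` and leaves both workers in the `aborted` state, which is the all-sites-or-none outcome that atomic commitment requires.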
