This report describes the design of a Replication Framework that facilitates the implementation and comparison of database replication techniques. Furthermore, it discusses the implementation of a Database Replication Prototype and compares the performance measurements of two replication techniques based on the Atomic Broadcast communication primitive: pessimistic active replication and optimistic active replication.
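To make the two techniques concrete, here is a minimal sketch of active replication driven by Atomic Broadcast; the class and method names are illustrative and not taken from the report. In the pessimistic variant shown, a replica executes a transaction only after its final position in the total order is known, so deterministic execution keeps all replicas identical; the optimistic variant instead starts executing on a tentatively (optimistically) delivered order and validates once the final order arrives.

from typing import Callable, Dict, List

# A transaction is modeled as a deterministic update on the replica state.
Transaction = Callable[[Dict[str, int]], None]

class Replica:
    """Executes transactions strictly in the total order fixed by Atomic Broadcast."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {}

    def abcast_deliver(self, txn: Transaction) -> None:
        # Pessimistic active replication: execute only after the final total
        # order is known, so every replica applies the same sequence of updates.
        txn(self.state)

# Two replicas delivering the same ordered stream converge on the same state.
r1, r2 = Replica(), Replica()
ordered_stream: List[Transaction] = [
    lambda s: s.__setitem__("x", s.get("x", 0) + 1),
    lambda s: s.__setitem__("x", s.get("x", 0) * 2),
]
for txn in ordered_stream:
    r1.abcast_deliver(txn)
    r2.abcast_deliver(txn)
assert r1.state == r2.state == {"x": 2}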
2002
This paper explores the architecture, implementation and performance of a wide and local area database replication system. The architecture provides peer replication, supporting diverse application semantics, based on a group communication paradigm. Network partitions and merges, computer crashes and recoveries, and message omissions are all handled. Using a generic replication engine and the Spread group communication toolkit, we provide replication
IEEE Transactions on Knowledge and Data Engineering, 2005
In this paper, we present a performance comparison of database replication techniques based on total order broadcast. While the performance of total order broadcast-based replication techniques has been studied in previous papers, this paper makes several new contributions. First, it compares directly with one another techniques that were previously presented and evaluated separately, usually against a classical replication scheme such as distributed locking. Second, the evaluation uses a finer network model than previous studies. Third, the paper compares techniques that offer the same consistency criterion (one-copy serializability) in the same environment using the same settings. The paper shows that, while networking performance has little influence in a LAN setting, the cost of synchronizing replicas is quite high. Because of this, total order broadcast-based techniques are very promising, as they minimize synchronization between replicas.
Database and Expert Systems …, 1996
In this paper we investigate the performance issues of data replication in a loosely coupled distributed database system, where a set of database servers are connected via a network. A database replication scheme, Replication with Divergence, which allows some degree of divergence between the primary and the secondary copies of the same data object, is compared with two other schemes that, respectively, disallow replication and keep all replicated copies consistent at all times. The impact of some tunable factors, such as cache size and the update propagation probability, on the performance of Replication with Divergence is also investigated. These results shed light on performance issues that were not addressed in previous studies of replication in distributed database systems.
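As a rough illustration of the scheme described above, the sketch below models a primary that propagates each update to a secondary only with a tunable probability p, deferring the rest until a refresh; the names and structure are my own, not the paper's.

import random

class DivergentReplication:
    """Primary applies every update; each secondary receives it immediately
    only with probability p, so secondary copies may diverge until refreshed."""

    def __init__(self, p: float, num_secondaries: int) -> None:
        self.p = p                                    # update propagation probability
        self.primary: dict = {}
        self.secondaries = [dict() for _ in range(num_secondaries)]
        self.pending = [[] for _ in range(num_secondaries)]  # deferred updates

    def update(self, key: str, value: int) -> None:
        self.primary[key] = value
        for i, copy in enumerate(self.secondaries):
            if random.random() < self.p:
                copy[key] = value                     # propagate eagerly this time
            else:
                self.pending[i].append((key, value))  # allow bounded divergence

    def refresh(self, i: int) -> None:
        # Re-synchronize one secondary, clearing its accumulated divergence.
        for key, value in self.pending[i]:
            self.secondaries[i][key] = value
        self.pending[i].clear()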
In database replication, primary-copy systems easily solve the problem of keeping replicated data consistent by allowing updates only at the primary copy. While such systems are very efficient under workloads dominated by read-only transactions, the update-everywhere approach is better suited to heavy update loads. However, that approach adds significant overhead for read-only transactions. We propose a new database replication paradigm, halfway between the primary-copy and update-everywhere approaches, which improves system performance by adapting its configuration to the workload, thanks to a deterministic database replication protocol which ensures that broadcast writesets are always committed.
Database replication protocols based on group communication primitives have recently emerged as a promising technology to improve database fault tolerance and performance. Roughly speaking, this approach consists in exploiting the order and atomicity properties provided by group communication primitives, or more specifically Atomic Broadcast, to guarantee transaction properties. This paper proposes a systematic classification of non-voting database replication algorithms based on Atomic Broadcast.
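A minimal sketch of what a non-voting algorithm in this classification looks like: every replica runs the same deterministic certification test on Atomic Broadcast delivery, so commit/abort decisions agree everywhere without an extra voting round. The conflict test below (abort when a concurrently committed writeset intersects the readset) is illustrative, not quoted from the paper.

class CertifyingReplica:
    def __init__(self) -> None:
        self.committed: list[tuple[int, set[str]]] = []  # (commit position, written keys)
        self.seq = 0

    def deliver(self, start_seq: int, readset: set[str], writeset: set[str]) -> bool:
        # Abort if a transaction that committed after our snapshot wrote a key we read.
        for commit_seq, keys in self.committed:
            if commit_seq > start_seq and keys & readset:
                return False             # deterministic abort, identical at every replica
        self.seq += 1
        self.committed.append((self.seq, writeset))
        return True                      # deterministic commit, no voting needed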
Proceedings 19th IEEE Symposium on Reliable Distributed Systems SRDS-2000, 2000
Data replication is an increasingly important topic as databases are more and more deployed over clusters of workstations. One of the challenges in database replication is to introduce replication without severely affecting performance. Because of this difficulty, current database products use lazy replication, which is very efficient but can compromise consistency. As an alternative, eager replication guarantees consistency but most existing protocols have a prohibitive cost. In order to clarify the current state of the art and open up new avenues for research, this paper analyses existing eager techniques using three key parameters. In our analysis, we distinguish eight classes of eager replication protocols and, for each category, discuss its requirements, capabilities, and cost. The contribution lies in showing when eager replication is feasible and in spelling out the different aspects a database replication protocol must account for.
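For the arithmetic behind the eight classes, a toy enumeration follows; the three parameter labels are my paraphrase of the paper's classification (server architecture, server interaction, transaction termination) and should be checked against the original.

from itertools import product

server_architecture = ("primary copy", "update everywhere")
server_interaction = ("constant", "linear")
transaction_termination = ("voting", "non-voting")

# 2 x 2 x 2 combinations yield the eight classes of eager replication protocols.
for i, combo in enumerate(product(server_architecture, server_interaction,
                                  transaction_termination), start=1):
    print(f"class {i}: {combo}")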
Proceedings of the 2009 EDBT/ICDT Workshops on - EDBT/ICDT '09, 2009
In distributed systems, replication is used to ensure availability and increase performance. However, the heavy workload of distributed systems such as Web 2.0 applications or Global Distribution Systems limits the benefit of replication if its degree (i.e., the number of replicas) is not controlled. Since every replica must eventually perform all updates, there is a point beyond which adding more replicas does not increase throughput, because every replica is saturated by applying updates. Moreover, if the replication degree exceeds the optimal threshold, the useless replicas generate overhead due to extra communication messages. In this paper, we propose a replication management solution that reduces useless replicas. To this end, we define two mathematical models which approximate the number of replicas needed to achieve a given level of performance. Moreover, we demonstrate the feasibility of our replication management model through simulation. The results show the effectiveness and accuracy of our models.
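The saturation argument above can be illustrated with a back-of-the-envelope capacity model (mine, not either of the paper's two models): every replica must apply the full update load, so only its remaining capacity serves reads, and replicas beyond the point where the read load is covered add nothing but communication overhead.

import math

def replicas_needed(read_load: float, update_load: float, capacity: float) -> int:
    # Each replica spends update_load of its capacity applying all updates;
    # the spare capacity (capacity - update_load) is what serves read queries.
    spare = capacity - update_load
    if spare <= 0:
        raise ValueError("replicas are saturated by updates alone")
    return math.ceil(read_load / spare)

# Example: 900 reads/s, 40 updates/s, 100 ops/s per replica -> 15 replicas.
print(replicas_needed(read_load=900.0, update_load=40.0, capacity=100.0))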
2005
Data replication is a key technology in distributed systems that enables higher availability and performance. This article surveys optimistic replication algorithms. They allow replica contents to diverge in the short term to support concurrent work practices and tolerate failures in low-quality communication links. The importance of such techniques is increasing as collaboration through wide-area and mobile networks becomes popular.
International Journal of Cyber and IT Service Management, 2022
Today's computer applications place ever-increasing demands on database system capability and performance. The growing amount of data that a business must process makes centralized data processing ineffective, and this inefficiency shows itself as long response times. That is in direct opposition to the purpose of using databases in data processing, which is to reduce the time it takes to process data. A different database architecture is required to tackle this problem. Distributed database technology refers to an architecture in which several servers are linked together, each able to process and fulfill local queries. Each participating server is responsible for serving one or more requests. In a multi-master replication scenario, all sites are main sites, and all main sites communicate with one another. The distributed database system comprises numerous linked computers that work together as a single system.
Proceedings 20th IEEE International Conference on Distributed Computing Systems, 2000
Replication is an area of interest to both distributed systems and databases. The solutions developed from these two perspectives are conceptually similar but differ in many aspects: model, assumptions, mechanisms, guarantees provided, and implementation. In this paper, we provide an abstract and "neutral" framework to compare replication techniques from both communities in spite of the many subtle differences. The framework has been designed to emphasize the role played by different mechanisms and to facilitate comparisons. With this, it is possible to get a functional comparison of many ideas that is valuable for both didactic and practical purposes. The paper describes the replication techniques used in both communities, compares them, and points out ways in which they can be integrated to arrive at better, more robust replication protocols.
2011
The necessity of the ever-increasing use of distributed data in computer networks is obvious to all. One technique applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine several dynamic data replication algorithms. We examine their characteristics under some common usage scenarios and then propose suggestions for their improvement.
Lecture Notes in Computer Science
Database replication protocols based on a certification approach are usually the best ones for achieving good performance when an eager update everywhere technique is being considered. The weak voting approach achieves a slightly longer transaction completion time, but with a lower abort rate. So, both techniques can be considered as the best ones for eager replication when performance is a must, and both of them need atomic broadcast. We propose a new database replication strategy that shares many characteristics with such previous strategies. It is also based on totally ordering the application of writesets, using only an unordered reliable broadcast, instead of an atomic broadcast. Additionally, the writesets of transactions that are aborted in the final validation phase need not be broadcast in our strategy. Thus, this new approach always reduces the communication traffic and also achieves a good transaction response time (even shorter than those previous strategies in some system configurations).
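One way to picture how a total order can emerge from an unordered reliable broadcast is a deterministic merge: each server tags its writesets with a local round number, and every replica applies a round only once it holds a message (possibly an explicit empty marker) from every server for that round, breaking ties by server id. This is a generic sketch of the idea, not the protocol actually proposed in the paper.

class DeterministicMerger:
    def __init__(self, servers: list[str]) -> None:
        self.servers = sorted(servers)   # deterministic tie-break order
        self.round = 0
        self.inbox: dict[tuple[int, str], object] = {}  # (round, server) -> writeset

    def receive(self, rnd: int, server: str, writeset) -> list:
        self.inbox[(rnd, server)] = writeset
        ready = []
        # Apply complete rounds in order; every replica computes the same sequence.
        while all((self.round, s) in self.inbox for s in self.servers):
            for s in self.servers:
                ws = self.inbox.pop((self.round, s))
                if ws is not None:       # None marks 'no writeset this round'
                    ready.append(ws)
            self.round += 1
        return ready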
International Journal of Computer Applications, 2013
In this paper, a database replication algorithm is presented. The main idea is to reduce the latency in database replication while maintaining very high throughput. The main points of the algorithm are detailed in this paper. Simulation results are presented to evaluate throughput and average delay. For the in-depth analysis of the algorithm, various cases are considered, and it has been found that when the number of servers that can serve requests is large, throughput is very high with very low average delay. Overall, throughput and average delay also depend heavily on the load; if the load is comparatively low (< 0.8), the throughput is very high and the average delay is nearly zero.
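The load threshold reported above is consistent with elementary queueing behavior; the snippet below uses the textbook M/M/1 response time W = 1/(mu - lambda) purely to illustrate why delay stays small at low utilization and grows sharply as the load approaches 1. It is not the paper's simulator.

def mm1_response_time(arrival_rate: float, service_rate: float = 1.0) -> float:
    # Average time in an M/M/1 system: W = 1 / (mu - lambda); diverges as load -> 1.
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)

for load in (0.2, 0.5, 0.8, 0.95):
    print(f"load {load}: W = {mm1_response_time(load):.2f}")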
ISPRS International Journal of Geo-Information, 2020
This paper is focused on comparing database replication over spatial data in PostgreSQL and MySQL. Database replication addresses the problem of a single database server being overloaded with write and read queries. There are many replication mechanisms, each handling data differently. Criteria for objective comparison were set for testing and determining the bottleneck of the replication process. The tests were done over real national vector spatial datasets, namely ArcCR500, Data200, Natural Earth and the Estimated Pedologic-Ecological Unit. HWMonitor Pro was used to monitor the PostgreSQL database, network and system load. Monyog was used to monitor MySQL activity (data and SQL queries) in real time. Both database servers were run on computers with the Microsoft Windows operating system. The results of these tests of both replication mechanisms led to a better understanding of the mechanisms and allow informed decisions for future deployments. Graphs and tables include the statistical data and describe the replication mechanisms in specific situations. PostgreSQL with the Slony extension and asynchronous replication synchronized a batch of changes with a high transfer speed and high server load. MySQL with synchronous replication synchronized every change record with low impact on server performance and network bandwidth.
Database replication protocols based on a certification approach are usually the best ones for achieving good performance. The weak voting approach achieves a slightly longer transaction completion time, but with a lower abort rate. So, both techniques can be considered as the best ones for replication when performance is a must, and both of them take advantage of the properties provided by atomic broadcast. We propose a new database replication strategy that shares many characteristics with such previous strategies. It is also based on totally ordering the application of writesets, using only an unordered reliable broadcast, instead of an atomic broadcast. Additionally, the writesets of transactions that are aborted in the final validation phase are not broadcast in our strategy. Thus, this new approach reduces the communication traffic and also achieves a good transaction response time (even shorter than those previous strategies in some system configurations).
2000
In this paper, we explore data replication protocols that provide both fault tolerance and good performance without compromising consistency. We do this by combining transactional concurrency control with group communication primitives. In our approach, transactions are executed at only one site, so that not all nodes incur the overhead of producing results. To further reduce latency, we use an optimistic multicast technique that overlaps transaction execution with total order message delivery. The protocols we present in the paper provide correct executions while minimizing overhead and providing higher scalability.
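The overlap between execution and ordering can be sketched as follows: a replica starts a transaction as soon as it is optimistically delivered and, on final delivery, commits the tentative work if the definitive order confirms the optimistic one, otherwise discards it and re-executes. The whole-buffer rollback below is deliberately crude; it is an illustration of the idea, not the paper's protocol.

class OptimisticReplica:
    def __init__(self) -> None:
        self.opt_order: list[str] = []     # order in which work was started
        self.tentative: dict[str, object] = {}

    def opt_deliver(self, txn_id: str, run) -> None:
        self.opt_order.append(txn_id)
        self.tentative[txn_id] = run()     # execute early; result not yet visible

    def final_deliver(self, txn_id: str, run):
        if self.opt_order and self.opt_order[0] == txn_id:
            self.opt_order.pop(0)
            return self.tentative.pop(txn_id)   # optimistic order confirmed: commit
        # Definitive order diverged: discard all tentative work and re-execute.
        self.opt_order.clear()
        self.tentative.clear()
        return run()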
2008
Database replication has been researched as a solution to overcome the problems of performance and availability in distributed systems. Full database replication, based on group communication systems, is an attempt to enhance performance that works well for a small number of sites. If application locality is taken into consideration, partial replication, i.e., not all sites store the full database, also enhances scalability. On the other hand, all copies must be kept consistent. If each DBMS provides snapshot isolation (SI), the execution of transactions has to be coordinated so as to obtain Generalized SI (GSI). In this paper, a partial replication protocol providing GSI is introduced that gives a consistent view of the database, provides an adaptive replication technique, and supports the failure and recovery of replicas.
Proceedings IEEE International Symposium on Network Computing and Applications. NCA 2001
Enterprise information systems are nowadays commonly structured as multi-tier architectures and invariably built on top of database management systems responsible for the storage and provision of the entire business data. Database management systems therefore play a vital role in today's organizations: overall system dependability depends directly on their reliability and availability. Replication is a well-known technique to improve dependability. By maintaining consistent replicas of a database, one can increase its fault tolerance and simultaneously improve the system's performance by splitting the workload among the replicas. In this thesis we address these issues by exploiting the partial replication of databases. We target large-scale systems where replicas are distributed across wide area networks, aiming at both fault tolerance and fast local access to data. In particular, we envision information systems of multinational organizations presenting strong access locality, in which fully replicated data should be kept to a minimum and a judicious placement of replicas should allow the full recovery of any site in case of failure. Our research builds on work on database replication algorithms based on group communication protocols, specifically multi-master certification-based protocols. At the core of these protocols resides a total order multicast primitive responsible for establishing a total order of transaction execution. A well-known performance optimization in local area networks exploits the fact that the definitive total order of messages often closely follows the spontaneous network order, thus making it possible to optimistically proceed in parallel with the ordering protocol. Unfortunately, this optimization is invalidated in wide area networks, precisely where the increased latency would make it most useful. To overcome this, we present a novel total order protocol with optimistic delivery for wide area networks. Our protocol uses local statistical estimates to independently order messages so as to closely match the definitive order, thus allowing optimistic execution in real wide area networks. Handling partial replication within a certification-based protocol is also particularly challenging, as it directly impacts the certification procedure itself. Depending on the approach, the added complexity may actually defeat the purpose of partial replication. We devise, implement and evaluate two variations of the Database State Machine protocol, discussing their benefits and adequacy with the workload of the standard TPC-C benchmark.
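The statistical ordering idea can be sketched as a hold-back buffer: each receiver delays optimistic delivery of a message by a locally estimated interval so that messages from slower, far-away senders can catch up, making the guessed order match the definitive total order more often. The rule below (a single hold time applied to sender timestamps) is an illustrative simplification, not the thesis's protocol.

import heapq

class StatisticalOrderer:
    def __init__(self, hold_time: float) -> None:
        self.hold = hold_time            # e.g. estimated spread of one-way delays
        self.buffer: list[tuple[float, str]] = []   # (sender timestamp, message)

    def receive(self, sent_at: float, msg: str) -> None:
        heapq.heappush(self.buffer, (sent_at, msg))

    def opt_deliver(self, now: float) -> list[str]:
        # Optimistically deliver, in sender-timestamp order, every message old
        # enough that a straggler with a smaller timestamp is unlikely to arrive.
        out = []
        while self.buffer and self.buffer[0][0] + self.hold <= now:
            out.append(heapq.heappop(self.buffer)[1])
        return out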
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2010