2000, Proceedings 20th IEEE International Conference on Distributed Computing Systems
Replication is an area of interest to both distributed systems and databases. The solutions developed from these two perspectives are conceptually similar but differ in many aspects: model, assumptions, mechanisms, guarantees provided, and implementation. In this paper, we provide an abstract and "neutral" framework to compare replication techniques from both communities in spite of the many subtle differences. The framework has been designed to emphasize the role played by different mechanisms and to facilitate comparisons. With this, it is possible to obtain a functional comparison of many ideas that is valuable for both didactic and practical purposes. The paper describes the replication techniques used in both communities, compares them, and points out ways in which they can be integrated to arrive at better, more robust replication protocols.
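As an illustration of what such a functional comparison might look like, here is a minimal Python sketch that models replication techniques as orderings of abstract protocol phases. The phase names and the two technique descriptions below are illustrative assumptions, not the paper's actual framework.

from enum import Enum, auto

class Phase(Enum):
    REQUEST = auto()
    SERVER_COORDINATION = auto()
    EXECUTION = auto()
    AGREEMENT_COORDINATION = auto()
    RESPONSE = auto()

# Hypothetical descriptions of two techniques as phase sequences, so they
# can be compared functionally rather than by implementation details.
ACTIVE_REPLICATION = [Phase.REQUEST, Phase.SERVER_COORDINATION,
                      Phase.EXECUTION, Phase.RESPONSE]
PRIMARY_COPY = [Phase.REQUEST, Phase.EXECUTION,
                Phase.AGREEMENT_COORDINATION, Phase.RESPONSE]

def differing_phases(a, b):
    """Return the phases present in one technique but not the other."""
    return set(a) ^ set(b)

print(differing_phases(ACTIVE_REPLICATION, PRIMARY_COPY))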
Proceedings 19th IEEE Symposium on Reliable Distributed Systems SRDS-2000, 2000
Data replication is an increasingly important topic as databases are more and more deployed over clusters of workstations. One of the challenges in database replication is to introduce replication without severely affecting performance. Because of this difficulty, current database products use lazy replication, which is very efficient but can compromise consistency. As an alternative, eager replication guarantees consistency but most existing protocols have a prohibitive cost. In order to clarify the current state of the art and open up new avenues for research, this paper analyses existing eager techniques using three key parameters. In our analysis, we distinguish eight classes of eager replication protocols and, for each category, discuss its requirements, capabilities, and cost. The contribution lies in showing when eager replication is feasible and in spelling out the different aspects a database replication protocol must account for.
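For intuition on how three two-valued parameters yield eight protocol classes, here is a short enumeration sketch; the parameter names are assumptions chosen for illustration and may not match the paper's exact terminology.

from itertools import product

# Hypothetical parameter names; the paper's terms may differ.
architecture = ["primary-copy", "update-everywhere"]   # where updates originate
interaction  = ["constant", "linear"]                  # messages per transaction
termination  = ["voting", "non-voting"]                # how commit is decided

for cls in product(architecture, interaction, termination):
    print(" / ".join(cls))   # 2 x 2 x 2 = 8 classes of eager protocols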
International Journal of Cyber and IT Service Management, 2022
Today's computer applications demand ever-increasing database system capability and performance. The growing amount of data that has to be processed in a business company makes centralized data processing ineffective, and this inefficiency shows itself as long response times. That is in direct opposition to the purpose of utilizing databases in data processing, which is to reduce the amount of time it takes to process data. Another database design is required to tackle this problem. Distributed database technology refers to an architecture in which several servers are linked together, and each one may process and fulfill local queries. Each participating server is responsible for serving one or more requests. In a multi-master replication scenario, all sites are main sites, and all main sites communicate with one another. The distributed database system comprises numerous linked computers that work together as a single system.
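A minimal Python sketch of the multi-master idea described above: every site accepts writes and forwards them to every other main site. The class and method names are hypothetical, and the sketch deliberately ignores conflict resolution and message ordering, which real multi-master systems must handle.

class Master:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []

    def write(self, key, value):
        # Apply locally, then propagate to every other main site.
        self.data[key] = value
        for peer in self.peers:
            peer.apply(key, value)

    def apply(self, key, value):
        self.data[key] = value

# Wire three main sites into a full mesh.
sites = [Master(n) for n in ("A", "B", "C")]
for s in sites:
    s.peers = [p for p in sites if p is not s]

sites[0].write("x", 1)   # a write at any site becomes visible at all sites
print([s.data for s in sites])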
2006
The availability of information systems benefits from distributed replication. Different applications and users require different degrees of data availability, consistency and integrity. The latter are perceived as competing objectives that need to be reconciled by appropriate replication strategies. We outline work in progress on a replication architecture for distributed information systems. It simultaneously maintains several protocols, so that it can be reconfigured on the fly to match the actual availability, consistency and integrity needs of possibly concurrent applications and users.
ISPRS International Journal of Geo-Information, 2020
This paper is focused on comparing database replication over spatial data in PostgreSQL and MySQL. Database replication addresses the problems that arise when a single database server is overloaded with write and read queries. There are many replication mechanisms that handle data differently. Criteria for objective comparisons were set for testing and determining the bottleneck of the replication process. The tests were done over real national vector spatial datasets, namely, ArcCR500, Data200, Natural Earth and Estimated Pedologic-Ecological Unit. HWMonitor Pro was used to monitor the PostgreSQL database, network and system load. Monyog was used to monitor the MySQL activity (data and SQL queries) in real time. Both database servers were run on computers with the Microsoft Windows operating system. The results from the provided tests of both replication mechanisms led to a better understanding of these mechanisms and allowed informed decisions for future deployment. Graphs and tables include the statistical data and describe the replication mechanisms in specific situations. PostgreSQL with the Slony extension and asynchronous replication synchronized a batch of changes with a high transfer speed and high server load. MySQL with synchronous replication synchronized every change record with low impact on server performance and network bandwidth.
Database and Expert Systems …, 1996
In this paper we investigate the performance issues of data replication in a loosely coupled distributed database system, where a set of database servers are connected via a network. A database replication scheme, Replication with Divergence, which allows some degree of divergence between the primary and the secondary copies of the same data object, is compared to two other schemes that, respectively, disallow replication and keep all replicated copies consistent at all times. The impact of some tunable factors, such as cache size and the update propagation probability, on the performance of Replication with Divergence is also investigated. These results shed light on performance issues that were not addressed in previous studies on replication in distributed database systems.
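A toy Python sketch of the divergence idea: the primary always commits locally but forwards each update to the secondaries only with a tunable probability, trading secondary freshness for propagation traffic. All names and the probability mechanism are illustrative assumptions, not the paper's actual scheme.

import random

class PrimaryWithDivergence:
    """Primary copy that forwards each update to its secondaries only
    with probability p, so secondary copies may lag (diverge) in
    exchange for reduced propagation traffic."""
    def __init__(self, secondaries, p=0.5):
        self.data = {}
        self.secondaries = secondaries
        self.p = p

    def update(self, key, value):
        self.data[key] = value                # the primary is always current
        if random.random() < self.p:          # tunable propagation probability
            for s in self.secondaries:
                s[key] = value

secondaries = [{}, {}]
primary = PrimaryWithDivergence(secondaries, p=0.3)
for i in range(10):
    primary.update("x", i)
print(primary.data["x"], [s.get("x") for s in secondaries])  # secondaries may lag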
2002
Replication maintains replicas (copies) of critical data on multiple computers and allows access to any one of them. It is the critical enabling technology of distributed services, improving both their performance and availability. Availability is improved by allowing access to the data even when some of the replicas or network links are unavailable. Performance is improved in two ways. First, users can access nearby replicas, avoiding expensive remote network access and reducing latency.
2013
The present invention concerns improvements relating to database replication. More specifically, aspects of the present invention relate to a fault-tolerant node and a method for avoiding non-deterministic behaviour in the management of synchronous database systems.
This report describes the design of a Replication Framework that facilitates the implementation and comparison of database replication techniques. Furthermore, it discusses the implementation of a Database Replication Prototype and compares the performance measurements of two replication techniques based on the Atomic Broadcast communication primitive: pessimistic active replication and optimistic active replication.
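As a rough illustration of the pessimistic variant mentioned above, the following Python sketch stands in for an Atomic Broadcast primitive with a single sequencer that fixes a total order, which deterministic replicas then execute identically. This is an assumption-laden toy, not the prototype's design; an optimistic variant would execute requests tentatively before the final order is known and roll back on a mismatch.

class Sequencer:
    """Stand-in for an Atomic Broadcast primitive: assigns one global
    total order to all submitted requests."""
    def __init__(self):
        self.log = []
    def broadcast(self, request):
        self.log.append(request)

class Replica:
    def __init__(self):
        self.state = 0
        self.delivered = 0
    def deliver(self, log):
        # Pessimistic active replication: execute only in the agreed order.
        while self.delivered < len(log):
            self.state += log[self.delivered]   # deterministic operation
            self.delivered += 1

seq = Sequencer()
replicas = [Replica(), Replica()]
for request in (5, -2, 7):
    seq.broadcast(request)
for r in replicas:
    r.deliver(seq.log)
print([r.state for r in replicas])   # identical states at every replica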
Recently, third-party solutions for database replication have been enjoying increasing popularity. Such proposals address a diversity of user requirements, namely preventing conflicting updates without the overhead of synchronous replication; clustering for scalability and availability; and heterogeneous replicas for specialized queries.
Innovations and Advances in Computer, Information, …
Distributed databases are an optimal solution for managing large amounts of data. We present an algorithm for replication in distributed databases that improves query performance. This algorithm can move information between databases, replicating pieces of data across databases.
2011
The necessity of the ever-increasing use of distributed data in computer networks is obvious to all. One technique applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine several dynamic data replication methods. We examine their characteristics under various usage scenarios and then propose suggestions for their improvement.
International Journal of Current Microbiology and Applied Sciences, 2019
In database replication, primary-copy systems easily solve the problem of keeping replicated data consistent by allowing updates only at the primary copy. While this kind of system is very efficient with workloads dominated by read-only transactions, the update-everywhere approach is more suitable for heavy update loads. However, that approach adds significant overhead when working with read-only transactions. We propose a new database replication paradigm, halfway between the primary-copy and update-everywhere approaches, which permits improving system performance by adapting its configuration to the workload, thanks to a deterministic database replication protocol which ensures that broadcast writesets are always going to be committed.
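A minimal Python sketch of the deterministic-commit idea: writesets delivered in the same total order are applied unconditionally at every replica, so all replicas converge without an extra voting round. The names and the totally ordered list are assumptions for illustration only.

class DeterministicReplica:
    """Writesets arrive via total-order broadcast and are committed
    unconditionally in delivery order, so every replica reaches the
    same state without an extra agreement round."""
    def __init__(self):
        self.db = {}
    def on_deliver(self, writeset):
        for key, value in writeset:
            self.db[key] = value    # always commits: outcome is deterministic

r1, r2 = DeterministicReplica(), DeterministicReplica()
total_order = [[("x", 1)], [("x", 2), ("y", 3)]]   # same order at all replicas
for writeset in total_order:
    r1.on_deliver(writeset)
    r2.on_deliver(writeset)
assert r1.db == r2.db   # replicas converge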
Bokhari, Syed Mohtashim Abbas, and Oliver Theel. "A flexible hybrid approach to data replication in distributed systems." Intelligent Computing: Proceedings of the 2020 Computing Conference, Volume 1. Springer International Publishing, 2020, 2020
Data replication plays a very important role in distributed computing because a single replica is prone to failure, which is devastating for the availability of the access operations. High availability and low cost for the access operations, as well as maintaining data consistency, are major challenges for reliable services. Failures are often inevitable in a distributed paradigm and greatly affect the availability of services. Data replication mitigates such failures by masking them and makes the system more fault-tolerant. It is the concept by which highly available data access operations can be realized while keeping their cost reasonably low. There are various state-of-the-art data replication strategies, but there exist infinitely many scenarios which demand designing new data replication strategies. In this regard, this work focuses on this problem and proposes a holistic hybrid approach to data replication based on voting structures.
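For background on the voting idea the abstract builds on, here is a minimal majority-quorum sketch in Python: writes reach a majority of replicas tagged with a version number, reads consult a majority and take the highest version, and the two majorities necessarily intersect. This illustrates plain majority voting, not the paper's hybrid structures; all names are hypothetical.

class MajorityQuorum:
    """Majority voting: any read quorum overlaps any write quorum,
    so a read always sees the most recent completed write."""
    def __init__(self, n):
        self.replicas = [{"version": 0, "value": None} for _ in range(n)]
        self.quorum = n // 2 + 1

    def write(self, value):
        version = max(r["version"] for r in self.replicas) + 1
        for r in self.replicas[: self.quorum]:      # any majority would do
            r.update(version=version, value=value)

    def read(self):
        votes = self.replicas[-self.quorum:]        # a (different) majority
        return max(votes, key=lambda r: r["version"])["value"]

q = MajorityQuorum(5)
q.write("a"); q.write("b")
print(q.read())   # "b": quorum intersection returns the latest write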
Proceedings of the 2009 EDBT/ICDT Workshops on - EDBT/ICDT '09, 2009
In distributed systems, replication is used for ensuring availability and increasing performance. However, the heavy workload of distributed systems such as Web 2.0 applications or Global Distribution Systems limits the benefit of replication if its degree (i.e., the number of replicas) is not controlled. Since every replica must eventually perform all updates, there is a point beyond which adding more replicas does not increase the throughput, because every replica is saturated by applying updates. Moreover, if the replication degree exceeds the optimal threshold, the useless replicas generate overhead due to extra communication messages. In this paper, we propose a replication management solution designed to reduce useless replicas. To this end, we define two mathematical models which approximate the number of replicas needed to achieve a given level of performance. Moreover, we demonstrate the feasibility of our replication management model through simulation. The results show the effectiveness of our models and their accuracy.
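To see why throughput saturates, consider a toy capacity model (an illustrative assumption, not the paper's actual equations): each replica must apply every write, while reads are spread across the n replicas.

def max_throughput(n, capacity=1000.0, write_fraction=0.2):
    """Offered load L is feasible while L*write_fraction (applied at every
    replica) plus L*(1-write_fraction)/n (reads shared by n replicas)
    stays within one replica's capacity; solve for the largest L."""
    return capacity / (write_fraction + (1 - write_fraction) / n)

for n in (1, 2, 4, 8, 16, 32):
    print(n, round(max_throughput(n)))
# Throughput approaches capacity/write_fraction, so replicas added beyond
# the saturation point mostly add propagation overhead.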
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2010
Over this six-year journey, many people have crossed my path. New friendships have been born and others have become stronger. So I want you to know that this work would not have been possible without your support. Most importantly, I want you to know that these years were crucial for me, not only due to the knowledge that I have acquired with your help but also due to the life experience that they have represented. I have no words to thank you for this opportunity. I want to thank everybody from the GSD for the wonderful work environment, in particular, Luí Soares, José Orlando, and António Sousa for the support when needed and the insightful discussions. Thank you, Rui, for this opportunity and for your friendship and advice.
Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems - PODS '97, 1997
The issue of data replication is considered in the context of a restricted system model motivated by certain distributed data-warehousing applications. A new replica management protocol is defined for this model in which global serializability is ensured, while message overhead and deadlock frequency are less than in previously published work. The advantages of the protocol arise from its use of a lazy approach to update of secondary copies of replicated data and the use of a new concept, virtual sites, to reduce the potential for conflict among global transactions.
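A small Python sketch of the lazy-propagation aspect: the primary commits locally and ships updates to the secondaries afterwards, outside the commit path. Class and method names are hypothetical; the paper's protocol and its virtual-sites mechanism are more involved.

from collections import deque

class LazyPrimary:
    """Lazy update propagation: commit locally first, refresh the
    secondary copies later, outside the commit path."""
    def __init__(self, secondaries):
        self.db = {}
        self.secondaries = secondaries
        self.pending = deque()

    def commit(self, key, value):
        self.db[key] = value              # local commit returns immediately
        self.pending.append((key, value))

    def propagate(self):                  # would run asynchronously in practice
        while self.pending:
            key, value = self.pending.popleft()
            for s in self.secondaries:
                s[key] = value

secondaries = [{}, {}]
primary = LazyPrimary(secondaries)
primary.commit("x", 1)                    # secondaries are still stale here
primary.propagate()                       # the refresh arrives later
print(secondaries)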
Bioscience Biotechnology Research Communications
Replication strategies are a research area for all distributed databases. We provide an overview in this paper comparing the replication strategies for such database systems. The problems considered are data consistency and scalability. These concerns involve preserving continuity between actual real-time events in the external world and their images across the replicas spread over multiple nodes. A framework for a replicated real-time database that preserves all time constraints is discussed. To broaden the concept of modeling a large database, a general outline is presented which aims to improve the consistency of the data.
International Journal on Information Theory, 2014
In this paper, the consistency aspects of data replication protocols are reviewed. Consistency models in data replication are briefly discussed, and propagation techniques such as eager and lazy propagation are examined. Differences among replication protocols from a consistency viewpoint are studied, and the advantages and disadvantages of the replication protocols are shown. We go into essential technical details and make substantive comparisons in order to determine the protocols' respective contributions as well as their restrictions. Finally, some literature research strategies in replication and consistency techniques are reviewed.
2007 International Multi-Conference on Computing in the Global Information Technology (ICCGI'07), 2007
Our approach uses replication and guarantees that all processing replicas achieve state consistency, both in the absence of failures and after a failure heals. We achieve consistency in the former case by defining a data-serializing operator that ensures that the order of tuples to a downstream operator is the same at all the replicas. To achieve consistency after a failure heals, we develop approaches based on checkpoint/redo and undo/redo techniques.
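A toy Python sketch of a data-serializing operator as described above: each replica buffers incoming tuples and releases them in a deterministic order, so the downstream operator sees the same sequence everywhere. The sequence-number scheme and all names are assumptions for illustration, not the paper's design.

import heapq

class SerializingOperator:
    """Buffers input tuples and releases them in deterministic
    sequence-number order, so every replica feeds its downstream
    operator the exact same tuple sequence."""
    def __init__(self):
        self.buffer = []

    def push(self, seq_no, tup):
        heapq.heappush(self.buffer, (seq_no, tup))

    def drain(self):
        out = []
        while self.buffer:
            out.append(heapq.heappop(self.buffer)[1])
        return out

# Two replicas receive the same tuples in different arrival orders...
a, b = SerializingOperator(), SerializingOperator()
for seq_no, tup in [(2, "t2"), (1, "t1"), (3, "t3")]:
    a.push(seq_no, tup)
for seq_no, tup in [(1, "t1"), (3, "t3"), (2, "t2")]:
    b.push(seq_no, tup)
assert a.drain() == b.drain()   # ...but emit them identically downstream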