Understanding Replication
B. Kemme, G. Alonso
Institute of Information Systems, Swiss Federal Institute of Technology (ETHZ), ETH Zentrum, CH-8092 Zürich {kemme,alonso}@inf.ethz.ch http://www.inf.ethz.ch/department/IS/iks/
Abstract
Replication is an area of interest to both distributed systems and databases. The solutions developed from these two perspectives are conceptually similar but differ in many aspects: model, assumptions, mechanisms, guarantees provided, and implementation. In this paper, we provide an abstract and neutral framework to compare replication techniques from both communities in spite of the many subtle differences. The framework has been designed to emphasize the role played by different mechanisms and to facilitate comparisons. With this, it is possible to get a functional comparison of many ideas that is valuable for both didactic and practical purposes. The paper describes the replication techniques used in both communities, compares them, and points out ways in which they can be integrated to arrive at better, more robust replication protocols.
Keywords: Replication protocols, database replication, replication comparison, fault tolerant systems, group communication, distributed systems
Technical Areas: Distributed Fault-Tolerant Systems and Distributed Algorithms
Contact Author: Fernando Pedone (pedone@lsemail.ep.ch)
1 Introduction
Replication has been studied in many areas, especially in distributed systems (mainly for fault tolerance purposes) and in databases (mainly for performance reasons). In these two fields, the techniques and mechanisms used are similar, and yet, comparing the protocols developed in the two communities is a frustrating exercise that many researchers have unsuccessfully attempted. Due to the many subtleties involved, mechanisms that are conceptually identical end up being very different in practice. As a result, it is very difficult to take results from one
area and apply them in the other.

In the last few years, as part of the DRAGON project [Dra98], we have devoted our efforts to enhancing database replication mechanisms by taking advantage of some of the properties of group communication primitives. We have shown how group communication can be embedded into a database [AAAS97, PGS97, PGS98] and used as part of the transaction manager to guarantee serialisable execution of transactions over replicated data [KA98, KA99]. We have also shown how some of the overheads associated with group communication can be hidden behind the cost of executing transactions, thereby greatly enhancing performance and removing one of the serious limitations of group communication primitives [KPAS99a]. This work has proven the importance of and the need for a common understanding of the replication protocols used by the two communities.

In this paper, we present a model that allows us to compare and distinguish existing replication protocols in databases and distributed systems. We start by introducing a very abstract replication protocol representing what we consider to be the key phases of any replication strategy. Using this abstract protocol as the baseline, we analyse a variety of replication protocols from both databases and distributed systems, and show their similarities and differences. With these ideas, we parameterise the protocols and provide an accurate view of the problems addressed by each one of them. Providing such a classification permits us to systematically explore the solution space and gives a good baseline for the development of new protocols. While such work is conceptual in nature, we believe it is a valuable contribution since it provides a much needed perspective on replication protocols. However, the contribution is not only a didactic one but also an eminently practical one. In recent years, and in addition to our work, many researchers have started to explore the combination of database and distributed system solutions [RTKA96, SAA98, PMS99, HAA99]. The results of this paper will help to show which protocols complement each other and how they can be combined.

The paper is organised as follows. Section 2 introduces our replication functional model and discusses some basis for our comparison. Section 3 and Section 4 present replication protocols in distributed systems and databases, respectively. In both cases, instead of concentrating on specific protocols, we focus on more general classes of replication algorithms. Section 5 refines the discussion presented in Section 4 for more complex transaction models. Section 6 discusses the different aspects of the paper and gives a conclusion.
A safety property says that nothing bad ever happens, while a liveness property says that something good eventually happens.
and how they combine the different phases. In this regard, an abstract replication protocol can be described as a sequence of the following five phases (see Figure 1).

1. Request (RE): the client submits an operation to one (or more) replicas.
2. Server coordination (SC): the replica servers coordinate with each other to synchronise the execution of the operation.
3. Execution (EX): the operation is executed on the replica servers.
4. Agreement coordination (AC): the replica servers agree on the result of the execution.
5. Response (END): the outcome of the operation is transmitted back to the client.
Figure 1: Functional model with the five phases of a replication protocol (RE, SC, EX, AC, END).
This functional model represents the basic steps of replication: submission of an operation, coordination among the replicas (e.g., to order concurrent operations), execution of the operation, further coordination among the replicas (e.g., to guarantee atomicity), and response to the client. The differences between protocols arise due to the different approaches used in each phase which, in some cases, obviate the need for some other phase (e.g., when messages are ordered based on an atomic broadcast primitive, the agreement coordination phase is not necessary since it is already performed as part of the process of ordering the messages). Within this framework, we will first consider transactions composed of a single operation. This can be a single read or write operation, a more complex operation with multiple parameters, or an invocation of a method. A more advanced transaction model will be considered in Section 5. Although restrictive at first glance, this model is adopted by some database vendors to handle web documents and stored procedures.

Request Phase. During the request phase, a client submits an operation to the system. This can be done in two ways: the client can directly send the operation to all replicas, or the client can send the operation to one replica which will then send the operation to all others as part of the server coordination phase.
This distinction, although apparently simple, already introduces some significant differences between databases and distributed systems. In databases, clients never contact all replicas, and always send the operation to one copy. The reason is very simple: replication should be transparent to the client. Being able to send an operation to all replicas would imply that the client has knowledge about the data location, schema, and distribution, which is not practical for any database of average size. This knowledge is intrinsically tied to the database nodes; thus, clients must always submit the operation to one node, which will then send it to all others. In distributed systems, however, a clear distinction is made between replication techniques depending on whether the client sends the operation directly to all copies (e.g., active replication) or to one copy (e.g., passive replication). It could be argued that in both cases the request mechanism can be seen as contacting a proxy (a database node in one case, a communication module in the other), in which case there are no significant differences between the two approaches. Conceptually this is true. Practically, it is not a very helpful abstraction because of its implications, as will be discussed below when the different protocols are compared. For the time being, note that distributed systems deal with processes while databases deal with relational schemas. A list of processes is simpler to handle than a database schema, i.e., a communication module can be expected to be able to handle a list of processes, but it is not realistic to assume it can handle a database schema. In particular, database replication requires understanding the operation that is going to be performed, while in distributed systems operation semantics usually play no role.

Finally, distributed systems distinguish between deterministic and non-deterministic replica behaviour. Deterministic replica behaviour assumes that when presented with the same operations in the same order, replicas will produce the same results. Such an assumption is very difficult to make in a database. Thus, if the different replicas have to communicate anyway in order to agree on a result, they can as well exchange the actual operation. By shifting the burden of broadcasting the request to the server, the logic necessary at the client side is greatly simplified, at the price of (theoretically) reducing fault tolerance. If fault tolerance is necessary, a backup system can be used, but this is totally transparent to the client.

Server Coordination Phase. During the server coordination phase, the different replicas try to find an order in
which the operations need to be performed. This is the point where protocols differ the most, in terms of ordering strategies, ordering mechanisms, and correctness criteria. In terms of ordering strategies, databases order operations according to data dependencies. That is, all operations must have the same data dependencies at all replicas. It is for this reason that operation semantics play an important role in database replication: an operation that only reads a data item is not the same as an operation that modifies that data item, since the data dependencies introduced are not the same in the two cases. If there are no direct or indirect dependencies between two operations, they do not need to be ordered because the order does not matter. Distributed systems, on the other hand, are commonly based on very strict notions of ordering: from causality, which is based on potential dependencies without looking at the operation semantics, to total order (either causal or not), in which all operations are ordered regardless of what they are.
In terms of correctness, database protocols use serialisability adapted to replicated scenarios: one-copy serialisability [BHG87]. It is possible to use other correctness criteria [KA98] but, in all cases, the basis for correctness is data dependencies. Distributed systems use linearisability and sequential consistency [AW94]. Linearisability is strictly stronger than sequential consistency. Linearisability is based on real-time dependencies, while sequential consistency only considers the order in which operations are performed on every individual process. Sequential consistency allows, under some conditions, reading old values. In this respect, sequential consistency has similarities with one-copy serialisability but, strictly speaking, the two consistency criteria are different. The distributed system replication techniques presented in this paper all ensure linearisability.

Execution Phase. The execution phase represents the actual performing of the operation. It does not introduce many differences between protocols, but it is a good indicator of how each approach treats and distributes the operations. This phase only represents the actual execution of the operation; applying the update is typically done in the Agreement Coordination phase, even though applying the update to other copies may be done by re-executing the operation.

Agreement Coordination Phase. During this phase, the different replicas make sure that they all do the same thing. This phase is interesting because it brings up some of the fundamental differences between protocols. In databases, this phase usually corresponds to a Two Phase Commit protocol (2PC) during which it is decided whether the operation will be committed or aborted. This phase is necessary because, in databases, the Server Coordination phase takes care only of ordering operations. Once the ordering has been agreed upon, the replicas need to ensure everybody agrees to actually committing the operation. Note that being able to order the operations does not necessarily mean the operation will succeed. In a database, there can be many reasons why an operation succeeds at one site and not at another (load, consistency constraints, interactions with local operations). This is a fundamental difference with distributed systems, where once an operation has been successfully ordered (in the Server Coordination phase) it will be delivered (i.e., performed) and there is no need to do any further checking.

Client Response Phase. The client response phase represents the moment in time when the client receives a response from the system. There are two possibilities: either the response is sent only after everything has been settled and the operation has been executed, or the response is sent right away and the propagation of changes and coordination among all replicas is done afterwards. In the case of databases, this distinction leads to 1) the so-called eager or synchronous protocols (no response until everything has been done) and 2) lazy or asynchronous protocols (immediate response, propagation of changes is done afterwards). In the distributed systems case, the response takes place only after the protocol has been executed and no discrepancies may arise. The client response phase is of increasing importance given the proliferation of applications for mobile users, where a copy is not always connected to the rest of the system and it does not make sense to wait until updates take place to let the user see the changes made.
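To make the functional model concrete, the following sketch expresses the five phases as an abstract protocol skeleton. It is an illustration only: the class and method names are hypothetical and not part of the model itself, and the concrete techniques discussed in the remainder of the paper differ precisely in how (or whether) they fill in each phase.

```python
from abc import ABC, abstractmethod
from typing import Any, List

class ReplicationProtocol(ABC):
    """Abstract skeleton of the five phases: RE, SC, EX, AC, END."""

    def handle(self, client: Any, operation: Any) -> Any:
        replicas = self.request(client, operation)                  # RE: submit to one or all replicas
        order = self.server_coordination(replicas, operation)       # SC: agree on an order (may be empty)
        results = self.execute(replicas, operation, order)          # EX: perform the operation
        outcome = self.agreement_coordination(replicas, results)    # AC: agree on the result (may be empty)
        return self.respond(client, outcome)                        # END: reply to the client

    @abstractmethod
    def request(self, client, operation) -> List[Any]: ...

    @abstractmethod
    def server_coordination(self, replicas, operation): ...

    @abstractmethod
    def execute(self, replicas, operation, order): ...

    @abstractmethod
    def agreement_coordination(self, replicas, results): ...

    @abstractmethod
    def respond(self, client, outcome): ...
```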
Atomic Broadcast (ABCAST). Atomic Broadcast provides atomicity and total order for messages that are ABCAST to the same group g of servers. The atomicity property ensures that if one member of g delivers m (resp. m′), then all (not crashed) members of g eventually deliver m (resp. m′). The order property ensures that if two members of g deliver both m and m′, they deliver them in the same order.

View Synchronous Broadcast (VSCAST). The definition of View Synchronous Broadcast is more complex. It is defined in the context of a group g, and is based on the notion of a sequence of views v0(g), v1(g), ..., vi(g), ... of group g. Each view vi(g) defines the composition of the group at some time t, i.e. the members of the group that are perceived as being correct at time t. Whenever a process p in some view vi(g) is suspected to have crashed, or some process q wants to join, a new view vi+1(g) is installed, which reflects the membership change. Roughly speaking, VSCAST of message m by some member of the group g currently in view vi(g) ensures
the following property: if one process p in vi(g) delivers m before installing view vi+1(g), then no process installs view vi+1(g) before having first delivered m.
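As an illustration of the total order property (and not of any particular protocol from the literature), the following sketch shows a simple sequencer-based total order broadcast: one process assigns sequence numbers and every replica delivers messages strictly in sequence. It assumes a reliable network and a non-crashing sequencer; all names are hypothetical.

```python
import itertools

class Sequencer:
    """Assigns a global sequence number to every broadcast message."""
    def __init__(self):
        self._counter = itertools.count()

    def assign(self, msg):
        return next(self._counter), msg

class Replica:
    """Delivers messages strictly in sequence-number order (total order)."""
    def __init__(self, name):
        self.name = name
        self.next_seq = 0
        self.pending = {}
        self.delivered = []

    def receive(self, seq, msg):
        self.pending[seq] = msg
        while self.next_seq in self.pending:      # deliver in gap-free order
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

def abcast(sequencer, replicas, msg):
    """ABCAST: sequence the message, then send it to every group member."""
    seq, m = sequencer.assign(msg)
    for r in replicas:
        r.receive(seq, m)

# Usage: all replicas deliver m and m' in the same order.
group = [Replica("r1"), Replica("r2"), Replica("r3")]
seq = Sequencer()
abcast(seq, group, "m")
abcast(seq, group, "m'")
assert all(r.delivered == ["m", "m'"] for r in group)
```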
Figure 2: Active replication.
The following steps are involved in the processing of an update request in active replication, according to our functional model.
1. The client sends the request to the servers using an Atomic Broadcast.
2. Server coordination is given by the total order property of the Atomic Broadcast.
3. All replicas execute the request in the order in which requests are delivered.
4. No coordination is necessary, as all replicas process the same requests in the same order. Because replicas are deterministic, they all produce the same results.
5. All replicas send back their result to the client, and the client typically only waits for the first answer (the others are ignored).
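A minimal sketch of these steps, assuming deterministic replicas and abstracting the atomic broadcast away into a single ordered loop (all names hypothetical): every replica executes every request in delivery order, and the client keeps the first reply.

```python
class ActiveReplica:
    """Deterministic state machine: same requests in the same order => same state."""
    def __init__(self, name):
        self.name = name
        self.balance = 0              # toy replicated state

    def execute(self, request):       # EX phase; must be deterministic
        op, amount = request
        if op == "deposit":
            self.balance += amount
        return self.balance

def active_invoke(replicas, request):
    """RE + SC: in a real system the request is ABCAST to all replicas;
    here the total order is simulated by a single loop over the group."""
    replies = [r.execute(request) for r in replicas]   # EX on every replica
    return replies[0]                                  # END: client takes the first reply

group = [ActiveReplica("r1"), ActiveReplica("r2"), ActiveReplica("r3")]
print(active_invoke(group, ("deposit", 10)))   # -> 10, identical on every replica
assert len({r.balance for r in group}) == 1    # no AC phase needed: the states agree
```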
Figure 3: Passive replication.
Communication between the primary and the backups has to guarantee that updates are processed in the same order, which is the case if primary-backup communication is based on FIFO channels. However, FIFO channels alone are not enough to ensure correct execution in case of failure of the primary. For example, consider that the primary fails before all backups receive the updates for a certain request, and another replica takes over as a new primary. Some mechanism has to ensure that updates sent by the new primary will be properly ordered with regard to the updates sent by the faulty primary. VSCAST is a mechanism that guarantees these constraints, and can usually be used to implement the primary-backup replication technique [GS97].
Passive replication can tolerate non-deterministic servers (e.g., multi-threaded servers) and uses little processing power when compared to other replication techniques. However, passive replication suffers from a high reconfiguration cost when the primary fails. The five steps of our framework are the following:

1. The client sends the request to the primary.
2. There is no initial coordination.
3. The primary executes the request.
4. The primary coordinates with the other replicas by sending the update information to the backups.
5. The primary sends the answer to the client.
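A minimal primary-backup sketch of these steps (hypothetical names, failure handling omitted): only the primary executes the, possibly non-deterministic, request; it then pushes the resulting state update to the backups before answering the client. In a real protocol the update propagation would use FIFO channels or VSCAST, as discussed above.

```python
import random

class PassiveReplica:
    def __init__(self, name):
        self.name = name
        self.state = {}

    def apply(self, update):          # backups only apply state updates (AC phase)
        self.state.update(update)

class Primary(PassiveReplica):
    def handle(self, request, backups):
        key = request                                   # EX: executed only on the primary;
        value = random.randint(1, 100)                  # non-determinism is harmless here
        update = {key: value}
        self.apply(update)
        for b in backups:                               # AC: propagate the state update
            b.apply(update)                             # (VSCAST would cover primary failure)
        return value                                    # END: reply to the client

primary = Primary("p")
backups = [PassiveReplica("b1"), PassiveReplica("b2")]
print(primary.handle("x", backups))
assert all(b.state == primary.state for b in backups)
```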
Figure 4: Semi-active replication.
The following steps characterise semi-active replication, according to our framework.

1. The client sends the request to the servers using an Atomic Broadcast.
2. The servers coordinate using the order given by this Atomic Broadcast.
3. All replicas execute the request in the order in which requests are delivered.
4. In case of a non-deterministic choice, the leader informs the followers using a View Synchronous Broadcast.
5. The servers send back the response to the client.
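A sketch of the semi-active idea (hypothetical names): all replicas execute the request, but whenever a non-deterministic choice arises only the leader makes it, and the followers adopt the leader's choice. In a real protocol the choice would be propagated with VSCAST; here it is simply passed as a parameter.

```python
import random

class SemiActiveReplica:
    def __init__(self, name, leader=False):
        self.name = name
        self.leader = leader
        self.log = []

    def execute(self, request, decided_choice=None):
        self.log.append(("start", request))        # deterministic part of the execution
        if self.leader:                            # non-deterministic point: leader decides
            choice = random.choice(["plan_a", "plan_b"])
        else:
            choice = decided_choice                # followers reuse the leader's decision
        self.log.append(("choice", choice))
        return choice

def semi_active_invoke(replicas, request):
    leader = next(r for r in replicas if r.leader)
    choice = leader.execute(request)                     # leader decides
    for r in replicas:
        if not r.leader:
            r.execute(request, decided_choice=choice)    # followers follow (VSCAST in practice)
    return choice

group = [SemiActiveReplica("r1", leader=True), SemiActiveReplica("r2"), SemiActiveReplica("r3")]
semi_active_invoke(group, "op")
assert len({tuple(r.log) for r in group}) == 1           # identical logs on all replicas
```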
3.6 Summary
Figure 5 summarises the different replication approaches in distributed systems, grouped according to the following two dimensions: (1) failure transparency for clients, and (2) server determinism.
Figure 5: Replication techniques in distributed systems, classified by whether server determinism is needed and whether server failures are transparent for the client.
4 Database Replication
Replication in database systems is done mainly for performance reasons. The objective is to access data locally in order to improve response times and eliminate the overhead of having to communicate with other sites. If consistency needs to be guaranteed, this is only possible for read operations. Otherwise, both reads and writes can be done locally, leaving consistency in the hands of the replication protocol. Fault tolerance is an issue, but it is solved using backup mechanisms which, even being a form of replication, are entirely transparent to the clients.
of the changes takes place. The first approach provides consistency in a straightforward way, but it is expensive in terms of message overhead and response time. Lazy replication allows a wide variety of optimisations; however, since copies are allowed to diverge, inconsistencies might occur.
The two dimensions, update propagation (eager or lazy) and update location (primary copy or update everywhere), yield four classes: eager primary copy, eager update everywhere, lazy primary copy, and lazy update everywhere.
In regard to who is allowed to perform updates, the primary copy approach requires all updates to be performed first at one copy (the primary or master copy) and then at the other copies. This simplifies replica control at the price of introducing a single point of failure and a potential bottleneck. The update everywhere approach allows any copy to be updated, thereby speeding up access but at the price of making coordination more complex.
Figure 7: Eager primary copy replication.
From here, it is easy to see that eager primary copy replication is functionally equivalent to passive replication with VSCAST. The only differences are internal to the Agreement Coordination phase (2PC in the case of databases and VSCAST in the case of distributed systems). This difference can be explained by the use of transactions in databases. As explained, VSCAST is used to guarantee that operations are ordered correctly even after a failure occurs. In a database environment, the use of 2PC guarantees that if the primary fails, all active transactions will be aborted. Therefore, there is no need to order operations from before and after the failure, since there is only one source and the different views cannot overlap with each other.
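For illustration, here is a bare-bones two-phase commit round as it would appear in the Agreement Coordination phase of an eager protocol. This is only a sketch of the voting logic: real 2PC also needs persistent logging and timeout handling, and all names are hypothetical.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit       # e.g. constraints or conflicts may force an abort
        self.outcome = None

    def prepare(self):                     # phase 1: vote
        return self.can_commit

    def finish(self, decision):            # phase 2: commit or abort
        self.outcome = decision

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    for p in participants:
        p.finish(decision)
    return decision

# The transaction aborts everywhere if any replica cannot commit it.
replicas = [Participant("r1"), Participant("r2", can_commit=False), Participant("r3")]
print(two_phase_commit(replicas))          # -> "abort"
assert len({p.outcome for p in replicas}) == 1
```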
Figure 8: Eager update everywhere replication with distributed locking.
Locking [BHG87], while in distributed systems this is achieved using ABCAST. The 2 Phase Commit mechanism used in the Agreement Coordination phase of the database replication protocol corresponds to the use of a VSCAST mechanism in the distributed systems protocol. Note that, if it could be assumed that databases are deterministic, the 2PC mechanism would not be needed and the protocol would be functionally identical to active replication (as shown below).

4.4.2 Data Replication based on Atomic Broadcast

The idea of using group communication primitives to implement database replication has been around for quite some time. However, it has not been until recently that the problem has been tackled with sufficient depth so as to provide realistic solutions [SR96, KA98, KPAS99a]. The basic idea behind this approach is to use the total order guaranteed by ABCAST to provide a hint to the transaction manager on how to order conflicting operations. Thus, the client submits its request to one database server which then broadcasts the request to all other database servers (note that in distributed systems, the client broadcasts the request directly to all servers). Instead of 2 Phase Locking, the server coordination is done based on the total order guaranteed by ABCAST and using some techniques to obtain the locks in a consistent manner at all sites [KA98, KPAS99b]. Then the operation is executed and a response sent back to the client. The following steps are involved in this approach (see Figure 9).

1. The client sends the request to the local server.
2. The server forwards the request to all servers, which coordinate using the order given by the Atomic Broadcast.
3. The servers execute the transaction. If two operations conflict, they are executed in the order of the atomic broadcast.
4. There is no coordination at this point.
5. The local server sends back the response to the client.
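A sketch of this idea follows (hypothetical names; the lock handling is greatly simplified compared to the protocols in [KA98, KPAS99b]): every server grants locks for conflicting operations strictly in the delivery order of the atomic broadcast, so the serialisation order of conflicting operations is the same at every replica.

```python
from collections import defaultdict, deque

class Server:
    """Grants locks in ABCAST delivery order, so conflicting operations
    serialise identically at every replica (a simplification of [KA98])."""
    def __init__(self, name):
        self.name = name
        self.lock_queues = defaultdict(deque)   # item -> FIFO of waiting transactions
        self.history = []                       # order in which writes were applied

    def deliver(self, txn_id, item):
        # SC: the delivery order of the broadcast decides who queues first.
        self.lock_queues[item].append(txn_id)

    def run(self):
        # EX: execute transactions as their locks reach the head of the queue.
        for item, queue in self.lock_queues.items():
            while queue:
                self.history.append((queue.popleft(), item))

servers = [Server("s1"), Server("s2")]
# Total order produced by ABCAST: t1 before t2, both writing item "x".
for txn, item in [("t1", "x"), ("t2", "x")]:
    for s in servers:
        s.deliver(txn, item)
for s in servers:
    s.run()
assert servers[0].history == servers[1].history   # identical serialisation, no AC phase here
```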
Figure 9: Eager update everywhere replication based on atomic broadcast.
The similarities between active replication and eager update everywhere using ABCAST are obvious when Figures 2 and 9 are compared. The only significant difference is the interaction between the client and the system, as pointed out above. Regarding the determinism of the databases, a complete study of the requirements and the conditions under which ABCAST can be used for database replication, and of when an Agreement Coordination phase is necessary, can be found in [KA98].
stale but inconsistent. Reconciliation is needed to decide which updates are the winners and which transactions must be undone. There are some reconciliation schemes around; however, most of them work on a per-object basis. This is enough in the case where one transaction consists of one operation, but it is not sufficient when transactions consist of several operations on different objects.
A straightforward solution in the case of our simple model is to run an Atomic Broadcast and determine the after-commit order according to the order of the atomic broadcast. Note that the concept of laziness, while existing in distributed systems approaches [RL92], is not widely used there. This reflects the fact that those solutions are mainly developed for fault-tolerance purposes, making an eager approach obligatory. Lazy approaches, on the other hand, are a straightforward solution if performance is the main issue: response times have to be short, which does not allow for any communication within a transaction.
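A toy sketch of such a reconciliation for the single-operation model (hypothetical names, not a protocol from the literature): both sites have tentatively committed conflicting transactions, and the total order produced by the atomic broadcast decides the after-commit order, i.e. which transactions win and which must be undone.

```python
def reconcile(tentative_commits, abcast_order):
    """Decide the after-commit order by the total order of ABCAST.
    Each tentatively committed transaction is a (txn_id, item, value) tuple;
    transactions whose writes are superseded later in that order lose."""
    position = {txn_id: i for i, txn_id in enumerate(abcast_order)}
    ordered = sorted(tentative_commits, key=lambda t: position[t[0]])
    state, last_writer = {}, {}
    for txn_id, item, value in ordered:
        state[item] = value
        last_writer[item] = txn_id
    winners = set(last_writer.values())
    losers = [t[0] for t in tentative_commits if t[0] not in winners]
    return state, losers

# Two sites committed conflicting updates on "x"; the agreed order makes t2 the winner.
state, losers = reconcile([("t1", "x", 1), ("t2", "x", 2)], ["t1", "t2"])
print(state, losers)   # {'x': 2} ['t1']
```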
5 Transactions
In many databases, transactions are not one single operation and are not executed via a stored procedure. Instead, transactions are a partial order of read and write operations which are not necessarily available for processing at the same time. This has important consequences for replication protocols. In particular, the protocol has to deal more explicitly with the execution of the transaction, taking into account multiple operations. This new set of protocols has no equivalent in distributed systems, since the notion of transaction is not that common in this community.
Figure 12: Eager primary copy approach for transactions.
5.4.2 Certification Based Database Replication
When using ABCAST to send the operations to all replicas, the resulting total order has no bearing on the serialisation order that needs to be produced. For this reason, it does not make much sense to use ABCAST to send every operation of a transaction separately. It makes sense, however, to use shadow copies at one site to perform the operations and then, once the transaction is completed, send all the changes in one single message [KA98]. Due to the fact that now a transaction manager has to unbundle these messages, the agreement coordination phase gets more complicated, since it involves deciding whether the operations can be executed correctly. This can be seen as a certification step during which sites make sure they can execute transactions in the order specified by the total order established by ABCAST (Figure 14).
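A sketch of such a certification step (hypothetical names, heavily simplified with respect to [KA98]): each site processes the delivered transactions in ABCAST order and aborts a transaction whose reads have been invalidated by a transaction certified after it started.

```python
def certify(delivered_in_abcast_order):
    """Each entry is (txn_id, readset, writeset, start_pos): the transaction read
    its data when 'start_pos' transactions had already been certified locally.
    A transaction aborts if a transaction certified after its start wrote an
    item it read (a simplified backward-validation test)."""
    committed = []          # list of (position, writeset) of certified transactions
    decisions = {}
    for txn_id, readset, writeset, start_pos in delivered_in_abcast_order:
        conflict = any(pos >= start_pos and (readset & ws)
                       for pos, ws in committed)
        if conflict:
            decisions[txn_id] = "abort"
        else:
            decisions[txn_id] = "commit"
            committed.append((len(committed), writeset))
    return decisions

# t1 and t2 both started from the initial state (start_pos=0) and both read/write x;
# t1 is delivered first and commits, so t2 must abort.
print(certify([("t1", {"x"}, {"x"}, 0),
               ("t2", {"x"}, {"x"}, 0)]))   # {'t1': 'commit', 't2': 'abort'}
```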
6 Discussion
This paper presents a general comparison of replication approaches used in the distributed systems and database communities. Our approach was to first characterise replication algorithms using a generic framework. Our generic framework identifies five basic steps and, although simple, allowed us to classify classical replication protocols described in the literature on distributed systems and databases. Figure 15 summarises the combinations of the replication techniques presented in the paper that guarantee strong consistency (i.e., linearisability and one-copy serialisability). From Figure 15 we see that any replication technique that ensures strong consistency has an SC and/or an AC step before the END step.
All techniques have at least one synchronisation step (SC or AC). If the execution step (EX) is deterministic, no synchronisation after EX is needed, as the execution will yield the same result on all servers. For the same reason, if only one server does the execution step, there is no need for synchronisation before the execution.
Figure 15: Combinations of the phases (RE, SC, EX, AC, END) that guarantee strong consistency.
Figure 16 summarises the different replication techniques in the case of single operation transactions. Several conclusions can be drawn from this figure. First, primary copy and passive replication schemes share one common trait: they do not have an SC phase (since the primary does the processing, there is no need for early synchronisation between replicas). Furthermore, update everywhere replication schemes need the initial SC phase before an update can be executed by the replicas. The only exceptions are the certification based techniques that use Atomic Broadcast (Sect. 5.4.2). Those techniques are optimistic in the sense that they do the processing without initial synchronisation, and abort transactions in order to maintain consistency. Finally, the difference between eager and lazy replication techniques is the ordering of the AC and END phases: in the eager technique, the AC phase comes first, while in the lazy technique, the END phase comes first.

Despite different models, constraints and terminologies, replication algorithms for distributed systems and databases bear several similarities. These similarities put into evidence the need for stronger cooperation between both communities. For example, replicated databases could benefit from the abstractions of distributed systems.
Figure 16: Functional models (RE, SC, EX, AC, END) of the replication techniques for single operation transactions: active, passive, and semi-active replication; eager primary copy; eager update everywhere with distributed locking; eager update everywhere with ABCAST; certification based replication; lazy primary copy; and lazy update everywhere.
Presently, we are planning a performance study of the different approaches, taking into account different workloads and failure assumptions.
References
[AAAS97] D. Agrawal, G. Alonso, A. El Abbadi, and I. Stanoi. Exploiting atomic broadcast in replicated databases. In Proceedings of EuroPar (EuroPar'97), Passau (Germany), 1997.
[AD76] P. A. Alsberg and J. D. Day. A principle for resilient sharing of distributed resources. In Proceedings of the International Conference on Software Engineering, October 1976.
[AKA+96] G. Alonso, M. Kamath, D. Agrawal, A. El Abbadi, R. Günthör, and C. Mohan. Advanced transaction models in the workflow contexts. In Proceedings of the International Conference on Data Engineering, New Orleans, February 1996.
[AS87] B. Alpern and F. B. Schneider. Recognizing safety and liveness. Distributed Computing, 2:117-126, 1987.
[AW94] H. Attiya and J. Welch. Sequential consistency versus linearizability. ACM Transactions on Computer Systems, 12(2):91-122, May 1994.
[BHG87] P. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley, 1987.
[Bir93] K. P. Birman. The process group approach to reliable distributed computing. Communications of the ACM, 36(12):37-53, December 1993.
[BJ87] K. P. Birman and T. A. Joseph. Exploiting virtual synchrony in distributed systems. In Proceedings of the 11th ACM Symposium on OS Principles, pages 123-138, Austin, TX, USA, November 1987. ACM SIGOPS, ACM.
[BSS91] K. P. Birman, A. Schiper, and P. Stephenson. Lightweight causal and atomic group multicast. ACM Transactions on Computer Systems, 9(3):272-314, August 1991.
[Dra98] Information & Communications Systems Research Group, ETH Zürich, and Laboratoire de Systèmes d'Exploitation (LSE), EPF Lausanne. DRAGON: Database Replication Based on Group Communication, May 1998. http://www.inf.ethz.ch/department/IS/iks/research/dragon.html.
[DSS98] X. Défago, A. Schiper, and N. Sergent. Semi-passive replication. In Proceedings of the 17th IEEE Symposium on Reliable Distributed Systems (SRDS), pages 43-50, West Lafayette, IN, USA, October 1998.
[GHPO96] J. N. Gray, P. Helland, P. O'Neil, and D. Shasha. The dangers of replication and a solution. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 173-182, Montreal, Canada, June 1996. SIGMOD. Microsoft Technical Report MSR-TR-96-17.
[GR93] J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Data Management Systems. Morgan Kaufmann Publishers, Inc., San Mateo (CA), USA, 1993.
[GS97] R. Guerraoui and A. Schiper. Software-based replication for fault tolerance. IEEE Computer, 30(4):68-74, April 1997.
[HAA99] J. Holliday, D. Agrawal, and A. El Abbadi. The performance of database replication with group multicast. In Proceedings of the IEEE International Symposium on Fault Tolerant Computing (FTCS-29), pages 158-165, 1999.
[HT93] V. Hadzilacos and S. Toueg. Fault-tolerant broadcasts and related problems. In Sape Mullender, editor, Distributed Systems, chapter 5. Addison-Wesley, second edition, 1993.
[KA98] B. Kemme and G. Alonso. A suite of database replication protocols based on group communication primitives. In Proceedings of the 18th International Conference on Distributed Computing Systems (ICDCS), Amsterdam, The Netherlands, May 1998.
[KA99] B. Kemme and G. Alonso. Transactions, messages and events: Merging group communication and database systems. In 3rd European Research Seminar on Advances in Distributed Systems (ERSADS'99), Madeira Island (Portugal), April 23-28, 1999. BROADCAST Esprit WG 22455.
[KPAS99a] B. Kemme, F. Pedone, G. Alonso, and A. Schiper. Processing transactions over optimistic atomic broadcast protocols. In Proceedings of the International Conference on Distributed Computing Systems, Austin, Texas, 1999. To appear.
[KPAS99b] B. Kemme, F. Pedone, G. Alonso, and A. Schiper. Using optimistic atomic broadcast in transaction processing systems. Technical report, Department of Computer Science, ETH Zürich, March 1999.
[PCD91] D. Powell, M. Chérèque, and D. Drackley. Fault-tolerance in Delta-4*. ACM Operating Systems Review, SIGOPS, 25(2):122-125, April 1991.
[PGS97] F. Pedone, R. Guerraoui, and A. Schiper. Transaction reordering in replicated databases. In Proceedings of the 16th Symposium on Reliable Distributed Systems (SRDS-16), Durham, North Carolina, USA, October 1997.
[PGS98] F. Pedone, R. Guerraoui, and A. Schiper. Exploiting atomic broadcast in replicated databases. In Proceedings of EuroPar (EuroPar'98), September 1998.
[PMS99] E. Pacitti, P. Minet, and E. Simon. Fast algorithms for maintaining replica consistency in lazy master replicated databases. In Proceedings of the 25th International Conference on Very Large Databases, Edinburgh, Scotland, UK, 7-10 September 1999.
[RL92] R. Ladin, B. Liskov, and S. Ghemawat. Providing high availability using lazy replication. ACM Transactions on Computer Systems, 10(4):360-391, November 1992.
[RTKA96] M. Raynal, G. Thia-Kime, and M. Ahamad. From serializable to causal transactions for collaborative applications. Technical Report 983, Institut de Recherche en Informatique et Systèmes Aléatoires, February 1996.
[SAA98] I. Stanoi, D. Agrawal, and A. El Abbadi. Using broadcast primitives in replicated databases. In Proceedings of the International Conference on Distributed Computing Systems (ICDCS'98), pages 148-155, Amsterdam, The Netherlands, May 1998.
[Sch90] F. B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys, 22(4):299-319, December 1990.
[SR96] A. Schiper and M. Raynal. From group communication to transactions in distributed systems. Communications of the ACM, 39(4):84-87, April 1996.
[SS93] A. Schiper and A. Sandoz. Uniform reliable multicast in a virtually synchronous environment. In Proceedings of the 13th International Conference on Distributed Computing Systems (ICDCS-13), pages 561-568, Pittsburgh, Pennsylvania, USA, May 1993. IEEE Computer Society Press.
[Sto79] M. Stonebraker. Concurrency control and consistency of multiple copies of data in distributed INGRES. IEEE Transactions on Software Engineering, SE-5:188-194, May 1979.