MUREX: Mutable Replica Control for P2P Storage

This presentation describes MUREX, a mutable replica control scheme for structured peer-to-peer storage systems. MUREX uses multi-column read/write quorums to maintain data consistency for synchronous replication under the crash-recovery fault model. It provides three mechanisms - replica pointers, on-demand replica regeneration, and leased locks - to handle the replica migration, replica acquisition, and state synchronization challenges of P2P storage systems. Analysis and simulation results show that MUREX achieves one-copy equivalence with low communication cost even under node failures.


MUREX: A Mutable Replica Control Scheme for Structured Peer-to-Peer Storage Systems

Presented by Jehn-Ruey Jiang
National Central University
Taiwan, R.O.C.
Outline
- P2P Storage Systems
- The Problems
- MUREX
- Analysis and Simulation
- Conclusion
2/40
Outline
- P2P Storage Systems
- The Problems
- MUREX
- Analysis and Simulation
- Conclusion
3/40
P2P Storage Systems
- Aggregate idle storage across the Internet into one huge storage space
- Toward global storage systems:
  - Massive numbers of nodes
  - Massive capacity
4/40
Unstructured vs. Structured
- Unstructured:
  - No restriction on the interconnection of nodes
  - Easy to build but not scalable
- Structured: Our Focus!!
  - Based on DHTs (Distributed Hash Tables)
  - More scalable
5/40
Non-Mutable vs. Mutable
- Non-mutable (read-only):
  - CFS
  - PAST
  - Charles
- Mutable: Our Focus!!
  - Ivy
  - Eliot
  - Oasis
  - Om
6/40
Replication
- Data objects are replicated for fault tolerance and high data availability
- Some DHTs provide replication utilities, but these are usually used to replicate routing state
- The proposed protocol replicates data objects at the application layer, so it can be built on top of any DHT
7/40
One-Copy Equivalence
- The data consistency criterion: the set of replicas must behave as if there were only a single copy
- Conditions (see the lock-compatibility sketch below):
  1. No pair of write operations can proceed at the same time
  2. No pair of a read operation and a write operation can proceed at the same time
  3. A read operation always returns the replica that the last write operation wrote
8/40
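Conditions 1 and 2 amount to ordinary read/write lock compatibility: only two reads may overlap. A minimal sketch (condition 3, freshness, is enforced separately by the quorum intersection property covered later):

```python
def may_proceed(held: str, requested: str) -> bool:
    """Lock compatibility implied by conditions 1 and 2:
    read-read pairs may run concurrently; any pair involving
    a write is mutually exclusive."""
    return held == "read" and requested == "read"
```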
Synchronous vs. Asynchronous
- Synchronous replication: Our Focus!!
  - Each write operation must finish updating all replicas before the next write operation proceeds
  - Strict data consistency
  - Long operation latency
- Asynchronous replication
  - A write is applied to the local replica; the data object is then asynchronously written to the other replicas
  - May violate data consistency
  - Shorter latency
  - Needs log-based mechanisms to roll back the system
9/40
Fault Models
- Fail-stop
  - Nodes simply stop functioning when they fail
- Crash-recovery
  - Failures are detectable
  - Nodes can recover and rejoin the system after state synchronization
- Byzantine
  - Nodes may act arbitrarily
10/40
Outline
- P2P Storage Systems
- The Problems
- MUREX
- Analysis and Simulation
- Conclusion
11/40
Three Problems
- Replica migration
- Replica acquisition
- State synchronization
12/40
DHT – Node Joining
[Figure: a data object s is hashed to key ks in the hashed key space (0 to 2^128 - 1) and stored at the responsible peer node u. When node v joins, it takes over part of u's key range, so the replica must move from u to v: replica migration.]
13/40
DHT – Node Leaving
[Figure: a data object r is hashed to key kr and stored at node q. When q leaves, node p takes over q's key range and must obtain the replica of r: replica acquisition. When a failed node recovers and rejoins, its replicas must be brought up to date: state synchronization. A consistent-hashing sketch of the key-to-node mapping follows.]
14/40
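To make replica migration and acquisition concrete, here is a minimal consistent-hashing sketch (illustrative only: MUREX sits on top of an existing DHT, and every name below is hypothetical). A key is stored at its clockwise successor on the ring, so a join or leave shifts responsibility for some keys:

```python
import hashlib
from bisect import bisect_left

RING = 1 << 128  # the hashed key space [0, 2^128 - 1]

def hashed_key(name: str) -> int:
    """Map a name onto the ring (MD5 is exactly 128 bits)."""
    return int.from_bytes(hashlib.md5(name.encode()).digest(), "big")

def successor(node_ids: list[int], key: int) -> int:
    """The node responsible for a key: the first node clockwise from it."""
    i = bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]  # wrap past 2^128 - 1 back to 0

nodes = sorted(hashed_key(f"node-{i}") for i in range(8))
k = hashed_key("some-data-object")
print("owner before join:", successor(nodes, k))

# A new node whose id falls between k and the old owner takes over k,
# so the stored replica must move to it (replica migration); when a
# node leaves, its successor must obtain the replica (acquisition).
nodes = sorted(nodes + [(k + 1) % RING])
print("owner after join: ", successor(nodes, k))
```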
Outline
- P2P Storage Systems
- The Problems
- MUREX
- Analysis and Simulation
- Conclusion
15/40
The Solution - MUREX
- A mutable replica control scheme
- Keeps one-copy equivalence for synchronous P2P storage replication under the crash-recovery fault model
- Based on multi-column read/write quorums
16/40
Operations
- Publish(CON, DON)
  - CON stands for CONtent
  - DON stands for Data Object Name
- Read(DON)
- Write(CON, DON)
17/40
Synchronous Replication
- n replicas for each data object
  - K1 = HASH1(Data Object Name), ..., Kn = HASHn(Data Object Name) (key derivation sketched below)
- Read/write quorums are used to maintain data consistency (one-copy equivalence)
18/40
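A sketch of how the n hashed keys might be derived from one data object name. Salting a single base hash with the replica index is the standard trick; the paper's actual HASH1..HASHn are not specified here, so treat this as an assumption:

```python
import hashlib

def replica_keys(data_object_name: str, n: int) -> list[int]:
    """Derive K1..Kn by salting one base hash with the replica index."""
    return [
        int.from_bytes(
            hashlib.sha1(f"{data_object_name}#{i}".encode()).digest(), "big"
        ) % (1 << 128)  # fold into the 128-bit hashed key space
        for i in range(1, n + 1)
    ]

keys = replica_keys("my-file.txt", n=5)  # one DHT key per replica
```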
Data Replication
[Figure: a data object r has replicas 1..n; hash functions 1..n map the data object name to keys k1, k2, ..., kn in the hashed key space (0 to 2^128 - 1), and each replica is stored at the peer node responsible for the corresponding key.]
19/40
Quorum-Based Schemes (1/2)
- High data availability and low communication cost
- n replicas with version numbers
- Read operation (sketched below):
  - Read-lock and access a read quorum
  - Obtain a replica with the largest version number
- Write operation (sketched below):
  - Write-lock and access a write quorum
  - Update all replicas in the quorum with the new version number: the largest version number + 1
20/40
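The read/write discipline above transcribes almost directly into code. A minimal sketch (class and function names are hypothetical, and the lock calls are stubs; MUREX replaces them with the leased locks described later):

```python
class Replica:
    """Toy replica: a value plus a version number."""
    def __init__(self):
        self.version, self.value = 0, None
    def read_lock(self): pass    # stub
    def write_lock(self): pass   # stub
    def unlock(self): pass       # stub

def quorum_read(read_quorum: list[Replica]):
    """Read-lock a read quorum and return the largest-version value."""
    for r in read_quorum:
        r.read_lock()
    value = max(read_quorum, key=lambda r: r.version).value
    for r in read_quorum:
        r.unlock()
    return value

def quorum_write(write_quorum: list[Replica], new_value) -> int:
    """Write-lock a write quorum; install the value with version = largest + 1."""
    for r in write_quorum:
        r.write_lock()
    new_version = max(r.version for r in write_quorum) + 1
    for r in write_quorum:
        r.value, r.version = new_value, new_version
        r.unlock()
    return new_version
```

Because any read quorum intersects any write quorum (next slide), at least one member of the read quorum carries the latest version, so taking the maximum returns the freshest replica.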
Quorum-Based Schemes (2/2)
- One-copy equivalence is guaranteed if we enforce:
  - Write-write and write-read lock exclusion
  - The intersection property: a non-empty intersection in any pair of
    - a read quorum and a write quorum
    - two write quorums
21/40
Multi-Column Quorums
- Smallest quorums: constant-sized quorums in the best case
  - Smaller quorums imply lower communication cost
- May achieve the highest data availability
22/40
Messages
- LOCK (WLOCK/RLOCK)
- OK
- WAIT
- MISS
- UNLOCK
23/40
Algorithms for Quorum Construction
[The algorithm listings on this slide are not reproduced; a hedged sketch follows.]
24/40
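Since the construction itself did not survive extraction, the following is a hedged sketch of one standard multi-column construction: a quorum takes all of some column Ci plus one element from every later column. Any two such quorums intersect, and in the best case (i = m) a quorum is just the last column, which matches the constant-size claim. MUREX's exact read/write quorum definitions may differ in detail:

```python
from itertools import product

def multi_column_quorums(columns: list[list[str]]):
    """Enumerate quorums: for some i, all of column Ci plus one
    element from each column to its right."""
    for i, col in enumerate(columns):
        for picks in product(*columns[i + 1:]):
            yield set(col) | set(picks)

# A hypothetical structure with m = 3 columns of size s = 2.
cols = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
quorums = list(multi_column_quorums(cols))
# The intersection property holds pairwise, so using these as both
# read and write quorums satisfies both conditions of slide 21.
assert all(q1 & q2 for q1 in quorums for q2 in quorums)
```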
Three Mechanisms
- Replica pointer
- On-demand replica regeneration
- Leased lock
25/40
Replica Pointer
- A lightweight mechanism for migrating replicas
- A five-tuple (see the record sketch below): (hashed key, data object name, version number, lock state, actual storing location)
- Produced when a replica is first generated
- Moved between nodes instead of the actual data object
26/40
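The five-tuple maps naturally onto a small record. A sketch with hypothetical field names (the slide fixes only the five components):

```python
from dataclasses import dataclass

@dataclass
class ReplicaPointer:
    hashed_key: int         # position in the 128-bit hashed key space
    data_object_name: str
    version: int
    lock_state: str         # e.g. "unlocked", "read-locked", "write-locked"
    location: str           # id of the node actually storing the replica

# On a node join or leave, only this small record migrates to the newly
# responsible node; the bulky data object itself stays put.
```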
On-Demand Replica Regeneration (1/2)
- When node p receives LOCK from node u, it replies MISS (sketched below) if it
  - does not have the replica pointer, or
  - has a replica pointer indicating that node v stores the replica, but v is not alive
- After executing the desired read/write operation, node u sends the newest replica it obtained/generated to node p
27/40
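A sketch of the MISS rule (node, message, and helper names are hypothetical; is_alive stands for the failure detector that the crash-recovery model assumes):

```python
def is_alive(node_id: str) -> bool:
    """Stub failure detector: the crash-recovery model assumes
    failures are detectable."""
    raise NotImplementedError

def handle_lock(p, request) -> str:
    """How node p answers a LOCK request for a data object."""
    ptr = p.pointers.get(request.data_object_name)
    if ptr is None or not is_alive(ptr.location):
        # No usable replica here: requester u performs its operation,
        # then sends the newest replica back to p (regeneration).
        return "MISS"
    if p.can_grant(request):
        return "OK"
    return "WAIT"
```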
On-Demand Replica Regeneration (2/2)
- Replicas are acquired only when they are requested
- Dummy read operation
  - Performed periodically for rarely accessed data objects
  - Checks whether the replicas of a data object are still alive
  - Re-disseminates replicas to the proper nodes to keep data persistent
28/40
Leased Lock (1/2)
- A lock expires after a lease period L
- A node should release all its locks if it is not yet in the critical section (CS) and H > L - C - D holds, i.e., the remaining lease time L - H cannot cover the CS plus message propagation (sketched below)
  - H: the holding time of the lock
  - D: the propagation delay
  - C: the time to be spent in the CS
29/40
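The release test transcribes directly into code. A minimal sketch, assuming H, L, C, and D are measured in the same time units:

```python
def should_release(H: float, L: float, C: float, D: float,
                   in_cs: bool) -> bool:
    """Release all held locks if the node is not yet in the CS and
    the remaining lease L - H cannot cover the CS plus propagation."""
    return (not in_cs) and H > L - C - D

# Example: held for 3.0 with a 5.0 lease, needing 1.5 in the CS plus
# 1.0 propagation: 3.0 > 5.0 - 1.5 - 1.0 = 2.5, so release and retry.
assert should_release(H=3.0, L=5.0, C=1.5, D=1.0, in_cs=False)
```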
Leased Lock (2/2)
- After releasing its locks, a node starts over and requests the locks again after a random backoff time
- If a node starts to substitute for another node at time T, a newly acquired replica can start to reply to LOCK requests at time T + L, once every lease granted by the substituted node has expired
30/40
Correctness
- Theorem 1 (Safety): MUREX ensures the one-copy equivalence consistency criterion
- Theorem 2 (Liveness): There is neither deadlock nor starvation in MUREX
31/40
Outline
- P2P Storage Systems
- The Problems
- MUREX
- Analysis and Simulation
- Conclusion
32/40
Communication Cost
- With no contention, the best case is 3s messages, where s is the size of the last column of the multi-column quorums
  - One LOCK, one OK, and one UNLOCK per member of the quorum (e.g., nine messages when s = 3)
- When failures occur
  - Communication cost increases gradually
  - Worst case: O(n) messages: a node sends LOCK messages to all n replicas (with the related UNLOCK, OK, and WAIT messages)
33/40
Simulation
- Environment
  - The underlying DHT is Tornado
  - Quorums under four multi-column structures: MC(5, 3), MC(4, 3), MC(5, 2), and MC(4, 2)
  - For MC(m, s), the lease period is assumed to be m * (turn-around time)
  - 2000 nodes in the system
  - Simulation runs for 3000 seconds
  - 10000 operations are requested, half reads and half writes
  - Each request is assumed to be destined for a random file (data object)
34/40
Simulation Result 1
- 1st experiment: no nodes join or leave
[Figure: the probability that a node succeeds in performing the desired operation before the leased lock expires, plotted against the degree of contention.]
35/40
Simulation Result 2
- 2nd experiment: 200 out of 2000 nodes may join/leave at will
36/40
Simulation Result 3
- 3rd experiment: 0, 50, 100, or 200 out of 2000 nodes may leave
37/40
Outline
- P2P Storage Systems
- The Problems
- MUREX
- Analysis and Simulation
- Conclusion
38/40
Conclusion
- Identified three problems for synchronous replication in P2P mutable storage systems:
  - Replica migration
  - Replica acquisition
  - State synchronization
- Proposed MUREX, which solves these problems with:
  - Multi-column read/write quorums
  - Replica pointers
  - On-demand replica regeneration
  - Leased locks
39/40
Thanks!!
40/40
