
Distributed Shared

Memory
Distributed Operating
Systems
Distributed Shared Memory
( DSM )

1
Distributed shared memory

 The DSM paradigm provides processes with a shared address space.
 Primitives for shared memory:
1. Read(address)
2. Write(address, data)
 The shared memory paradigm gives the system the illusion of physically shared memory.
 DSM refers to the shared memory paradigm applied to loosely coupled distributed-memory systems.
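As a concrete illustration of the two primitives named above, here is a minimal sketch in Python. It models the shared address space as a dictionary of fixed-size blocks; all names (BLOCK_SIZE, shared_store, read, write) are illustrative assumptions, not part of any particular DSM implementation.

```python
# Minimal sketch of the two DSM primitives, assuming a hypothetical
# block store shared by all nodes (names are illustrative only).
BLOCK_SIZE = 1024
shared_store = {}          # block number -> bytearray, stands in for remote memory

def read(address):
    """Read(address): return the byte stored at a shared address."""
    block, offset = divmod(address, BLOCK_SIZE)
    data = shared_store.setdefault(block, bytearray(BLOCK_SIZE))
    return data[offset]

def write(address, value):
    """Write(address, data): store a byte at a shared address."""
    block, offset = divmod(address, BLOCK_SIZE)
    data = shared_store.setdefault(block, bytearray(BLOCK_SIZE))
    data[offset] = value

write(2048, 7)
assert read(2048) == 7
```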

2
Cont….
 Shared memory exists only virtually.
 Similar concept to virtual memory.
 DSM also known as DSVM.
 DSM provides a virtual address space shared among
processes on loosely coupled processors.
 DSM is basically an abstraction that integrates the local memory of different machines into a single logical entity shared by cooperating processes.

3
Distributed shared memory

4
DSM Architecture
 Each node of the system consists of one or more CPUs and a memory unit.
 Nodes are connected by a high-speed communication network.
 A simple message-passing system lets nodes exchange information.
 The main memory of individual nodes is used to cache pieces of the shared memory space.
 This caching reduces network latency.

5
Cont….
 A memory-mapping manager routine maps local memory onto the shared virtual memory.
 The shared memory space is partitioned into blocks.
 Data caching is used in a DSM system to reduce network latency.
 The basic unit of caching is a memory block.

6
Cont….
 If the accessed data is not available in local memory, a network block fault is generated.

 On a network block fault:
 The missing block is migrated from the remote node to the client process’s node, and the OS maps it into the application’s address space.
 Data blocks keep migrating from one node to another on demand, but no communication is visible to the user processes.
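The sketch below illustrates this fault-and-migrate behavior under stated assumptions: a hypothetical directory maps each block number to its current owner node, and on a fault the block moves (rather than being copied) to the faulting node. The class and variable names are made up for illustration.

```python
# Sketch of handling a network block fault by migrating the block
# from its current owner to the faulting node (all names hypothetical).
class Node:
    def __init__(self, name):
        self.name = name
        self.local_blocks = {}      # block number -> data cached locally

    def access(self, block_no, directory):
        if block_no not in self.local_blocks:        # network block fault
            owner = directory[block_no]
            data = owner.local_blocks.pop(block_no)  # migrate, don't copy
            self.local_blocks[block_no] = data
            directory[block_no] = self               # ownership moves with the block
        return self.local_blocks[block_no]

a, b = Node("A"), Node("B")
directory = {0: a}
a.local_blocks[0] = bytearray(b"hello")
print(b.access(0, directory))    # migrates block 0 from A to B on demand
```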

7
Design and implementation issues
1. Granularity
2. Structure of Shared memory
3. Memory coherence and access synchronization
4. Data location and access
5. Replacement strategy
6. Thrashing
7. Heterogeneity

8
Granularity

 Refers to the block size of DSM.


 The unit of sharing & the unit of data transfer across the
network when a network block fault occurs.

 Possible units are a few words, a page, or a few pages.

9
Structure of Shared memory

 Structure refers to the layout of the shared data in


memory.

 Dependent on the type of applications that the DSM


system is intended to support.

10
Memory Coherence & Access Synchronization
 Replicated and shared data items may
simultaneously be available in the main memories of
a number of nodes.

 Memory coherence problem


 Deals with the consistency of a piece of shared data lying
in the main memories of two or more nodes.

11
Data location and access

 To share data in a DSM, it should be possible to locate


and retrieve the data accessed by a user process.

12
Replacement strategy
 If the local memory of a node is full, a cache miss at that node implies not only a fetch of the accessed data block from a remote node but also a replacement.

 An existing data block must be replaced by the new data block.

13
Thrashing

 Data blocks migrate between nodes on demand.

 Therefore, if two nodes compete for write access to a single data item, the corresponding data block may be transferred back and forth repeatedly.

14
Heterogeneity

 A DSM system built for a homogeneous system need not address the heterogeneity issue.

 However, if the underlying system environment is heterogeneous, the DSM system must be designed to take care of heterogeneity so that it functions properly with machines having different architectures.

15
Granularity

 The most visible parameter in the design of a DSM system is the block size.

 Sending many small packets of data is more expensive than sending a single large packet.

16
Cont…
 Factors influencing block size selection:
1. Paging overhead
2. Directory size
3. Thrashing
4. False sharing

17
Paging overhead

 A process is likely to access a large region of its


shared address space in a small amount of time.

 The paging overhead is less for large block size as


compared to the paging overhead for small block size.

18
Directory size

 The larger the block size, the smaller the


directory.

 Result: reduced directory management overhead


for larger block size.

19
Thrashing

 The problem of thrashing may occur:

 when data items in the same data block are updated by multiple nodes at the same time.

 It can occur with any block size, but is more likely with larger block sizes.

20
False sharing

 Occurs when two different processes access two unrelated variables that reside in the same data block.

21
Cont…

 The larger the block size, the higher the probability of false sharing.

 False sharing of a block may lead to a thrashing problem.
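A toy sketch of this effect follows; it assumes a write-invalidate protocol and a hypothetical 1 KB block, with two unrelated variables x and y placed in the same block. All names and the counting scheme are illustrative only.

```python
# Toy illustration of false sharing: x and y are unrelated variables
# that happen to live in the same 1 KB block, so every write by one
# node invalidates the other node's copy (write-invalidate assumed).
BLOCK_SIZE = 1024
ADDR_X, ADDR_Y = 0, 8                 # same block (0), different variables

def block_of(addr):
    return addr // BLOCK_SIZE

invalidations = 0
valid_copy = {"A": True, "B": True}   # does each node hold a valid copy of block 0?

def write(node, addr):
    global invalidations
    other = "B" if node == "A" else "A"
    if valid_copy[other]:             # other node's copy must be invalidated first
        valid_copy[other] = False
        invalidations += 1
    valid_copy[node] = True

for _ in range(100):                  # A updates x while B updates y
    write("A", ADDR_X)
    write("B", ADDR_Y)

print(block_of(ADDR_X) == block_of(ADDR_Y), invalidations)  # True 200
```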

22
Cont…
Using page size as block size

 The relative advantages and disadvantages of small and large block sizes make it difficult for a DSM designer to decide on a proper block size.

 On an Intel Core 2 Duo 64-bit system the page size is 4 KB, which is typical of most desktop PC architectures.

 Linux kernel 2.6.x versions also support large pages (4 MB).
23
Advantages of using page size as
block size

 Use of existing page fault schemes to trigger a DSM page fault.
 Allows access-rights control.
 The page size does not impose undue communication overhead at the time of a network page fault.
 The page size is a suitable data entity unit with respect to memory contention.

24
Structure of Shared-Memory Space

 Structure defines the abstract view of the shared


memory space for application programmers.

 The structure and granularity of a DSM system are


closely related.

25
Cont…
Three approaches:
1. No structuring
2. Structuring by data type
3. Structuring as a database

26
No structuring

 The shared memory space is simply a linear


array of words.

 Advantages:
 Convenient to choose any suitable page size as the
unit of sharing.
 A fixed grain size may be used for all applications.
 Simple and easy to design such a DSM system.

27
Structuring by data type
 The shared memory space is structured as a collection of
objects in the source language.

 The granularity in such DSM system is an object.

 The DSM system uses a variable grain size to match the size of the object/variable being accessed by the application.
 This complicates the design and implementation.

28
Structuring as a database
 Structures the shared memory like a database.

 The shared memory space is ordered as an associative memory called a tuple space.

 To perform an update, the old data item in the DSM is replaced by a new data item.

 Processes select tuples by specifying the number of their fields and the values or types of those fields.

 Access to shared data is non-transparent.
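The sketch below shows a Linda-style tuple space of the kind described above, with tuples selected by field values or types and updates done by taking the old tuple out and putting a new one in. The class name TupleSpace and the operation names out, in_, and rd are assumptions for illustration, not a specific DSM system's API.

```python
# Minimal sketch of a Linda-style tuple space (names are illustrative).
class TupleSpace:
    def __init__(self):
        self.tuples = []

    def out(self, tup):                      # insert a tuple
        self.tuples.append(tup)

    def _match(self, tup, pattern):
        if len(tup) != len(pattern):
            return False
        for field, want in zip(tup, pattern):
            if isinstance(want, type):       # match by type
                if not isinstance(field, want):
                    return False
            elif field != want:              # match by value
                return False
        return True

    def rd(self, pattern):                   # read without removing
        for tup in self.tuples:
            if self._match(tup, pattern):
                return tup
        return None

    def in_(self, pattern):                  # read and remove
        tup = self.rd(pattern)
        if tup is not None:
            self.tuples.remove(tup)
        return tup

ts = TupleSpace()
ts.out(("counter", 1))
old = ts.in_(("counter", int))               # take the old tuple out ...
ts.out(("counter", old[1] + 1))              # ... and put the updated one back
print(ts.rd(("counter", int)))               # ('counter', 2)
```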

29
Consistency Models
 Consistency requirements vary from application to application.

 A consistency model refers to the degree of consistency that has to be maintained for the shared memory data.

 If a system supports a stronger consistency model, any weaker consistency model is automatically supported, but the converse is not true.

30
Types of Consistency Models
1. Strict Consistency model
2. Sequential Consistency model
3. Causal consistency model
4. Pipelined Random Access Memory consistency
model (PRAM)
5. Processor Consistency model
6. Weak consistency model
7. Release consistency model

31
Strict Consistency Model
 The strongest form, with the most stringent consistency requirements.

 Value returned by a read operation on a memory


address is always the same as the value written by
the most recent write operation to that address.

 All writes instantaneously become visible to all


processes.

 Implementation requires the existence of an absolute


global time to synchronize clocks of all nodes.
 Practically impossible.

32
Sequential Consistency Model
 Proposed by Lamport in 1979.

 All processes see the same order of all memory access


operations on the shared memory.

 The exact order in which the access operations are interleaved does not matter.

33
Cont…
Example :
 Operations performed in order
1. Read(r1)
2. write(w1)
3. Read(r2)

34
Cont…

 For a strictly consistent memory, the only acceptable ordering is (r1, w1, r2).

 For the sequential consistency model, any of the orderings (r1,w1,r2), (r1,r2,w1), (w1,r1,r2), (w1,r2,r1), (r2,r1,w1), (r2,w1,r1) is correct, provided all processes see the same ordering.
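The "any interleaving, but the same one for everybody" condition can be sketched as a small check. The function name and the representation of a view as a tuple of operation labels are assumptions made for this illustration.

```python
# Sketch: sequential consistency accepts any interleaving of r1, w1, r2
# as long as every process observes the same one.
from itertools import permutations

ops = ["r1", "w1", "r2"]
legal_orders = set(permutations(ops))        # all 6 interleavings are candidates

def sequentially_consistent(observed_by_process):
    """True if every process saw one and the same legal interleaving."""
    views = set(observed_by_process)
    return len(views) == 1 and views.pop() in legal_orders

print(sequentially_consistent([("w1", "r2", "r1"), ("w1", "r2", "r1")]))  # True
print(sequentially_consistent([("w1", "r2", "r1"), ("r1", "w1", "r2")]))  # False: views differ
```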

35
Cont…
 The consistency requirement of the sequential
consistency model is weaker than that of the strict
consistency model.

 A sequentially consistent memory provides one-copy (single-copy) semantics.

 It is acceptable to most applications.

36
Causal Consistency Model
 Proposed by Hutto and Ahamad in 1990.

 All processes see memory reference operations that are potentially causally related (PCR) in the same, correct order.
 Operations that are not causally related may be seen by different processes in different orders.

 Implementation requires constructing and maintaining dependency graphs for memory access operations.

37
Pipelined Random Access Memory
(PRAM) Consistency model
 Proposed by Lipton and Sandberg in 1988.
 A weaker consistency semantics.
 Ensures that all write operations performed by a single
process are seen by all other processes in the order in
which they were performed as if all write operations
performed by a single process are in a pipeline.

38
PRAM
 P1 – W11 & W12
 P2 – W21 & W22
 P3 can see these as ((W11,W12), (W21,W22))
 P4 can see these as ((W21,W22), (W11,W12))
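The example above can be sketched as a check on observers' views: PRAM only demands that each process's own writes keep their program order, so P3's and P4's differing views are both legal. The function pram_ok and the write labels follow the slide; the rest is illustrative.

```python
# Sketch: PRAM consistency only requires that writes of the same process
# are seen in program order; different observers may interleave processes
# differently (write labels W11, W12, W21, W22 follow the slide).
program_order = {"P1": ["W11", "W12"], "P2": ["W21", "W22"]}

def pram_ok(view):
    """True if 'view' preserves each process's own write order."""
    for writes in program_order.values():
        positions = [view.index(w) for w in writes]
        if positions != sorted(positions):
            return False
    return True

print(pram_ok(["W11", "W12", "W21", "W22"]))   # True  (P3's view)
print(pram_ok(["W21", "W22", "W11", "W12"]))   # True  (P4's view)
print(pram_ok(["W12", "W11", "W21", "W22"]))   # False (breaks P1's pipeline)
```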

39
Cont…

 Advantages:
 Simple and easy to implement.
 Good performance

 Limitation:
 All processes may not agree on the same order of memory
reference operations.

40
Processor Consistency model
 Proposed by Goodman in 1989.

 Very similar to PRAM model with additional restriction of


memory coherence.

 Memory coherence:
 For any memory location, all processes agree on the same order
of all write operation to that location.

 Processor consistency ensures that all write operations


performed on the same location are seen by all
processes in the same order.
41
Weak consistency model
 Proposed by Dubois in 1988.

 Many applications have the following characteristics:
 It is not necessary to show the changes made by every write operation to other processes.

 Isolated accesses to shared variables are rare.

42
Cont…
 Idea of weak consistency:
 Better performance can be achieved if consistency
is enforced on a group of memory reference
operations rather than on individual memory
reference operations.

 Uses a special variable called a


synchronization variable to synchronize
memory.

43
Cont…
Requirements (a small sketch follows this list):
1. All accesses to synchronization variables must obey sequential consistency semantics.
2. All previous write operations must be completed everywhere before an access to a synchronization variable is allowed.
3. All previous accesses to synchronization variables must be completed before an access to a non-synchronization variable is allowed.
The model gives better performance at the cost of putting an extra burden on the programmer.
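A minimal sketch of the synchronization-variable idea, assuming a single synchronization point and a hypothetical class name: ordinary writes are buffered locally and only become visible to others when the synchronization variable is accessed (requirement 2 above).

```python
# Minimal sketch of weak consistency with one synchronization variable:
# ordinary writes are buffered and only propagated on sync() (illustrative).
class WeaklyConsistentMemory:
    def __init__(self):
        self.shared = {}                 # memory visible to all processes
        self.pending = []                # writes not yet propagated

    def write(self, addr, value):        # ordinary write: buffer it
        self.pending.append((addr, value))

    def read(self, addr):                # ordinary read: local view only
        return self.shared.get(addr)

    def sync(self):                      # access to the synchronization variable
        for addr, value in self.pending: # flush all previous writes everywhere
            self.shared[addr] = value
        self.pending.clear()

mem = WeaklyConsistentMemory()
mem.write(0, 10)
mem.write(1, 20)
print(mem.read(0))   # None: changes not yet visible
mem.sync()           # synchronize memory
print(mem.read(0))   # 10
```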
44
Release consistency model

 Enhancement of weak consistency model.

 Use of two synchronization variables:


 Acquire
 Release

45
Cont…

 Acquire
 Used to tell the system that a process is
entering Critical section.

 Results in propagating changes made by other nodes to the process's node.

46
Cont…

 Release
 Used to tell the system that a process has just
exited critical section.

 Results in propagating changes made by the


process to other nodes.
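Acquire and release can be sketched as follows, under the assumption of a hypothetical central "home" copy standing in for the rest of the system: acquire pulls other nodes' changes in, release pushes this node's changes out. All names are illustrative.

```python
# Sketch of release consistency with a hypothetical central "home" copy:
# acquire() pulls other nodes' changes in, release() pushes local changes out.
home = {}                                # stands in for the rest of the system

class Node:
    def __init__(self):
        self.local = {}                  # this node's view of shared data
        self.dirty = {}                  # changes made inside the critical section

    def acquire(self):                   # entering the critical section
        self.local.update(home)          # propagate others' changes to this node

    def write(self, addr, value):
        self.local[addr] = value
        self.dirty[addr] = value

    def release(self):                   # leaving the critical section
        home.update(self.dirty)          # propagate this node's changes outward
        self.dirty.clear()

p, q = Node(), Node()
p.acquire(); p.write("x", 1); p.release()
q.acquire()                              # q now sees p's update
print(q.local["x"])                      # 1
```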

47
Cont…
 A variation of release consistency is lazy
release consistency proposed by Keleher in
1992.

 Modifications are not sent to other nodes at the


time of release but only on demand.

 Better performance.

48
Implementing sequential consistency
model
 The protocol for implementing sequential consistency depends on whether the DSM system allows:
 replication and/or
 migration
of shared memory data blocks.

49
Cont…
Strategies:
1. Nonreplicated, Nonmigrating blocks (NRNMB)
2. Nonreplicated, Migrating blocks (NRMB)
3. Replicated, migrating blocks (RMB)
4. Replicated, Nonmigrating blocks (RNMB)

50
NRNMBS
 Simplest strategy.

 Each block of the shared memory has a single


copy whose location is always fixed.

51
Cont…

52
Cont…
 Enforcing sequential consistency is trivial.

 Method is simple and easy to implement.

 Drawbacks:
 Serializing data access creates a bottleneck.

 Parallelism is not possible.

53
Cont…
Data locating in the NRNMB strategy:
 There is a single copy of each block in the entire
system.

 The location of a block never changes.

 Requires only a simple mapping function to map a block onto a node.

54
NRMBS
 Each block of the shared memory has a single
copy in the entire system.

 Migration is allowed.
 The owner node of a block changes as soon as the block is migrated to a new node.

 Only the processes executing on one node


can read or write a given data item at any
time.
55
Cont…

56
Cont…
 Advantages:
 No communication costs are incurred when a process accesses data currently held locally.
 Takes advantage of data access locality.

 Drawbacks:
 Prone to thrashing problem.
 No parallelism.

57
Data locating in the NRMB strategy

 There is a single copy of each block, but the location of a block keeps changing dynamically.

58
Cont…
The following methods may be used:
1. Broadcasting
2. Centralized server algorithm
3. Fixed distributed server algorithm
4. Dynamic distributed server algorithm

59
Broadcasting

 Each node maintains an owned block table


that contains an entry for each block for
which the node is the current owner.

60
Cont…

61
Cont….

 On a fault,
 Fault handler of the faulting node broadcasts a
read/write request on the network.

Disadvantage:
 Not scalable.

62
Centralized server algorithm

63
Cont…
 A centralized server maintains a block table that contains the location information for all blocks in the shared memory space.

Drawbacks:
 A centralized server serializes location
queries, reducing parallelism.

 The failure of the centralized server will cause


the DSM system to stop functioning.
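The centralized-server lookup can be sketched as below; the class CentralServer, its block_table, and the policy of transferring ownership to the requester on every fault are assumptions made for this illustration of the NRMB case.

```python
# Sketch of the centralized-server data-locating scheme (NRMB):
# one server keeps a block table mapping each block to its owner node.
class CentralServer:
    def __init__(self, initial_owners):
        self.block_table = dict(initial_owners)   # block number -> owner node name

    def locate(self, block_no, requester):
        """Answer a fault request: return the current owner and migrate ownership."""
        owner = self.block_table[block_no]
        self.block_table[block_no] = requester    # block migrates to the requester
        return owner

server = CentralServer({0: "N1", 1: "N2"})
print(server.locate(0, "N3"))   # "N1": fetch the block from N1
print(server.locate(0, "N2"))   # "N3": the block now lives on N3
```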
64
Fixed distributed server algorithm

65
Cont..
 A direct extension of the centralized server
scheme.

 There is a block manager on each of several nodes.

 Each block manager is given a predetermined subset of data blocks to manage.

 On a fault, the mapping function is used to find the block manager of the currently accessed block.
66
Dynamic distributed server algorithm

67
Cont…
 Does not use any block manager; it attempts to keep track of the ownership information of all blocks in each node.

 Each node has a block table.

 Contains the ownership information for all blocks.

 Uses a probable-owner field per block.

 When a fault occurs, the faulting node extracts the probable-owner node information from its block table and sends a request for the block to that node.
68
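The probable-owner chain can be sketched as follows for a single block: requests are forwarded along probable-owner pointers until a node that points to itself (the true owner) is reached, and the visited entries are then updated to shorten future chains. Node names and the dictionary layout are illustrative assumptions.

```python
# Sketch of the dynamic distributed-server scheme: each node records a
# "probable owner" for the block and forwards fault requests along that
# chain until the true owner is reached (names are illustrative).
probable_owner = {           # node -> probable owner of block 0
    "N1": "N2",
    "N2": "N3",
    "N3": "N3",              # N3 points to itself: it is the true owner
}

def find_owner(requester):
    node = requester
    hops = []
    while probable_owner[node] != node:      # follow the chain of probable owners
        hops.append(node)
        node = probable_owner[node]
    for visited in hops:                     # shorten the chain for future faults
        probable_owner[visited] = node
    return node

print(find_owner("N1"))      # N3, reached via N2; N1 and N2 now point straight at N3
```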
RMB - Replicated, migrating blocks
 Used to increase parallelism.

 Read operations are carried out in parallel at multiple


nodes by accessing the local copy of the data.

69
Cont…
 Replication tends to increase the cost of write operations.
 It also makes it harder to maintain consistency.

 The extra expense is worthwhile only if the read/write ratio is large.

70
Cont…
 Protocols to ensure sequential consistency:
1. Write-invalidate
2. Write-update

71
Write-invalidate
 All copies of a piece of data except one are invalidated
before a write can be performed on it.

 After invalidation of a block, only the node that performs


the write operation on the block holds the modified
version of the block.
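The invalidate-before-write rule can be sketched as below, assuming a hypothetical copy-set per block (the set of nodes holding a valid copy); the dictionaries and node names are illustrative only.

```python
# Sketch of the write-invalidate protocol: before a node writes a block,
# all other copies are invalidated, so only the writer holds a valid,
# modified copy afterwards (all names are illustrative).
copy_set = {0: {"N1", "N2", "N3"}}           # nodes holding a valid copy of block 0
blocks = {("N1", 0): 5, ("N2", 0): 5, ("N3", 0): 5}

def write(node, block_no, value):
    for other in list(copy_set[block_no]):
        if other != node:
            blocks.pop((other, block_no), None)   # invalidate the remote copy
            copy_set[block_no].discard(other)
    copy_set[block_no].add(node)
    blocks[(node, block_no)] = value              # only the writer's copy remains

write("N2", 0, 9)
print(copy_set[0], blocks)   # {'N2'} {('N2', 0): 9}
```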

72
Cont…

73
Write-update
 A write operation is carried out by updating all copies of
the data on which the write is performed.

 On a write fault, the fault handler copies the accessed block from one of the nodes currently holding it to its own node.

 The write operation completes only after all the copies of


the block have been successfully updated.

74
Cont…

75
Cont…

 A global sequencer assigns a sequence number to each modification and multicasts it to all the nodes where a replica of the data block to be modified is located.

 The write operations are processed at each node in sequence number order.

 If verification of the sequence number fails, the node requests the sequencer to retransmit the missing modification.
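A minimal sketch of this sequencer-based ordering follows. The Sequencer and Replica classes, the retained log used for retransmission, and the way a replica detects a gap in sequence numbers are all illustrative assumptions.

```python
# Sketch of write-update ordering through a global sequencer: every
# modification gets a sequence number, and each replica applies updates
# strictly in that order, fetching any missing modification first.
class Sequencer:
    def __init__(self):
        self.next_seq = 0
        self.log = {}                      # seq -> (addr, value), kept for retransmission

    def assign(self, addr, value):
        seq = self.next_seq
        self.next_seq += 1
        self.log[seq] = (addr, value)
        return seq

class Replica:
    def __init__(self):
        self.memory = {}
        self.expected = 0                  # next sequence number this replica will apply

    def deliver(self, seq, addr, value, sequencer):
        while self.expected < seq:         # verification failed: a modification was missed
            miss_addr, miss_val = sequencer.log[self.expected]
            self.memory[miss_addr] = miss_val
            self.expected += 1
        if seq == self.expected:
            self.memory[addr] = value
            self.expected += 1

seqr = Sequencer()
r = Replica()
s0 = seqr.assign("x", 1)
s1 = seqr.assign("x", 2)
r.deliver(s1, "x", 2, seqr)                # arrives out of order; s0 is re-fetched first
print(r.memory["x"], r.expected)           # 2 2
```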

76
Cont…

77
Cont…
 The write-update approach is very expensive.

 In the write-invalidate approach, updates are only propagated when data is read, and several updates can take place before communication is necessary.
 A status tag is associated with each block.
 It indicates whether the block is valid, shared, read-only, or writable.

78
Cont…

Read request:
 If there is a local block containing the data and if it
is valid, the request is satisfied by accessing the
local copy of data.

 Otherwise, the fault handler of the requesting node


generates a read fault.

79
Cont…

Write request:
 If there is a local block containing the data and if it
is valid and writable, the request is immediately
satisfied by accessing the local copy of the data.

 Otherwise, the fault handler of the requesting node generates a write fault, obtains a valid copy of the block, and changes its status to writable.

80
Data Locating in the RMB strategy

The write-invalidate protocol requires:

1. Locating the owner of a block.

2. Keeping track of the nodes that currently have a valid copy of the block.

81
Cont…
Following algorithms may be used:
1. Broadcasting
2. Centralized-server algorithm
3. Fixed distributed-server algorithm
4. Dynamic distributed-server algorithm

82
Cont… : Broadcasting

83
Cont…
 Each node has an owned block table.
 Contains an entry for each block for which the node is the
owner.

 A copy-set field in each entry


 Contains a list of nodes that currently have a valid copy of the
corresponding block.

 When a read fault occurs, the faulting node sends a broadcast read request on the network.

 When a write fault occurs, the node sends a broadcast write request on the network.
84
Centralized-server algorithm

85
Cont…
 Similar to the centralized-server algorithm of the NRMB
strategy.
 Each entry of the block table, managed by the
centralized server, has:
 an owner-node field.
 Indicates the current owner node of the block.
 a copy-set field
 Contains a list of nodes having a valid copy of the block.

86
Cont…

 When a read/write fault occurs, the node sends a read/write fault request for the accessed block to the centralized server.

87
Fixed distributed-server algorithm

88
Cont…
 Role of the centralized server is distributed to several
distributed servers.
 There is a block manager on each of several nodes.
 When a fault occurs, the mapping function is used to find the location of the block manager for the accessed block.

89
Dynamic distributed-server algorithm

90
Cont…
 Each node has a block table that contains an entry for every block in the shared memory space.

 When a fault occurs, the fault handler of the faulting node extracts the probable-owner node information and sends a request for the block to that node.

 On receiving the block and copy-set information, the node sends an invalidation request to all nodes in the copy-set.
91
Cont…
 To reduce the length of the chain of nodes to be traversed to reach the true owner of a block, the probable-owner field of a block in a node is updated:
1. Whenever the node receives an invalidation request.
2. Whenever the node relinquishes ownership, that is on a
write fault.
3. Whenever the node forwards a fault request.

92
Cont…

 In the first two cases, the probable-owner field is changed to the new owner of the block.

 In the third case, the probable-owner field is changed to the original faulting node.

93
Replicated, Non-migrating Block
 A shared memory block may be replicated at multiple nodes of the system, but the location of each replica is fixed.
 All replicas of a block are kept consistent by updating
them all in case of a write access.
 Sequential consistency is ensured by using a global
sequencer to sequence the write operation of all nodes.

94
Data locating in the RNMB strategy
It has the following characteristics:
1. The replica locations of a block never change.
2. All replicas of a data block are kept consistent.
3. Only a read request can be sent directly to one of the nodes having a replica; all write requests have to be sent to the sequencer.

95
96
Cont….
 The sequencer assigns the next sequence number to the requested modification.

 It multicasts the modification with this sequence


number to all the nodes listed in the replica set field.

97
Munin : A Release Consistent DSM
System
 Pg 259 – Pg 262

98
Replacement strategy
 DSM system allows shared memory block to
be dynamically migrated/replicated.

 Issues:
 Which block should be replaced to make space for
a newly required block?

 Where should the replaced block be placed?

99
Which block to replace?

 Classification of replacement algorithms:


1. Usage-based versus non-usage-based

2. Fixed space versus variable space

100
Cont…
Usage based algorithms
 Keep track of the history of usage of a cache line and use
this information to make replacement decisions.

 Example : Least recently used.

 Suitable for DSM


 Due to data access locality feature.

101
Cont…
Non-usage based algorithms
 Do not take the usage history of cache lines into account when making replacement decisions.

 Examples: first-in first-out (FIFO) and Rand (random or pseudorandom).

102
Cont…
Fixed space algorithms
 Assume that the cache size is fixed.

Variable space algorithms
 Assume that the cache size can be changed dynamically depending on need.
 A fetch does not imply a replacement, and a swap-out can take place without a corresponding fetch.
 Variable space algorithms are not suitable for a DSM system.

103
Cont…
 Classification of each memory block of a
node:
 Unused
 a free memory block that is not currently being
used.

 Nil
 a block that has been invalidated.

104
Cont…

 Read-only
 a block for which the node has only read access
right.

 Read-owned
 a block for which the node has only read access right but is also the owner of the block.

 Writable
 a block for which the node has write access
permission.

105
Cont…
 Replacement priorities (a victim-selection sketch follows this list):
1. Both unused and nil blocks have the highest replacement priority.
2. Read-only blocks have the next replacement priority.
3. Read-owned and writable blocks for which replica(s) exist on some other node(s) have the next replacement priority.
4. Read-owned and writable blocks for which only this node has a copy have the lowest replacement priority.
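A minimal sketch of victim selection based on these priority classes, breaking ties within a class by least recently used; the class labels, the PRIORITY table, and the tuple layout are assumptions made for illustration.

```python
# Sketch of victim selection using the priority classes above; within a
# class, the least recently used block is evicted first (illustrative only).
PRIORITY = {"unused": 0, "nil": 0, "read-only": 1,
            "read-owned-replicated": 2, "writable-replicated": 2,
            "read-owned-sole": 3, "writable-sole": 3}

def choose_victim(blocks):
    """blocks: list of (block_no, state, last_used_timestamp)."""
    return min(blocks, key=lambda b: (PRIORITY[b[1]], b[2]))[0]

blocks = [(7, "writable-sole", 10), (3, "read-only", 50), (9, "nil", 99)]
print(choose_victim(blocks))   # 9: nil blocks go first regardless of recency
```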
106
Where to place a replaced block
 Approaches for storing a useful block:
1. Using secondary storage.
 The block is simply transferred onto a local disk.
 This does not waste any memory space.
2. Using the memory space of other nodes.
 This method requires each node to maintain a table of the free memory space in all other nodes.

107
Thrashing
 Occurs when the system spends a large amount of time
transferring shared data blocks from one node to
another.

 Degrades system performance considerably.

108
Cont…
 Situations for thrashing:
1. When interleaved data accesses are made by processes on two or more nodes.

2. When blocks with read-only permission are repeatedly invalidated soon after they are replicated.

109
Cont…

Methods for solving Thrashing problems:

1. Providing application-controlled locks.

 Locking data to prevent other nodes from accessing that data for a short period of time can reduce thrashing.

110
Cont…

2. Nailing a block to a node for a minimum amount of time.

 A block is not allowed to be taken away from a node until a minimum amount of time t elapses after its allocation to that node.

 Drawback: it is difficult to choose an appropriate value for the time t.

111
Cont…

3. Tailoring the coherence algorithm to the shared-


data usage patterns.

Thrashing can also be minimized by using different


coherence protocols for shared data having
different characteristics. For example, the
coherence protocol used in Munin for write shared
variables avoids the false sharing problem, which
ultimately results in the avoidance of thrashing.

112
Reading
 Pg 266 – Pg 270

113
Advantage of DSM
1. Simpler abstraction
 The shared memory programming paradigm shields application programmers from many low-level concerns.
 This provides a simpler abstraction to application programmers of loosely coupled distributed-memory machines.

114
Advantage of DSM
2. Better portability of distributed application programs
 The access protocol is consistent with the way sequential applications access data.

 This allows a more natural transition from sequential to distributed applications.

 Applications may perform worse if they are rewritten to use message passing directly.

115
Cont…
3. Better performance of some applications
 Some applications using DSM can outperform their message-passing counterparts due to:

 Locality of data

 On-demand data movement

 Larger memory space

116
Cont…

 Locality of data
 The computation model of DSM is to make the data
more accessible by moving it around.

 This results in a reduced overall communication cost for such applications.

117
Cont…
 On-demand data movement
 The computation model of DSM also facilitates on-demand movement of data as it is being accessed.

 The time needed for the data exchange phase is often dictated by the throughput of existing communication bottlenecks.

 The on-demand data movement facility provided by DSM eliminates the data exchange phase.

118
Cont…
4. Flexible communication environment
 The message-passing paradigm requires recipient identification and coexistence of the sender and receiver processes.

 In DSM, the sender process need not specify the identity of the receiver processes of the data.

119
Cont…

5. Ease of process migration


 Migration of a process is tedious and time-consuming.

 The computation model of DSM provides the facility of on-demand migration of data between processors.

120
