Lect5 - Distributed Shared Memory
Lect5 - Distributed Shared Memory
Memory
Distributed Operating
Systems
Distributed Shared Memory
( DSM )
1
Distributed shared memory
2
Cont….
Shared memory exists only virtually.
Similar concept to virtual memory.
DSM also known as DSVM.
DSM provides a virtual address space shared among
processes on loosely coupled processors.
DSM is basically an abstraction that integrates the local
memory of different machine into a single logical entity
shared by cooperating processes.
3
Distributed shared memory
4
DSM Architecture
Each node of the system consist of one or more CPUs
and memory unit.
Nodes are connected by high speed communication
network.
Simple message passing system for nodes to exchange
information.
Main memory of individual nodes is used to cache pieces
of shared memory space.
Reduces network latency
5
Cont….
Memory mapping manager routine maps local memory
to shared virtual memory.
Shared memory space is partitioned into blocks.
Data caching is used in DSM system to reduce network
latency.
The basic unit of caching is a memory block.
6
Cont….
If data is not available in local memory network block
fault is generated.
7
Design and implementation issues
1. Granularity
2. Structure of Shared memory
3. Memory coherence and access synchronization
4. Data location and access
5. Replacement strategy
6. Thrashing
7. Heterogeneity
8
Granularity
9
Structure of Shared memory
10
Memory Coherence & Access Synchronization
Replicated and shared data items may
simultaneously be available in the main memories of
a number of nodes.
11
Data location and access
12
Replacement strategy
Ifthe local memory of a node is full, a cache miss at that
node implies not only a fetch of accessed data block from
a remote node but also a replacement.
13
Thrashing
14
Heterogeneity
15
Granularity
16
Cont…
Factors influencing block size selection:
1. Paging overhead
2. Directory size
3. Thrashing
4. False sharing
17
Paging overhead
18
Directory size
19
Thrashing
20
False sharing
21
Cont…
22
Cont…
Using page size as block size
24
Structure of Shared-Memory Space
25
Cont…
Three approaches:
1. No structuring
2. Structuring by data type
3. Structuring as a database
26
No structuring
Advantages:
Convenient to choose any suitable page size as the
unit of sharing.
A fixed grain size may be used for all application.
Simple and easy to design such a DSM system.
27
Structuring by data type
The shared memory space is structured as a collection of
objects in the source language.
28
Structuring as a database
Structures the shared memory like a database.
29
Consistency Models
Consistency requirement vary from application to
application.
30
Types of Consistency Models
1. Strict Consistency model
2. Sequential Consistency model
3. Casual consistency model
4. Pipelined Random Access Memory consistency
model (PRAM)
5. Processor Consistency model
6. Weak consistency model
7. Release consistency model
31
Strict Consistency Model
The strongest form with most stringent consistency
requirement.
32
Sequential Consistency Model
Proposed by Lamport in 1979.
33
Cont…
Example :
Operations performed in order
1. Read(r1)
2. write(w1)
3. Read(r2)
34
Cont…
35
Cont…
The consistency requirement of the sequential
consistency model is weaker than that of the strict
consistency model.
36
Casual Consistency Model (PCR)
Proposed by hutto and ahemad in 1990.
37
Pipelined Random Access Memory
(PRAM) Consistency model
Proposed by Lipton and Sandberg in 1988.
A weaker consistency semantics.
Ensures that all write operations performed by a single
process are seen by all other processes in the order in
which they were performed as if all write operations
performed by a single process are in a pipeline.
38
PRAM
P1 – W11 & W12
P2 – W21 & W22
P3 can see these as ((W11,W12), (W21,W22))
P4 can see these as ((W21,W22), (W11,W12))
39
Cont…
Advantages:
Simple and easy to implement.
Good performance
Limitation:
All processes may not agree on the same order of memory
reference operations.
40
Processor Consistency model
Proposed by Goodman in 1989.
Memory coherence:
For any memory location, all processes agree on the same order
of all write operation to that location.
42
Cont…
Idea of weak consistency:
Better performance can be achieved if consistency
is enforced on a group of memory reference
operations rather than on individual memory
reference operations.
43
Cont…
Requirements:
1. All accesses to synchronization variables must obey
sequential consistency semantics.
2. All previous write operations must be completed
everywhere before an access to a synchronization
variable is allowed.
3. All previous accesses to synchronization variables must
be completed before access to a non-synchronization
variable is allowed.
4. Better performance at the cost of putting extra burden on the
programmers.
44
Release consistency model
45
Cont…
Acquire
Used to tell the system that a process is
entering Critical section.
46
Cont…
Release
Used to tell the system that a process has just
exited critical section.
47
Cont…
A variation of release consistency is lazy
release consistency proposed by Keleher in
1992.
Better performance.
48
Implementing sequential consistency
model
Protocol for implementing depends on
whether the DSM system allows:
replication & / or
Migration
Of shared memory data blocks.
49
Cont…
Strategies:
1. Nonreplicated, Nonmigrating blocks (NRNMB)
2. Nonreplicated, Migrating blocks (NRMB)
3. Replicated, migrating blocks (RMB)
4. Replicated, Nonmigrating blocks (RNMB)
50
NRNMBS
Simplest strategy.
51
Cont…
52
Cont…
Enforcing sequential consistency is trivial.
Drawbacks:
Serializing data access creates a bottleneck.
53
Cont…
Data locating in the NRNMB strategy:
There is a single copy of each block in the entire
system.
54
NRMBS
Each block of the shared memory has a single
copy in the entire system.
Migration is allowed.
Owner node of a block changes as soon as the
block is migrating to a new node.
56
Cont…
Advantage :
No communications cost are incurred when a
process accesses data currently held locally.
Advantage of data access locality.
Drawbacks:
Prone to thrashing problem.
No parallelism.
57
Data locating in the NRMB strategy
58
Cont…
Following method used:
1. Broadcasting
2. Centralized server algorithm
3. Fixed distributed server algorithm
4. Dynamic distributed server algorithm
59
Broadcasting
60
Cont…
61
Cont….
On a fault,
Fault handler of the faulting node broadcasts a
read/write request on the network.
Disadvantage:
Not scalable.
62
Centralized server algorithm
63
Cont…
A centralized server maintains a block table
that contains the location information for all
block in the shared memory space.
Drawbacks:
A centralized server serializes location
queries, reducing parallelism.
65
Cont..
A direct extension of the centralized server
scheme.
67
Cont…
Does not use any block manager and
attempts to keep track of the ownership
information of all block in each node.
69
Cont…
Replication tends to increase the cost of write
operation.
Problem to maintain consistency.
70
Cont…
Protocols to ensure sequential consistency:
1. Write-invalidate
2. Write-update
71
Write-invalidate
All copies of a piece of data except one are invalidated
before a write can be performed on it.
72
Cont…
73
Write-update
A write operation is carried out by updating all copies of
the data on which the write is performed.
74
Cont…
75
Cont…
76
Cont…
77
Cont…
Write-update approach is very expensive.
78
Cont…
Read request:
If there is a local block containing the data and if it
is valid, the request is satisfied by accessing the
local copy of data.
79
Cont…
Write request:
If there is a local block containing the data and if it
is valid and writable, the request is immediately
satisfied by accessing the local copy of the data.
80
Data Locating in the RMB strategy
Write-invalidate protocol
81
Cont…
Following algorithms may be used:
1. Broadcasting
2. Centralized-server algorithm
3. Fixed distributed-server algorithm
4. Dynamic distributed-server algorithm
82
Cont… : Broadcasting
83
Cont…
Each node has an owned block table.
Contains an entry for each block for which the node is the
owner.
85
Cont…
Similar to the centralized-server algorithm of the NRMB
strategy.
Each entry of the block table, managed by the
centralized server, has:
an owner-node field.
Indicates the current owner node of the block.
a copy-set field
Contains a list of nodes having a valid copy of the block.
86
Cont…
87
Fixed distributed-server algorithm
88
Cont…
Role of the centralized server is distributed to several
distributed servers.
There is a block manager on several node.
When fault occurs, the mapping function is used to find
the location of the block manager.
89
Dynamic distributed-server algorithm
90
Cont…
Each node has a block table that contains an entry for all
block in the shared memory space.
92
Cont…
93
Replicated, Non-migrating Block
A shared memory block may be replicated at multiple
node of the system but the location of each replica is
fixed.
All replicas of a block are kept consistent by updating
them all in case of a write access.
Sequential consistency is ensured by using a global
sequencer to sequence the write operation of all nodes.
94
Data locating in the RNMB strategy
Following characteristics:
1. The replica location of a block never change.
2. All replicas of a data block are kept consistent.
3. Only a read request can be directly sent to one of the
node having a replica and all write requests have to
be sent to the sequencer.
95
96
Cont….
The sequencer assign the next sequence number to
the requested modification.
97
Munin : A Release Consistent DSM
System
Pg 259 – Pg 262
98
Replacement strategy
DSM system allows shared memory block to
be dynamically migrated/replicated.
Issues:
Which block should be replaced to make space for
a newly required block?
99
Which block to replace?
100
Cont…
Usage based algorithms
Keep track of the history of usage of a cache line and use
this information to make replacement decisions.
101
Cont…
Non-usage based algorithms
Do
not take the record of use of cache lines into account
when doing replacement.
102
Cont…
Fixed space algorithms
Assumes that the cache size is fixed.
103
Cont…
Classification of each memory block of a
node:
Unused
a free memory block that is not currently being
used.
Nil
a block that has been invalidated.
104
Cont…
Read-only
a block for which the node has only read access
right.
Read-owned
a block for which the node has only read access
right but it also owner of the block.
Writable
a block for which the node has write access
permission.
105
Cont…
Replacement priorities
1. Both unused and nil block have the highest
replacement priority.
2. The read-only block have the next replacement
priority.
3. Read-owned and writable block for which
replica(s) exist on some other node (s) have the
next replacement priority.
4. Read-owned and writable block for which only this
node has a copy have the lowest replacement
priority.
106
Where to place a replaced block
Approaches for storing a useful block:
1. Using secondary store.
1. The block is simply transferred on to a local disk.
2. It does not waste any memory space.
107
Thrashing
Occurs when the system spends a large amount of time
transferring shared data blocks from one node to
another.
108
Cont…
Situations for thrashing:
1. When interleaved data accesses is made by
processes on two or more node.
109
Cont…
110
Cont…
111
Cont…
112
Reading
Pg 266 – Pg 270
113
Advantage of DSM
1. Simpler Abstraction
1. The shared memory programming paradigm shields
the application programmers from many such low
level concern.
2. Advantage:
1. Simpler abstraction to the application programmers of
loosely coupled distributed memory machine.
114
Advantage of DSM
2. Better portability of distributed application programs
The access protocol is consistent with the way
sequential application access data.
115
Cont…
3. Better performance of some application
Some application using DSM can outperform their
message passing counterparts due to:
Locality of data
116
Cont…
Locality of data
The computation model of DSM is to make the data
more accessible by moving it around.
117
Cont…
On demand data moment
The computation data modeled of DSM also
facilitates on demand moment of data as they are
being accessed.
118
Cont…
4. Flexible communication environment
The message passing paradigm requires recipients
identification and coexistence of the sender and
receiver processes.
119
Cont…
120