Module IV – Memory System

This document covers the concepts of memory systems in computer architecture, including types of memory such as RAM, ROM, cache, and secondary storage. It explains the memory hierarchy, the importance of locality in memory access, and the organization of main memory, emphasizing the role of registers and memory operations. Additionally, it discusses semiconductor memory types and their internal organization, providing insights into memory access speed and data transfer mechanisms.


COA – MODULE IV

MEMORY SYSTEM
Text/Reference Books

The following sources were used in preparing these slides:

1. Computer Organization, Carl Hamacher, Zvonko Vranesic, Safwat Zaky, McGraw-Hill.
2. Computer Organization and Architecture: Designing for Performance, William Stallings, Prentice-Hall India.
3. Computer Architecture: A Quantitative Approach, John L. Hennessy and David A. Patterson, Morgan Kaufmann.
4. Structured Computer Organization, Andrew S. Tanenbaum, Prentice-Hall India.
5. Computer Organization and Design, P. Pal Chaudhuri, Prentice-Hall India.
 In this module, we will study the following topics:
i) Basic concepts of memory
ii) Semiconductor RAM memories
iii) Semiconductor read-only memories
iv) Cache memory
(Topic beyond the syllabus: performance considerations)
v) Secondary storage
(Topics beyond the syllabus: concept of virtual memory, memory management)
Introduction
 Digital computers work on the stored-program concept introduced by Von Neumann.
 Memory is used to store information, which includes both programs and data.
 For several reasons, we have different kinds of memory, and we use different kinds of memory at different levels.
 The memory of a computer is broadly divided into two categories:
❍ internal and ❍ external
 Internal memory is used by the CPU to perform tasks; external memory is used to store bulk information, such as large software and data.
 Memory stores information in digital form. The memory hierarchy is:
 ❍ Registers
 ❍ Cache memory
 ❍ Main memory
 ❍ Magnetic disk
 ❍ Removable media (magnetic tape)
Memory Hierarchy

 Programmers want unlimited amounts of memory with low latency.
 Fast memory technology is more expensive per bit than slower memory.
 Solution: organize the memory system into a hierarchy.
 The entire addressable memory space is available in the largest, slowest memory.
 Incrementally smaller and faster memories, each containing a subset of the memory below it, proceed in steps up toward the processor.
 Temporal and spatial locality ensure that nearly all references can be found in the smaller memories.
 This gives the illusion of a large, fast memory being presented to the processor.
Exploiting the Memory Hierarchy

 Locality
 Spatial locality: data is more likely to be accessed if neighboring data has been accessed (e.g., elements of a sequentially accessed array).
 Temporal locality: data is more likely to be accessed if it has been accessed recently (e.g., code within a loop).
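The two kinds of locality above can be made concrete with a small sketch. This is an illustrative model (the function name and the "cache holds every fetched block" simplification are ours, not from the slides): addresses are grouped into blocks, and any access to an already-fetched block counts as a hit.

```python
def hit_rate(addresses, block_size):
    """Fraction of accesses that hit, assuming a cache large enough to
    keep every block once fetched: an access to a previously fetched
    block is a hit, the first access to a block is a miss."""
    fetched = set()
    hits = 0
    for a in addresses:
        block = a // block_size
        if block in fetched:
            hits += 1
        else:
            fetched.add(block)
    return hits / len(addresses)

# Spatial locality: a sequential sweep over 64 words with 4-word blocks
seq = list(range(64))
# Temporal locality: a loop body touching the same 4 words repeatedly
loop = [0, 1, 2, 3] * 16

print(hit_rate(seq, 4))   # 0.75: 3 of every 4 sequential accesses hit
print(hit_rate(loop, 4))  # 0.984375: only the first block fetch misses
```

Even this toy model shows why fetching whole blocks (spatial locality) and keeping recently used blocks (temporal locality) make small fast memories effective.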
MEMORY HIERARCHY

[Figure: storage hierarchy – CPU registers (fastest) at the top; cache, main memory, magnetic disk, and magnetic tape below, with the frequency of access of the memory by the processor decreasing down the hierarchy. Auxiliary memory (magnetic disks and magnetic tapes) connects to main memory through an I/O processor.]

• The goal of the memory hierarchy is to obtain the highest possible access speed while minimizing the total cost of the memory system.
• Speed of memory access is critical; the idea is to bring the instructions and data that will be used in the near future as close to the processor as possible.
An Example Memory Hierarchy

[Figure: pyramid of levels L0–L5. Smaller, faster, and costlier (per byte) storage devices toward the top; larger, slower, and cheaper (per byte) toward the bottom.]
L0: CPU registers hold words retrieved from the L1 cache.
L1: on-chip L1 cache (SRAM) holds cache lines retrieved from the L2 cache.
L2: off-chip L2 cache (SRAM) holds cache lines retrieved from main memory.
L3: main memory (DRAM) holds disk blocks retrieved from local disks.
L4: local secondary storage (local disks) holds files retrieved from disks on remote network servers.
L5: remote secondary storage (tapes, distributed file systems, Web servers).

• Pre-fetching: the data transferred between layers is usually bigger than requested, to anticipate using the extra blocks of data.
Memory Hierarchy of a Modern Computer System
 By taking advantage of the principle of locality:
 Present the user with as much memory as is available in the cheapest technology.
 Provide access at the speed offered by the fastest technology.

[Figure: processor (control, datapath, registers) with on-chip cache (SRAM), second-level cache (SRAM), main memory (DRAM), secondary storage (disk), and tertiary storage (tape).]

Level:        L0      L1     L2      L3      L4     L5
Speed (ns):   0.5 ns  1.0 ns 2.5 ns  10 ns   10 ms  10 s
Size (bytes): 256 B   16 KB  256 KB  512 MB  80 GB  20 TB
Memory Hierarchy

[Figure: processor registers (L0) at the top; primary cache (L1), secondary cache (L2), main memory (L3), and magnetic disk (L4) below. Size and cost per bit increase, and speed decreases, moving down the hierarchy.]

• Fastest access is to the data held in processor registers. Registers are at the top of the memory hierarchy.
• A relatively small amount of memory can be implemented on the processor chip. This is the processor cache.
• There are two levels of cache. Level 1 (L1) cache is on the processor chip. Level 2 (L2) cache is between main memory and the processor.
• The next level is main memory, implemented as SIMMs. It is much larger, but much slower, than cache memory.
• The next level is magnetic disk: a huge amount of inexpensive secondary storage.
• Speed of memory access is critical; the idea is to bring the instructions and data that will be used in the near future as close to the processor as possible.
Main Memory
 The main memory of a computer is semiconductor memory. The main memory unit basically consists of two kinds of memory:
RAM (RWM): random-access memory, which is volatile in nature.
ROM: read-only memory, which is non-volatile.
 Permanent information is kept in ROM; the user space is basically in RAM.
 The smallest unit of information is the bit (binary digit);
 one memory cell can store one bit of information. Eight bits together are termed a byte.
 The maximum size of main memory that can be used in any computer is determined by the addressing scheme.
 A computer that generates 16-bit addresses is capable of addressing up to 2^16 = 64K memory locations.
 Similarly, for 32-bit addresses, the total capacity is 2^32 = 4G memory locations.
 In some computers, the smallest addressable unit of information is a memory word, and the machine is called word-addressable.
Some Basic Concepts
 Maximum size of the main memory
 Byte-addressable
 CPU–main memory connection

[Figure: the processor's MAR drives a k-bit address bus and its MDR an n-bit data bus to the memory, along with control lines (R/W, MFC, etc.). Up to 2^k addressable locations; word length = n bits.]
Organization of the Main Memory

[Figure: byte addresses within 32-bit words at word addresses 0, 4, 8, 12. In the big-endian assignment, the lower byte addresses are used for the more significant bytes of the word; in the little-endian assignment, the lower byte addresses are used for the less significant bytes. At the bottom, address bits 31–2 give the address of a word and bits 1–0 select a byte of the word.]

Fig. Organization of the main memory in a 32-bit byte-addressable computer.
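The two byte-order assignments in the figure can be observed directly using Python's standard `struct` module, which packs an integer into bytes in a chosen byte order (this demo is our illustration, not part of the slides):

```python
import struct

value = 0x01020304  # a 32-bit word

big = struct.pack(">I", value)     # big-endian packing
little = struct.pack("<I", value)  # little-endian packing

print(big.hex())     # 01020304 -- most significant byte at the lowest address
print(little.hex())  # 04030201 -- least significant byte at the lowest address
```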
Organization of the Main Memory (Contd.)
 In some computers, an individual address is assigned to each byte of information; such a machine is called a byte-addressable computer. In this computer, one memory word contains one or more memory bytes which can be addressed individually.
 In a byte-addressable 32-bit computer, each memory word contains 4 bytes. A possible way of address assignment is shown in the figure. The address of a word is always an integer multiple of 4.
 The main memory is usually designed to store and retrieve data in word-length quantities. The word length of a computer is generally defined by the number of bits actually stored or retrieved in one main memory access.
 Consider a machine with a 32-bit address bus. If the word size is 32 bits, then the high-order 30 bits specify the address of a word. If we want to access any byte of the word, it can be specified by the lower two bits of the address bus.
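The word/byte split described above is a pair of bit operations (function name ours, for illustration): clearing the low two bits yields the word address, and the low two bits select the byte within the word.

```python
def split_byte_address(addr):
    """For a 32-bit, byte-addressable machine with 4-byte words:
    the word address is a multiple of 4 (low two bits cleared),
    and the low two bits select the byte within the word."""
    word_address = addr & ~0x3   # clear the low two bits
    byte_in_word = addr & 0x3    # keep only the low two bits
    return word_address, byte_in_word

print(split_byte_address(13))  # (12, 1): byte 1 of the word at address 12
print(split_byte_address(8))   # (8, 0): byte 0 of the word at address 8
```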
Some Basic Concepts (Contd.)
◾ An important design issue is to provide a computer system with as large and fast a memory as possible, within a given cost target.
◾ Several techniques increase the effective size and speed of the memory:
 Cache memory (to increase the effective speed).
 Virtual memory (to increase the effective size).
 Data transfer between main memory and the CPU takes place through two CPU registers:
MAR (memory address register) and MDR (memory data register).
 If the MAR is k bits, then the total number of addressable memory locations is 2^k.
 If the MDR is n bits, then n bits of data are transferred in one memory cycle.
 In the above example, the size of the data bus is n bits and the size of the address bus is k bits.
 Control lines such as Read, Write, and Memory Function Complete (MFC) coordinate data transfers.
 In a byte-addressable computer, another control line must be added to indicate a byte transfer instead of a whole-word transfer.
Memory Operation
The CPU initiates a memory operation by loading the appropriate address into the MAR.
 Memory read operation:
The CPU sets the Read memory control line to 1.
The contents of the memory location are then brought into the MDR. The memory control circuitry indicates this to the CPU by setting MFC to 1.
 Memory write operation:
The CPU places the data into the MDR and sets the Write memory control line to 1.
Once the contents of the MDR are stored in the specified memory location, the memory control circuitry indicates the end of the operation by setting MFC to 1.
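The MAR/MDR/MFC handshake above can be sketched as a toy simulator. This is purely a model of the protocol's sequencing, under our own simplified naming; real memory control is hardware, not software.

```python
class SimpleMemory:
    """Toy model of the MAR/MDR handshake: the address goes into MAR,
    data moves through MDR, and MFC signals completion."""
    def __init__(self, size):
        self.cells = [0] * size
        self.mar = 0   # memory address register
        self.mdr = 0   # memory data register
        self.mfc = 0   # Memory Function Complete flag

    def read(self, address):
        self.mar = address
        self.mfc = 0
        self.mdr = self.cells[self.mar]  # memory places data in the MDR
        self.mfc = 1                     # completion signalled via MFC
        return self.mdr

    def write(self, address, data):
        self.mar = address
        self.mdr = data
        self.mfc = 0
        self.cells[self.mar] = self.mdr  # MDR contents stored at the address
        self.mfc = 1                     # completion signalled via MFC

mem = SimpleMemory(16)
mem.write(5, 42)
print(mem.read(5), mem.mfc)  # 42 1
```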
Measures for the speed of a memory:
 Memory access time:
A useful measure of the speed of a memory unit is the time that elapses between the initiation of an operation and its completion (e.g., the time between Read and MFC).
 Memory cycle time:
The minimum time delay between the initiation of two independent memory operations (e.g., two successive memory read operations).
 Memory cycle time is slightly longer than memory access time.
Semiconductor RAM Memories
Internal Organization of Memory Chips
 Each memory cell can hold one bit of information.
 Memory cells are organized in the form of an array.
 One row is one memory word.
 All cells of a row are connected to a common line, known as the "word line".
 The word line is connected to the address decoder.
 Sense/write circuits are connected to the data input/output lines of the memory chip.
 The storage element is modelled here with an SR latch, but in reality it is an electronic circuit made up of transistors.
 Memory constructed from transistors is known as semiconductor memory.
 Semiconductor memories are termed random-access memories (RAM) because any memory location can be accessed at random.
 Depending on the technology used to construct a RAM, there are two types of RAM:
SRAM: static random-access memory.
DRAM: dynamic random-access memory.
Memory Cell Operation

Control inputs to a binary cell:

Select  Read/Write  Memory operation
0       x           None
1       0           Read
1       1           Write
Chip Organization
 Consider an individual memory cell: the select line (CS) indicates whether the cell is active, and the control line (WR) indicates read or write; data passes over the data in / data out (sense) line.
 Let's say that each cell outputs 4 bits (i.e., word size = 4 bits), and we would like to hook four of these together for a 4-word memory.
Simplified Representation: 4-Bit Memory
 What one would see if this were packaged together.
Four-Word Memory, 4 Bits per Word

[Figure: four 4-bit cells selected by a 2-to-4 decoder on address lines A1, A0.]
Memory addresses: 0 (A1=0, A0=0), 1 (A1=0, A0=1), 2 (A1=1, A0=0), 3 (A1=1, A0=1).
Data in: D3 D2 D1 D0; Data out: Q3, Q2, Q1, Q0.
The decoder selects only one memory cell (word) at a time.
2n-Word  1-Bit RAM IC
 To build a RAM IC 4-to-16 Word select
from a RAM A3 3
0
A3 2 1
slice, we need: 2 RAM cell
A2 2 3
– Decoder decodes A2 2
4
the n address lines A1 A1 21
5
6 RAM clel
ton word select
2 A0 A0 2
0 7
8
– lines
A 3-state buffer 16
1
x Decoder
9
RAM 10
on the data 11
 As output.
memory arrays can be Data
input
Data
output
12
13
very 14
15
large ,We need large decoders Read/
RAM cell
 The decoder size and Write

Memory
bfanouts canabe reduced by R/w
n using coincident enable logic

yapproximately
in aselection
2-dimensional Data input Data in
(a) Symbol Data
Dataout
array.
– Uses two decoders, one for words and one for oRu/wt
output
Bit
bits Read/Write
select

– Word select becomes Row select Chip select


(b) Block diagram
– Bit select becomes Column select
COA-MODULE -IV-MEMORY 24
SYSTEM
Internal Organization of Memory Chips (Contd.)

[Fig. 16×8 memory organization: word lines W0–W15 are driven by an address decoder on A0–A3; each row of flip-flops (FF) holds one 8-bit word (bits b7 … b1 b0); sense/write circuits connect each bit column to the data input/output lines; CS (chip select) and R/W are the control inputs.]

 16 words of 8 bits each: the chip has 16 external connections: address 4, data 8, control 2, power/ground 2.
 1K memory cells organized as a 128×8 memory: external connections = 19 (7 + 8 + 2 + 2).
 The same 1K cells organized as 1K×1: external connections = 15 (10 + 1 + 2 + 2).
Cell Arrays and Coincident Selection (Continued)

[Fig. 16×1 memory organization: a 4×4 array of RAM cells (0–15), with a 2-to-4 row decoder on A3 A2 and a 2-to-4 column decoder (with enable driven by chip select) on A1 A0; read/write logic per column; bit select, data input, and data output lines.]

Example: for address 1001:
 Row select A3 A2 = 10 selects row 2.
 Column select A1 A0 = 01 selects column 1.
 Cell 9 is accessed.
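The row/column split in the example above is just a split of the address bits (function name ours, for illustration): the high-order bits feed the row decoder and the low-order bits feed the column decoder.

```python
def coincident_select(address, col_bits=2):
    """Split an address into row and column selects for a 2-D cell
    array (a 4x4 array here, as in the coincident-selection example)."""
    row = address >> col_bits              # high-order bits -> row decoder
    col = address & ((1 << col_bits) - 1)  # low-order bits -> column decoder
    cell = row * (1 << col_bits) + col     # cell number within the array
    return row, col, cell

print(coincident_select(0b1001))  # (2, 1, 9): row 2, column 1, cell 9
```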
A Memory Chip

[Fig. Organization of a 1K × 1 memory chip: a 10-bit address is split into a 5-bit row address, decoded to word lines W0–W31 of a 32 × 32 memory cell array, and a 5-bit column address driving a 32-to-1 output multiplexer and input demultiplexer. Sense/write circuitry, R/W, and CS control the single data input/output line.]
Static Memories
 These circuits are capable of retaining their state as long as power is applied.

[Fig. A static RAM cell: a latch with nodes X and Y, connected to bit lines b and b' through transistors T1 and T2, which are controlled by the word line.]
SRAM Cell
A static RAM cell (transistor latch):
 Two transistor inverters are cross-connected to implement a basic flip-flop (latch).
 The latch is connected to one word line and two bit lines by transistors T1 and T2.
 T1 and T2 act as switches that can be opened or closed under the control of the word line.
 When the word line is at ground level, the transistors are turned off and the latch retains its state.
Ex.: the cell is in state 1 if the logic value at X is 1 and at Y is 0.
 Reading the state of an SRAM cell:
The word line is activated to close switches T1 and T2.
If the cell is in state 1, the signal on bit line b is high and the signal on bit line b' is low. The opposite is true if the cell is in state 0. Thus b and b' are complements of each other.
Sense/write circuits at the end of the bit lines monitor the state of b and b' and set the output accordingly.
SRAM Cell (Contd.)
 Write operation:
The state of the cell is set by placing the appropriate values on bit lines b and b', and then activating the word line. This forces the cell into the corresponding state.
The required signals on the bit lines are generated by the sense/write circuit.
Static Memories: CMOS Cell
 CMOS cell: low power consumption.
 Transistor pairs (T3, T5) and (T4, T6) form inverters.
 In state 1, the voltage at X is maintained high by having T3 and T6 on, while T4 and T5 are off.
 Thus, if T1 and T2 are turned on (closed), bit lines b and b' will have high and low signals, respectively.
Advantage of using CMOS: low power consumption.
Advantage of using a static RAM cell: access time is short.
Asynchronous DRAMs vs SRAMs
 Static RAMs (SRAMs):
 Consist of circuits that are capable of retaining their state as long as power is applied.
 Volatile memories: their contents are lost when power is interrupted.
 Access times of static RAMs are in the range of a few nanoseconds (fast).
 However, the cost is usually high.
 Used for cache memory.
 Dynamic RAMs (DRAMs):
 Do not retain their state indefinitely.
 Contents must be periodically refreshed.
 Contents may be refreshed while accessing them for reading.
 Used for main memory.
Asynchronous DRAMs vs SRAMs (Contd.)
 Both static and dynamic RAMs are volatile; that is, they retain information only as long as the power supply is applied.
 A dynamic memory cell is simpler and smaller than a static memory cell.
Thus a DRAM is denser, i.e., its packing density is higher (more cells per unit area).
DRAM is less expensive than the corresponding SRAM.
 DRAM requires supporting refresh circuitry.
For larger memories, the fixed cost of the refresh circuitry is more than compensated for by the lower cost of DRAM cells.
 SRAM cells are generally faster than DRAM cells. Therefore, SRAM is used to construct faster memory modules (such as cache memory).
Asynchronous DRAMs
A dynamic memory cell consists of a capacitor and a transistor. To store information in this cell, transistor T is turned on and an appropriate voltage is applied to the bit line. This causes a known amount of charge to be stored in the capacitor.
After the transistor is turned off, the capacitor begins to discharge. Hence the information stored in the cell must be periodically refreshed.

[Fig. A single-transistor dynamic memory cell: transistor T connects capacitor C to the bit line under the control of the word line.]
Asynchronous DRAMs (Contd.)

[Fig. Internal organization of a 2M × 8 dynamic memory chip: a 4096 × (512 × 8) cell array, a row address latch and row decoder, a column address latch and column decoder, sense/write circuits, multiplexed address pins A20-9 / A8-0, data lines D7–D0, and control inputs RAS, CAS, R/W, and CS.]

 Each row can store 512 bytes. 12 bits select a row, and 9 bits select a group (byte) in a row: a total of 21 bits.
• First the row address is applied; the RAS signal latches the row address. Then the column address is applied; the CAS signal latches that address.
• Timing of the memory unit is controlled by a specialized unit which generates RAS and CAS.
• This is an asynchronous DRAM.
Fast Page Mode
◾ Suppose we want to access consecutive bytes in the selected row.
◾ This can be done without having to reselect the row:
 Add a latch at the output of the sense circuits in each column.
 All the latches are loaded when the row is selected.
 Different column addresses can then be applied to select and place different bytes on the data lines.
◾ A consecutive sequence of column addresses can be applied under control of the CAS signal, without reselecting the row.
 This allows a block of data to be transferred at a much faster rate than random accesses.
 A small collection/group of bytes is usually referred to as a block.
◾ This transfer capability is referred to as the fast page mode feature.
Synchronous DRAMs

[Fig. An SDRAM chip: a cell array with a row address latch and row decoder, a column address counter and column decoder, read/write circuits and latches, a refresh counter, data input and output registers, and a mode register and timing control driven by RAS, CAS, R/W, CS, and the clock.]

• Operation is directly synchronized with the processor clock signal.
• The outputs of the sense circuits are connected to a latch.
• During a Read operation, the contents of the cells in a row are loaded onto the latches.
• During a refresh operation, the contents of the cells are refreshed without changing the contents of the latches.
• Data held in the latches that correspond to the selected columns are transferred to the output.
• For a burst mode of operation, successive columns are selected using the column address counter and the clock. The CAS signal need not be generated externally. New data is placed on the data lines during the rising edge of the clock.
Latency and Bandwidth
 The speed and efficiency of data transfers among memory, processor, and disk have a large impact on the performance of a computer system.
 Memory latency: the amount of time it takes to transfer a word of data to or from the memory.
 Memory bandwidth: the number of bits or bytes that can be transferred in one second. It is used to measure how much time is needed to transfer an entire block of data.
 Bandwidth is not determined solely by the memory. It is the product of the rate at which data are transferred (and accessed) and the width of the data bus.
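The product rule for bandwidth stated above can be checked numerically. The figures here (transfer rate, bus width) are illustrative assumptions, not values from the slides:

```python
def bandwidth_bytes_per_s(transfer_rate_hz, bus_width_bits):
    """Bandwidth = transfer rate x data-bus width, expressed in bytes/s."""
    return transfer_rate_hz * bus_width_bits // 8

# e.g. 100 million transfers per second on a 64-bit data bus:
bw = bandwidth_bytes_per_s(100_000_000, 64)
print(bw)          # 800000000 bytes/s (800 MB/s)
# Time needed to transfer a 4 KB block at that bandwidth:
print(4096 / bw)   # 5.12e-06 seconds
```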
Latency, Bandwidth, and DDR SDRAMs
 Memory latency is the time it takes to transfer a word of data to or from memory.
 Memory bandwidth is the number of bits or bytes that can be transferred in one second.
 DDR SDRAMs:
 Transfer data on both edges of the clock (double data rate).
 The cell array is organized in two banks.
Constructing Wider Memory
 We can pair two of our 4-word × 4-bit chips to make a 4-word × 8-bit chip: use both in parallel.
Constructing Longer Memory
 We can combine chips to create an 8-word × 4-bit memory. The third address bit goes to a decoder to select only one of the two chips.
Structure of Large Memories: Static Memories

Implement a memory unit of 2M words of 32 bits each, using 512K × 8 static memory chips.

[Fig. Organization of a 2M × 32 memory: a 21-bit address (A0–A20) is split into a 19-bit internal chip address and 2 bits driving a 2-bit decoder; the array has 4 rows of 512K × 8 chips, each row supplying D31-24, D23-16, D15-8, and D7-0.]

 Each column consists of 4 chips; each chip implements one byte position.
 A chip is selected by setting its chip-select control line to 1. The selected chip places its data on the data output line; the outputs of the other chips are in the high-impedance state.
 21 bits are needed to address a 32-bit word. The high-order 2 bits select the row of chips, by activating the four chip-select signals; 19 bits are used to access specific byte locations inside the selected chips.
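The address split for this 2M × 32 example can be sketched directly (function name ours): the top 2 bits of the 21-bit word address pick one of the 4 chip rows, and the remaining 19 bits address a location inside the selected 512K chips.

```python
def select_chip_row(word_address):
    """For a 2M x 32 memory built from 512K x 8 chips: the high-order
    2 bits of the 21-bit word address drive the chip-select decoder,
    and the low 19 bits form the internal chip address."""
    row = word_address >> 19              # 2 bits -> one of 4 chip rows
    internal = word_address & (2**19 - 1) # 19-bit address within the chips
    return row, internal

print(select_chip_row(0))          # (0, 0)
print(select_chip_row(2**19))      # (1, 0): first word of the second row
print(select_chip_row(2**21 - 1))  # (3, 524287): last word, last row
```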
Structure of Large Memories: Dynamic Memories
◾ Large dynamic memory systems can be implemented using DRAM chips in a similar way to static memory systems.
◾ Placing large memory systems directly on the motherboard would occupy a large amount of space.
 Also, this arrangement is inflexible, since the memory system cannot be expanded easily.
◾ Packaging considerations have led to the development of larger memory units known as SIMMs (Single In-line Memory Modules) and DIMMs (Dual In-line Memory Modules).
◾ Memory modules are an assembly of memory chips on a small board that plugs vertically into a single socket on the motherboard.
 They occupy less space on the motherboard.
 They allow for easy expansion by replacement.
Memory Controller
◾ To reduce the number of pins, dynamic memory chips use multiplexed address inputs.
◾ The address is divided into two parts:
 The high-order address bits select a row in the array. They are provided first, and latched using the RAS signal.
 The low-order address bits select a column in the row. They are provided later, and latched using the CAS signal.
◾ However, a processor issues all address bits at the same time.
◾ To achieve the multiplexing, a memory controller circuit is inserted between the processor and the memory.
Memory Controller (Contd.)

[Fig. The processor sends the full address, R/W, Request, and clock to the memory controller; the controller drives the memory with the multiplexed row/column address, RAS, CAS, R/W, CS, and clock. Data passes directly between the processor and the memory.]
Read-Only Memories (ROMs)
Read-Only Memories (ROMs)
◾ SRAM and SDRAM chips are volatile:
 They lose their contents when the power is turned off.
◾ Many applications need memory devices that retain their contents after the power is turned off.
 For example, when a computer is turned on, the operating system must be loaded from the disk into the memory.
 The instructions that load the OS from the disk must be stored so that they are not lost when the power is turned off.
 We need to store these instructions in a non-volatile memory.
◾ Non-volatile memory is read in the same manner as volatile memory.
 A separate writing process is needed to place information in this memory.
 Normal operation involves only reading of data; this is why such memory is called read-only memory (ROM).
Read-Only Memory
 Volatile / non-volatile memory
 ROM
 PROM: programmable ROM

[Fig. A ROM cell: a transistor at point P connects the bit line to ground under control of the word line. Connected to store a 0; not connected to store a 1.]
Read-Only Memories (Contd.)
◾ Read-Only Memory (ROM):
 Data are written into a ROM when it is manufactured.
◾ Programmable Read-Only Memory (PROM):
 Allows the data to be loaded by the user.
 The process of inserting the data is irreversible.
 Storing information specific to a user in a factory-programmed ROM is expensive; providing programming capability to the user may be better.
◾ Erasable Programmable Read-Only Memory (EPROM):
 Allows stored data to be erased and new data to be loaded.
 This flexibility is useful during the development phase of digital systems.
 Erasable, reprogrammable ROM.
 Erasure requires exposing the ROM to UV light.
Read-Only Memories (Contd.)
◾ Electrically Erasable Programmable Read-Only Memory (EEPROM):
 To erase the contents of EPROMs, they have to be exposed to ultraviolet light and physically removed from the circuit.
 In EEPROMs, the contents can be stored and erased electrically.
◾ Flash memory:
 Similar in approach to EEPROM.
 One can read the contents of a single cell, but write only the contents of an entire block of cells.
 Flash devices have greater density: higher capacity and lower storage cost per bit.
 The power consumption of flash memory is very low, making it attractive for use in equipment that is battery-driven.
 Single flash chips are not sufficiently large, so larger memory modules are implemented using flash cards and flash drives.
Speed, Size, and Cost
◾ A big challenge in the design of a computer system is to provide a sufficiently large memory, with a reasonable speed, at an affordable cost.
◾ Static RAM:
 Very fast, but expensive, because a basic SRAM cell has a complex circuit, making it impossible to pack a large number of cells onto a single chip.
◾ Dynamic RAM:
 Simpler basic cell circuit, hence much less expensive, but significantly slower than SRAM.
◾ Magnetic disks:
 The storage capacity provided by DRAMs is higher than that of SRAMs, but is still less than what is necessary.
 Secondary storage such as magnetic disks provides a large amount of storage, but is much slower than DRAM.
Cache Memories
Cache Memories
◾ The processor is much faster than the main memory.
 As a result, the processor has to spend much of its time waiting while instructions and data are fetched from the main memory.
 This is a major obstacle to achieving good performance.
◾ The speed of the main memory cannot be increased beyond a certain point.
◾ Cache memory is an architectural arrangement which makes the main memory appear faster to the processor than it really is.
◾ Cache memory is based on a property of computer programs known as "locality of reference".
Locality of Reference
◾ Analysis of programs indicates that many instructions in localized areas of a program are executed repeatedly during some period of time, while the others are accessed relatively less frequently.
 These instructions may be the ones in a loop, a nested loop, or a few procedures calling each other repeatedly.
 This is called "locality of reference".
◾ Temporal locality of reference:
 A recently executed instruction is likely to be executed again very soon.
◾ Spatial locality of reference:
 Instructions with addresses close to a recently executed instruction are likely to be executed soon.
Cache Memories (Contd.)

[Fig. Processor ↔ Cache ↔ Main memory.]

• When the processor issues a Read request, a block of words is transferred from the main memory to the cache, one word at a time.
• Subsequent references to the data in this block of words are found in the cache.
• At any given time, only some blocks in the main memory are held in the cache. Which blocks of the main memory are in the cache is determined by a "mapping function".
• When the cache is full and a block of words needs to be transferred from the main memory, some block of words in the cache must be replaced. This is determined by a "replacement algorithm".
Cache Hit
• The existence of the cache is transparent to the processor. The processor issues Read and Write requests in the same manner.
• If the data is in the cache, it is called a Read or Write hit.
• Read hit:
 The data is obtained from the cache.
• Write hit:
 The cache has a replica of the contents of the main memory.
 The contents of the cache and the main memory may be updated simultaneously. This is the write-through protocol.
 Alternatively, update only the contents of the cache, and mark the block as updated by setting a bit known as the dirty bit or modified bit. The contents of the main memory are updated when this block is replaced. This is the write-back or copy-back protocol.
Cache miss

• If the data is not present in the cache, a Read miss or Write miss occurs.
• Read miss:
 The block of words containing the requested word is transferred from the memory.
 After the block is transferred, the desired word is forwarded to the processor.
 The desired word may also be forwarded to the processor as soon as it arrives, without waiting for the entire block to be transferred. This is called load-through or early-restart.
• Write miss:
 If the write-through protocol is used, the contents of the main memory are updated directly.
 If the write-back protocol is used, the block containing the addressed word is first brought into the cache. The desired word is then overwritten with the new information.
Cache Coherence Problem

 A bit called the "valid bit" is provided for each block.
 If the block contains valid data, this bit is set to 1; otherwise it is 0.
 Valid bits are all set to 0 when power is first turned on.
 When a block is loaded into the cache for the first time, its valid bit is set to 1.
 Data transfers between main memory and disk occur directly, bypassing the cache.
 When the data on the disk changes, the main memory block is also updated.
 However, if that data is also resident in the cache, then its valid bit is set to 0.
 What happens if data is transferred between the disk and the main memory while the write-back protocol is being used?
 In this case, the data in the cache may also have changed, which is indicated by the dirty bit.
 The copies of the data in the cache and the main memory are then different. This is called the cache coherence problem.
 One option is to force a write-back before the main memory is updated from the disk.
Cache Design

 Size
 Mapping Function
 Replacement Algorithm
 Write Policy
 Block Size
 Number of Caches
Tag Fields
• A cache line contains two fields:
– Data from RAM
– The address of the block currently in the cache.
• The part of the cache line that stores the address of the block is called the tag field.
• Many different addresses can map to any given cache line; the tag specifies which block's address is currently in the line. Only the upper address bits are needed.

Cache Lines
• The cache memory is divided into blocks or lines. Currently, lines range from 16 to 64 bytes.
• Data is copied to and from the cache one line at a time.
• The lower log2(line size) bits of an address specify a particular byte within a line.
Line Example

With a line size of 4, the offset is the low log2(4) = 2 bits; the lower 2 bits specify which byte within the line. The boxes below represent RAM addresses (upper bits | offset):

	01100101 00
	01100101 01
	01100101 10
	01100101 11
	01100110 00
	01100110 01
	01100110 10
	01100110 11
	01100111 00
	01100111 01
Mapping functions

◾ Mapping functions determine how memory blocks are placed in the cache.
◾ Three mapping functions:
 Direct mapping
 Associative mapping
 Set-associative mapping
Example

A system with a 512 x 12 cache and 32K x 12 main memory.
Direct Mapping

 Simplest mapping technique - each block of main memory maps to only one cache line, i.e. if a block is in the cache, it must be in one specific place.
 A main memory location can only be copied into one location in the cache. This is accomplished by dividing main memory into blocks that correspond in size with the cache.
 Formula to map a memory block to a cache line:
 i = j mod c
• i = cache line number (block address in the cache)
• j = main memory block number (block address in main memory)
• c = number of lines in the cache (number of cache blocks)
 i.e. we divide the memory block number by the number of cache lines and the remainder is the cache line address.
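The formula i = j mod c is a single line of code. As a sketch, the numbers below use a 128-line cache, where block 0 maps to line 0 and block 129 maps to line 1:

```python
def cache_line(j, c):
    """Direct mapping: main-memory block j can only occupy cache line j mod c."""
    return j % c

# With c = 128 cache lines, blocks 0, 128, 256, ... all contend for line 0.
```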
1. Direct mapping

The direct mapping technique is simple and inexpensive to implement.
When the CPU wants to access data from memory, it places an address. The index field of the CPU address is used to access the cache.
The tag field of the CPU address is compared with the tag stored in the word read from the cache.
If the tag bits of the CPU address match the tag bits in the cache, there is a hit and the required data word is read from the cache.
If there is no match, there is a miss and the required data word is read from main memory.
It is then transferred from main memory to cache memory along with the new tag.
Direct Mapping - with word transfers

 Direct Cache Addressing
 The lower log2(line size) bits define which word in the block.
 The next log2(number of lines) bits define which line of the cache.
 The remaining upper bits are the tag field.
Cache Constants
 cache size / line size = number of lines
 log2(line size) = bits for offset
 log2(number of lines) = bits for cache index
 remaining upper bits = tag address bits
Given problem:
 w = 4 words = 2^2, so the word field is 2 bits.
 Divide the remaining block-identifier bits (s = 22) into two fields, r and (s - r).
 One part equals the number of cache-line index bits, called the index field (r = 14).
 The other is (s - r) = 8 bits, called the tag field.
1. Direct mapping (figures: direct-mapped cache organization)
Direct Mapping Example

Cache: 4 words; main memory: 16 words; W = 1.
A 4-bit main-memory address is split into a 2-bit tag and a 2-bit index.
Size of a main-memory block: 4 words; number of main-memory blocks: 16/4 = 4.
Cache location 00 can be occupied by data coming from any memory address whose index bits are 00.

Cache entry example (address, V, tag, data): 00, 1, 01, ABCD

(Figure: the 16 main-memory words - data values such as SOLE, MARE, CASA, LUCA, ROMA, BARI, ANNA, COMO, SALE, PARI, PEPE, MANO, BIRO, DUCA, SARA - and the 4-word cache.)
How many bits are in the tag, line and offset fields?

Example direct address
 Assume you have 32-bit addresses (main memory can address 4 GB = 2^32)
 64-byte lines (2^6: offset is 6 bits)
 32 KB of cache
 Number of lines = 32 KB / 64 = 512 = 2^9
 Bits to specify which line = log2(512) = 9

	| Tag (17 bits) | Line (9 bits) | Offset (6 bits) |

Exercise: 24-bit addresses, 64 KB of cache, 16-byte cache lines. Which split is correct?
• tag=4, line=16, offset=4
• tag=4, line=14, offset=6
• tag=8, line=12, offset=4  ← correct: offset = log2(16) = 4, line = log2(64K/16) = 12, tag = 24 - 12 - 4 = 8
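Both worked examples follow directly from the three "cache constants". A small helper (a sketch, not tied to any particular machine) computes the split:

```python
import math

def field_bits(address_bits, cache_bytes, line_bytes):
    """Return (tag, line, offset) bit widths for a direct-mapped cache."""
    offset = int(math.log2(line_bytes))               # which byte in the line
    line = int(math.log2(cache_bytes // line_bytes))  # which line of the cache
    tag = address_bits - line - offset                # remaining upper bits
    return tag, line, offset
```

`field_bits(32, 32 * 1024, 64)` reproduces the (17, 9, 6) split of the first example, and `field_bits(24, 64 * 1024, 16)` confirms the (8, 12, 4) answer to the exercise.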
Direct Mapping - Address Structure

Main memory address: | Tag (8 bits) | Index/line (14 bits) | Word (2 bits) |

 24-bit address
 w = 2-bit word identifier (4-byte block)
 s = 22-bit block identifier, split into:
 8-bit tag (= 22 - 14)
 14-bit slot or line
 No two blocks that map to the same line have the same tag field.
 Check the contents of the cache by finding the line and comparing its tag.
Direct Mapping from Cache to Main Memory (figure)
Direct Mapping Cache Line Table

Cache line	Main memory blocks held
0		0, m, 2m, 3m, ..., 2^s - m
1		1, m+1, 2m+1, ..., 2^s - m + 1
...
m-1		m-1, 2m-1, 3m-1, ..., 2^s - 1

(where m = number of cache lines and 2^s = number of main memory blocks)
Direct Mapping pros & cons

 Simple
 Inexpensive
 Fixed location for a given block
 If a program repeatedly accesses 2 blocks that map to the same line, cache misses are very high
Direct mapping - example 2

◾ A simple processor example:
 Cache consisting of 128 blocks of 16 words each.
 Total size of the cache is 2048 (2K) words.
 Main memory is addressable by a 16-bit address.
 Main memory has 64K words.
 Main memory has 4K blocks of 16 words each.
Direct mapping

(Figure: main memory of 4096 blocks mapped onto a 128-block cache; address split as Tag 5 | Block 7 | Word 4.)

• Block j of the main memory maps to block j modulo 128 of the cache. Block 0 maps to 0, block 129 maps to 1.
• More than one memory block is mapped onto the same position in the cache.
• May lead to contention for cache blocks even if the cache is not full.
• Resolve the contention by allowing the new block to replace the old block, leading to a trivial replacement algorithm.
• The memory address is divided into three fields:
 - The low-order 4 bits determine one of the 16 words in a block.
 - When a new block is brought into the cache, the next 7 bits determine which cache block this new block is placed in.
 - The high-order 5 bits determine which of the possible 32 blocks that map to this position is currently present in the cache. These are the tag bits.
• Simple to implement, but not very flexible.
Mapping Function – 64K Cache Example

 Suppose we have the following configuration:
 Word size of 1 byte
 Cache of 64 KByte
 Cache line / block size is 4 bytes
• i.e. the cache has 64 KB / 4 bytes = 16,384 (2^14) lines of 4 bytes
 Main memory of 16 MBytes
• 24-bit address (2^24 = 16M)
• 16 MB / 4 bytes-per-block = 4M blocks of memory
 Somehow we have to map the 4M blocks in memory onto the 16K lines in the cache.
 Multiple memory blocks will have to map to the same line in the cache!
2. Fully Associative Mapping

Associative mapping uses an associative memory, which is accessed by its contents.
Each line of cache memory accommodates both the address (in main memory) and the contents of that address from the main memory.
That is why this memory is also called content addressable memory (CAM). It allows any block of main memory to be stored anywhere in the cache.
2. Fully Associative Mapping

 A fully associative mapping scheme can overcome the problems of the direct mapping scheme:
 A main memory block can load into any line of the cache.
 The memory address is interpreted as tag and word.
 The tag uniquely identifies a block of memory.
 Every line's tag is examined for a match.
 Each line also needs a Dirty and a Valid bit.
 But cache searching gets expensive!
 Ideally we need circuitry that can simultaneously examine all tags for a match.
 Lots of circuitry needed, high cost.
 Replacement policies are also needed, now that any block can be thrown out of the cache (we will look at this shortly).

Address: | Tag | Word |
2. Fully Associative Mapping (figure)
Associative Mapping from Cache to Main Memory (figure)
Associative mapping

(Figure: any of the 4096 main-memory blocks can be placed in any of the 128 cache blocks; address split as Tag 12 | Word 4.)

• A main memory block can be placed into any cache position.
• The memory address is divided into 2 fields:
 - The low-order 4 bits identify the word within a block.
 - The high-order 12 bits (the tag bits) identify a memory block when it is resident in the cache.
• Flexible, and uses cache space efficiently.
• Replacement algorithms can be used to replace an existing block in the cache when the cache is full.
• Cost is higher than a direct-mapped cache because of the need to search all 128 tag patterns to determine whether a given block is in the cache.
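The "search all tags" step can be sketched as a linear scan. Real hardware compares all tags in parallel; the (valid, tag, data) tuple layout here is just an illustration:

```python
def associative_lookup(cache, tag):
    """Fully associative lookup: a block may sit in any line, so every
    line's (valid, tag) pair must be examined for a match."""
    for valid, line_tag, data in cache:
        if valid and line_tag == tag:
            return data          # hit
    return None                  # miss: the block must be fetched from main memory
```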
3. Set Associative Mapping

Set-associative mapping combines the easy control of the direct-mapped cache with the more flexible mapping of the fully associative cache.

In set associative mapping, each cache location can have more than one pair of tag + data items.
That is, more than one pair of tag and data reside at the same location of cache memory. If one cache location holds two pairs of tag + data items, it is called 2-way set-associative mapping.

Address: | Tag | Set | Word |
2-way Set Associative Mapping (figure)

Set Associative (figure)
3. Set Associative Mapping

 A compromise between fully-associative and direct-mapped cache:
 The cache is divided into a number of sets.
 Each set contains a number of lines.
 A given block maps to any line in a specific set.
• Use direct mapping to determine which set in the cache a memory block belongs to.
• The memory block could then be in any line of that set.
 e.g. 2 lines per set:
• 2-way associative mapping
• A given block can be in either of the 2 lines of a specific set.
 e.g. K lines per set:
• K-way associative mapping
• A given block can be in one of the K lines of a specific set.
• Much easier to simultaneously search one set than all lines.
Set-Associative mapping

(Figure: cache of 128 blocks organized as 64 sets of two blocks each; 16-bit address split as Tag 6 | Set 6 | Word 4.)

Blocks of the cache are grouped into sets. The mapping function allows a block of the main memory to reside in any block of a specific set. Divide the cache into 64 sets, with two blocks per set.
Memory blocks 0, 64, 128, etc. map to set 0, and they can occupy either of the two positions within the set.
The memory address is divided into three fields:
 - The low-order 4 bits identify the word within a block.
 - A 6-bit field determines the set number.
 - The high-order 6 bits are compared to the tag fields of the two blocks in the set.
Set-associative mapping is a combination of direct and associative mapping.
The number of blocks per set is a design parameter.
 - One extreme is to have all the blocks in one set, requiring no set bits (fully associative mapping).
 - The other extreme is to have one block per set, which is the same as direct mapping.
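For a 64-set organization, the set number and the tag stored alongside a block follow directly from the block number. A sketch of the arithmetic only:

```python
def place(block_number, num_sets=64):
    """Return (set_number, tag) for a main-memory block in a
    set-associative cache with num_sets sets."""
    return block_number % num_sets, block_number // num_sets
```

Blocks 0, 64, and 128 all land in set 0 with tags 0, 1, and 2, so the two lines of set 0 can hold any two of them at once.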
Performance Considerations
Performance considerations

 A key design objective of a computer system is to achieve the best possible performance at the lowest possible cost.
 Price/performance ratio is a common measure of success.
 Performance of a processor depends on:
 How fast machine instructions can be brought into the processor for execution.
 How fast the instructions can be executed.
Interleaving

◾ Divides the memory system into a number of memory modules. Each module has its own address buffer register (ABR) and data buffer register (DBR).
◾ Arranges addressing so that successive words in the address space are placed in different modules.
◾ When requests for memory access involve consecutive addresses, the accesses will be to different modules.
◾ Since parallel access to these modules is possible, the average rate of fetching words from the main memory can be increased.
Methods of address layouts

(Figure: two module organizations; each module has its own ABR and DBR.)

High-order interleaving: address = | Module (k bits) | Address in module (m bits) |
◾ Consecutive words are placed in the same module.
◾ The high-order k bits of a memory address determine the module.
◾ The low-order m bits of the address determine the word within a module.
◾ When a block of words is transferred from main memory to cache, only one module is busy at a time.

Low-order interleaving: address = | Address in module (m bits) | Module (k bits) |
• The low-order k bits of the address determine the module, so consecutive addresses are located in consecutive modules.
• While transferring a block of data, several memory modules can be kept busy at the same time.
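The two layouts differ only in which end of the address selects the module. A sketch with 4 modules (the sizes are illustrative) shows why the low-order scheme spreads consecutive addresses:

```python
def high_order(address, words_per_module):
    """High-order interleaving: module number comes from the upper bits."""
    return divmod(address, words_per_module)   # (module, word within module)

def low_order(address, num_modules):
    """Low-order interleaving: module number comes from the lower bits."""
    word, module = divmod(address, num_modules)
    return module, word
```

Under `low_order` with 4 modules, addresses 0, 1, 2, 3 fall in modules 0, 1, 2, 3, so a block transfer keeps several modules busy; under `high_order` they all fall in module 0.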
Hit Rate and Miss Penalty

 Hit rate: the fraction of memory accesses satisfied by the cache.
 Miss penalty: the total time needed to service a miss.
 Hit rate can be improved by increasing the block size while keeping the cache size constant.
 Block sizes that are neither very small nor very large give the best results.
 Miss penalty can be reduced if the load-through approach is used when loading new blocks into the cache.
Caches on the processor chip

 In high-performance processors, 2 levels of caches are normally used.
 The average access time in a system with 2 levels of caches is

	T_ave = h1*c1 + (1 - h1)*h2*c2 + (1 - h1)*(1 - h2)*M

where h1 and h2 are the hit rates in the L1 and L2 caches, c1 and c2 are their access times, and M is the time to access the main memory.
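The formula can be evaluated directly. The numbers in this sketch (1-cycle L1, 10-cycle L2, 100-cycle memory, 90% hit rates) are illustrative, not from the slides; h2 is the hit rate in L2 for the accesses that already missed in L1:

```python
def t_ave(h1, h2, c1, c2, m):
    """T_ave = h1*c1 + (1-h1)*h2*c2 + (1-h1)*(1-h2)*M"""
    return h1 * c1 + (1 - h1) * h2 * c2 + (1 - h1) * (1 - h2) * m

# 90% of accesses hit L1 at 1 cycle, 90% of the rest hit L2 at 10 cycles,
# and the remaining 1% pay the 100-cycle memory access:
# 0.9*1 + 0.1*0.9*10 + 0.1*0.1*100 = 2.8 cycles on average.
```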
Writing into the cache

When memory write operations are performed, the CPU first writes into the cache memory. The modifications made by the CPU during a write operation, on the data saved in the cache, need to be written back to main memory or to auxiliary memory.
The two popular cache write policies are:

Write-through and Write-back
Other Performance Enhancements

◾ Write-through:
• Each write operation involves writing to the main memory.
• If the processor has to wait for the write operation to complete, it slows down the processor.
• The processor does not depend on the results of the write operation.
• A write buffer can be included for temporary storage of write requests.
• The processor places each write request into the buffer and continues execution.
• If a subsequent Read request references data which is still in the write buffer, then this data is taken from the write buffer.
◾ Write-back:
• The block is written back to the main memory when it is replaced due to a miss.
• If the processor waits for this write to complete before reading the new block, it is slowed down.
• A fast write buffer can hold the block to be written, so the new block can be read first.
• The dirty bit is set when we write to the cache; this indicates the cache is now inconsistent with main memory.
• The dirty bit for a cache slot is cleared when the update occurs.
Write-Back vs Write-Through

Main advantages:
• Write-back:
 The block can be written by the processor at the frequency at which the cache, and not the main memory, can accept it.
 Multiple writes to the same block require only a single write to the main memory.
• Write-through:
 Simpler to implement, but to be effective it requires a write buffer, so the processor does not wait for the lower level of the memory hierarchy (avoiding write stalls).
 Read misses are cheaper because they never require a write to the lower level of the memory hierarchy.
 Memory is always up to date.
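The "single write to main memory" advantage is easy to quantify. This sketch counts main-memory traffic for n successive processor writes to one cached block (ignoring the write buffer):

```python
def memory_writes(n, policy):
    """Main-memory writes caused by n processor writes to one cached block."""
    if policy == "write-through":
        return n                      # every write is propagated to memory immediately
    if policy == "write-back":
        return 1 if n > 0 else 0      # one write-back when the dirty block is evicted
    raise ValueError(policy)
```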
Other Performance Enhancements (contd.)

Prefetching
• Normally, new data are brought into the processor when they are first needed, and the processor has to wait until the data transfer is complete.
• Instead, prefetch the data into the cache before they are actually needed, or before a Read miss occurs.
• Prefetching can be accomplished through software by including a special prefetch instruction in the machine language of the processor.
 Inclusion of prefetch instructions increases the length of the programs.
• Prefetching can also be accomplished using hardware:
 Circuitry that attempts to discover patterns in memory references and then prefetches according to this pattern.
Other Performance Enhancements (contd.)

Lockup-Free Cache
• The prefetching scheme does not work if it stops other accesses to the cache until the prefetch is completed.
• A cache of this type is said to be "locked" while it services a miss.
• A cache structure which supports multiple outstanding misses is called a lockup-free cache.
• Since it can service more than one miss at a time, a lockup-free cache must include circuits that keep track of all the outstanding misses.
• Special registers may hold the necessary information about these misses.
Cache replacement policies
Cache algorithms (page replacement policies)

 Replacement algorithms are used when there is no available space in a cache in which to place new data. Four of the most common cache replacement algorithms are described below.

 First-In, First-Out (FIFO): Evict the page that has been in the cache the longest time.
 Least Recently Used (LRU): Evict the page whose last request occurred furthest in the past (the least recently used page).
 Least Frequently Used (LFU): Select for replacement the item that has been least frequently used by the CPU.
 Random: Choose a page at random to evict from the cache.
Random policy: the old block to evict is chosen at random.

FIFO policy: evict the block present longest.
Insert times: 8:00am, 7:48am, 9:05am, 7:10am, 7:30am, 10:10am, 8:45am → evict the block inserted at 7:10am.

LRU policy: evict the least recently used block.
Last used: 7:25am, 8:12am, 9:22am, 6:50am, 8:20am, 10:02am, 9:50am → evict the block last used at 6:50am.

(Figure: the Random, FIFO, and LRU page replacement policies.)
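FIFO and LRU differ only in whether a hit refreshes a block's position. A minimal list-based sketch (fine for illustration; real caches use hardware counters or stacks):

```python
def evictions(policy, capacity, refs):
    """Simulate FIFO or LRU on a reference string; return evicted items in order."""
    cache, out = [], []
    for p in refs:
        if p in cache:
            if policy == "LRU":       # a hit makes p the most recently used
                cache.remove(p)
                cache.append(p)
            continue                  # FIFO ignores hits entirely
        if len(cache) == capacity:
            out.append(cache.pop(0))  # front = oldest (FIFO) / least recently used (LRU)
        cache.append(p)
    return out
```

On the reference string 1, 2, 3, 1, 4 with capacity 3, FIFO evicts 1 (it entered first; the hit does not help it), while LRU evicts 2 (the hit on 1 made 2 the least recently used).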
Virtual Memory
Virtual memories

◾ Recall that an important challenge in the design of a computer system is to provide a large, fast memory system at an affordable cost.
◾ Architectural solutions exist to increase the effective speed and size of the memory system.
◾ Cache memories were developed to increase the effective speed of the memory system.
◾ Virtual memory is an architectural solution to increase the effective size of the memory system.
Virtual memories (contd..)

◾ Recall that the addressable memory space depends on the number of address bits in a computer.
 For example, if a computer issues 32-bit addresses,
the addressable memory space is 4G bytes.
◾ Physical main memory in a computer is generally not
as large as the
entire possible addressable space.
 Physical memory typically ranges from a few
hundred megabytes to 1G bytes.
◾ Large programs that cannot fit completely into the main
memory have their parts stored on secondary storage
devices such as magnetic disks.
 Pieces of programs must be transferred to the main
memory from secondary storage before they can be
executed.

Virtual memories (contd..)

◾ When a new piece of a program is to be transferred to the main memory, and the main memory is full, then some other piece in the main memory must be replaced.
 Recall this is very similar to what we studied in case of
cache
memories.
◾ Operating system automatically transfers data between the
main memory and secondary storage.
 Application programmer need not be concerned
with this transfer.
 Also, application programmer does not need to be aware
of the limitations imposed by the available physical
memory.

Virtual memories (contd..)

◾ Techniques that automatically move programs and data between main memory and secondary storage when they
are required for execution are called virtual-memory
techniques.
◾ Programs and processors reference an instruction or
data independent of the size of the main memory.
◾ Processor issues binary addresses for instructions
and data.
 These binary addresses are called logical or virtual
addresses.
◾ Virtual addresses are translated into physical addresses
by a combination of hardware and software subsystems.
 If virtual address refers to a part of the program that is
currently in the main memory, it is accessed immediately.
 If the address refers to a part of the program that is
not currently in the main memory, it is first transferred
to the main memory before it can be used.
Virtual memory organization

(Figure: Processor → virtual address → MMU → physical address → Cache → Main memory, with DMA transfers between main memory and disk storage.)

• The memory management unit (MMU) translates virtual addresses into physical addresses.
• If the desired data or instructions are in the main memory, they are fetched as described previously.
• If the desired data or instructions are not in the main memory, they must be transferred from secondary storage to the main memory.
• The MMU causes the operating system to bring the data from the secondary storage into the main memory.
Address translation

◾ Assume that program and data are composed of fixed-length units called pages.
◾ A page consists of a block of words that occupy
contiguous locations in the main memory.
◾ Page is a basic unit of information that is transferred
between secondary storage and main memory.
◾ Size of a page commonly ranges from 2K to 16K bytes.
 Pages should not be too small, because the access time of
a secondary storage device is much larger than the main
memory.
 Pages should not be too large, else a large portion of the
page may not be used, and it will occupy valuable space
in the main memory.

Address translation (contd..)

 Concepts of virtual memory are similar to the concepts of cache memory.
 Cache memory:
 Introduced to bridge the speed gap between the
processor and
the main memory.
 Implemented in hardware.
 Virtual memory:
 Introduced to bridge the speed gap between the main
memory and secondary storage.
 Implemented in part by software.

Address translation (contd..)

◾ Each virtual or logical address generated by a processor is interpreted as a virtual page number (high-order bits) plus
an offset (low-order bits) that specifies the location of a
particular byte within that page.
◾ Information about the main memory location of each page
is kept in the page table.
 Main memory address where the page is stored.
 Current status of the page.
◾ Area of the main memory that can hold a page is called
as page frame.
◾ Starting address of the page table is kept in a page table
base
register.

Address translation (contd..)

 The virtual page number generated by the processor is added to the contents of the page table base register.
 This provides the address of the corresponding entry in
the page table.
 The contents of this location in the page table give the
starting
address of the page if the page is currently in the main
memory.

Address translation (contd..)

(Figure: the virtual address from the processor is split into a virtual page number and an offset; the page table base register (PTBR) plus the virtual page number select the page table entry; the entry's page frame plus the offset form the physical address in main memory.)

 The PTBR holds the address of the page table.
 The virtual address is interpreted as a virtual page number and an offset.
 PTBR + virtual page number gives the address of the corresponding entry in the page table.
 This entry has the starting location of the page frame in main memory, together with control bits.
 Physical address in main memory = page frame + offset.
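The translation path in the figure - split the virtual address, index the page table, reattach the offset - can be sketched as follows. The 4K page size and dict-based page table are assumptions made only for illustration:

```python
PAGE_SIZE = 4096  # assumed page size (the slides say 2K-16K bytes is typical)

def translate(page_table, virtual_address):
    """Virtual-to-physical translation via a simple page table."""
    vpn, offset = divmod(virtual_address, PAGE_SIZE)  # virtual page number + offset
    entry = page_table.get(vpn)
    if entry is None or not entry["valid"]:
        raise LookupError("page fault")               # OS must load the page from disk
    return entry["frame"] * PAGE_SIZE + offset        # page frame + offset
```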
Address translation (contd..)

◾ Page table entry for a page also includes some control bits
which describe the status of the page while it is in the main
memory.
◾ One bit indicates the validity of the page.
 Indicates whether the page is actually loaded into the
main
memory.
 Allows the operating system to invalidate the page
without actually removing it.
◾ One bit indicates whether the page has been modified
during its residency in the main memory.
 This bit determines whether the page should be written
back to the disk when it is removed from the main memory.
 Similar to the dirty or modified bit in case of cache
memory.
Address translation (contd..)

 Other control bits exist for various other types of restrictions that may be imposed.
 For example, a program may only have read
permission for a page, but not write or modify
permissions.

Address translation (contd..)

◾ Where should the page table be located?


◾ Recall that the page table is used by the MMU for every read
and write access to the memory.
 Ideal location for the page table is within the MMU.
◾ Page table is quite large.
◾ MMU is implemented as part of the processor chip.
◾ Impossible to include a complete page table on the chip.
◾ Page table is kept in the main memory.
◾ A copy of a small portion of the page table can be
accommodated within the MMU.
 Portion consists of page table entries that correspond to
the
most recently accessed pages.

Address translation (contd..)

◾ A small cache called the Translation Lookaside Buffer (TLB) is included in the MMU.
 TLB holds page table entries of the most recently
accessed pages.
◾ Recall that cache memory holds most recently accessed
blocks
from the main memory.
 Operation of the TLB and page table in the main
memory is similar to the operation of the cache and
main memory.
◾ Page table entry for a page includes:
 Address of the page frame where the page resides in the
main memory.
 Some control bits.
◾ In addition to the above for each page, TLB must hold the
virtual page number for each page.

Address translation (contd..)

(Figure: an associative-mapped TLB. The virtual address is split into a virtual page number and an offset; each TLB entry holds a virtual page number, control bits, and the page frame in memory; on a match, the page frame plus the offset form the physical address.)

 The high-order bits of the virtual address generated by the processor select the virtual page.
 These bits are compared to the virtual page numbers in the TLB. If there is a match, a hit occurs and the corresponding address of the page frame is read.
 If there is no match, a miss occurs and the page table within the main memory must be consulted.
 Set-associative mapped TLBs are also found in commercial processors.
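The TLB's role as a cache of page table entries can be sketched in a few lines. The dict-based, unbounded TLB here is purely illustrative; a real TLB is a small fixed-size associative memory:

```python
def tlb_translate(tlb, page_table, vpn):
    """Return (frame, hit): check the TLB first, walk the page table on a miss."""
    if vpn in tlb:
        return tlb[vpn], True    # TLB hit: no memory access needed for the translation
    frame = page_table[vpn]      # TLB miss: consult the page table in main memory
    tlb[vpn] = frame             # install the entry for subsequent accesses
    return frame, False
```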
Address translation (contd..)

◾ How to keep the entries of the TLB coherent with the contents of the page table in the main memory?
◾ Operating system may change the contents of the page
table in the main memory.
 Simultaneously it must also invalidate the corresponding
entries in the TLB.
◾ A control bit is provided in the TLB to invalidate an entry.
◾ If an entry is invalidated, then the TLB gets the information
for that entry from the page table.
 Follows the same process that it would follow if the entry
is not
found in the TLB or if a “miss” occurs.

Address translation (contd..)

◾ What happens if a program generates an access to a page that is not in the main memory?
◾ In this case, a page fault is said to occur.
 Whole page must be brought into the main memory
from the disk, before the execution can proceed.
◾ Upon detecting a page fault by the MMU, following actions
occur:
 MMU asks the operating system to intervene by
raising an exception.
 Processing of the active task which caused the page
fault is interrupted.
 Control is transferred to the operating system.
 Operating system copies the requested page from
secondary
storage to the main memory.
 Once the page is copied, control is returned to the task
which was interrupted.

Address translation (contd..)

 Servicing of a page fault requires transferring the requested page from secondary storage to the main memory.
 This transfer may incur a long delay.
 While the page is being transferred the operating system
may:
 Suspend the execution of the task that caused the page
fault.
 Begin execution of another task whose pages are in the
main memory.
 Enables efficient use of the processor.

Address translation (contd..)

 How to ensure that the interrupted task can continue correctly when it resumes execution?
 There are two possibilities:
 Execution of the interrupted task must continue from
the point
where it was interrupted.
 The instruction must be restarted.
 Which specific option is followed depends on the design
of the processor.

Address translation (contd..)

◾ When a new page is to be brought into the main memory from secondary storage, the main memory may
be full.
 Some page from the main memory must be replaced
with this new page.
◾ How to choose which page to replace?
 This is similar to the replacement that occurs when the
cache is full.
 The principle of locality of reference (?) can also be
applied here.
 A replacement strategy similar to LRU can be applied.
◾ Since the main memory is much larger than the cache, a relatively large amount of programs and data can be held in the main memory.
 This minimizes the frequency of transfers between secondary storage and main memory.
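An LRU-style replacement policy of the kind mentioned above can be sketched with a fixed set of page frames (a minimal model, not a description of any particular MMU):

```python
from collections import OrderedDict

def reference(page, frames, capacity):
    """Touch a page; evict the least recently used page if frames are full."""
    if page in frames:
        frames.move_to_end(page)          # most recently used
        return False                      # no page fault
    if len(frames) == capacity:
        frames.popitem(last=False)        # evict least recently used
    frames[page] = True
    return True                           # page fault occurred

frames = OrderedDict()
faults = sum(reference(p, frames, 3) for p in [1, 2, 3, 1, 4, 5])
print(faults, list(frames))  # 5 faults; frames hold [1, 4, 5]
```

The reference to page 1 in the middle of the string saves it from eviction: pages 2 and 3, used less recently, are replaced instead.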
Address translation (contd..)
◾ A page may be modified during its residency in the main memory.
◾ When should the page be written back to the secondary
storage?
◾ Recall that we encountered a similar problem in the
context of cache and main memory:
 Write-through protocol
 Write-back protocol
◾ Write-through protocol cannot be used, since it will incur a
long delay each time a small amount of data is written to
the disk.
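The write-back alternative can be sketched with a dirty bit per resident page: stores only mark the page as modified, and the single transfer to disk happens at eviction time (names here are illustrative, not from the text):

```python
memory = {}               # page -> (data, dirty bit)
disk = {"P0": "old"}

def store(page, data):
    memory[page] = (data, True)           # mark page dirty; no disk write

def evict(page):
    data, dirty = memory.pop(page)
    if dirty:
        disk[page] = data                 # one transfer, at eviction only

store("P0", "new")
assert disk["P0"] == "old"                # disk not yet updated
evict("P0")
assert disk["P0"] == "new"                # written back once, on eviction
print(disk)
```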
Memory Management
Memory management
 The operating system is concerned with transferring programs and data between secondary storage and main memory.
 The operating system needs memory-management routines in addition to its other routines.
 Operating system routines are assembled into a virtual address space called the system space.
 The system space is separate from the space in which user application programs reside.
 This is the user space.
 The virtual address space is divided into one system space plus several user spaces.
Memory management (contd..)
◾ Recall that the Memory Management Unit (MMU) translates logical or virtual addresses into physical addresses.
◾ The MMU uses the contents of the page table base register to determine the address of the page table to be used in the translation.
 Changing the contents of the page table base register enables us to use a different page table, and thus switch from one space to another.
◾ At any given time, the page table base register can point to only one page table.
 Thus, only one page table can be used in the translation process at a given time.
 Pages belonging to only one space are accessible at any given time.
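The role of the page table base register can be illustrated with a toy model (names are illustrative): translation always goes through whichever table the register points at, so changing the register switches the visible address space.

```python
system_table = {0: 100}      # system-space page table: page 0 -> frame 100
user_table   = {0: 200}      # user-space page table:   page 0 -> frame 200

page_table_base = system_table            # register selects one table

def translate(virtual_page):
    # only pages of the currently selected space are reachable
    return page_table_base[virtual_page]

print(translate(0))          # 100: system space
page_table_base = user_table              # switch spaces ("context switch")
print(translate(0))          # 200: same virtual page, different space
```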
Memory management (contd..)
◾ When multiple, independent user programs coexist in the main memory, how do we ensure that one program does not modify or destroy the contents of another?
◾ The processor usually has two states of operation:
 Supervisor state.
 User state.
◾ Supervisor state:
 Operating system routines are executed.
◾ User state:
 User programs are executed.
 Certain privileged instructions cannot be executed in the user state.
 These privileged instructions include the ones that change the page table base register.
 This prevents one user from accessing the space of other users.
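A minimal sketch of this protection mechanism (purely illustrative names): the privileged operation traps when attempted in user state, and only the operating system, running in supervisor state, may perform it.

```python
class PrivilegeViolation(Exception):
    pass

def set_page_table_base(cpu, value):
    # privileged instruction: allowed only in supervisor state
    if cpu["state"] != "supervisor":
        raise PrivilegeViolation("privileged instruction in user state")
    cpu["ptbr"] = value

cpu = {"state": "user", "ptbr": 0}
try:
    set_page_table_base(cpu, 0x4000)      # user program attempts the switch
except PrivilegeViolation:
    cpu["state"] = "supervisor"           # trap: OS takes over
    set_page_table_base(cpu, 0x4000)      # now permitted
print(hex(cpu["ptbr"]))  # 0x4000
```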
Secondary Storage
Magnetic Hard Disks
 Disk
 Disk drive
 Disk controller
Organization of Data on a Disk
[Figure: one surface of a disk, with concentric tracks (track 0 to track n) divided into sectors, e.g. sector 0 of track 0, sector 0 of track 1, sector 3 of track n.]
Fig. Organization of one surface of a disk.
Access Data on a Disk
 Sector header
 Following the data, there is an error-correction code (ECC).
 Formatting process
 Difference between inner and outer tracks
 Access time = seek time + rotational delay (latency time)
 Data buffer/cache
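A worked example of the access-time formula above, using assumed (illustrative) drive parameters rather than figures from the slides:

```python
# Assumed drive parameters for illustration only
rpm = 7200                    # spindle speed
avg_seek_ms = 9.0             # average seek time
sectors_per_track = 400

rotation_ms = 60_000 / rpm                     # one revolution: ~8.33 ms
latency_ms = rotation_ms / 2                   # average rotational delay
transfer_ms = rotation_ms / sectors_per_track  # time for one sector to pass

access_ms = avg_seek_ms + latency_ms + transfer_ms
print(round(access_ms, 2))   # ~13.19 ms
```

Note that the seek time and rotational latency dominate: transferring the sector itself takes only a few hundredths of a millisecond, which is why reordering requests to reduce seeks pays off.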
Disk Controller
[Figure: processor and main memory attached to the system bus; a disk controller on the bus serves two disk drives.]
Fig. Disks connected to the system bus.
Disk Controller
 Seek
 Read
 Write
 Error checking
RAID Disk Arrays
 Redundant Array of Inexpensive Disks
 Using multiple inexpensive disks provides huge storage capacity at low cost, and also makes it possible to improve the reliability of the overall system.
 RAID0 – data striping
 RAID1 – identical copies of data on two disks
 RAID2, 3, 4 – increased reliability
 RAID5 – parity-based error recovery
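The parity-based recovery used in RAID5 can be sketched with XOR: the parity block is the bitwise XOR of the data blocks, so any single lost block equals the XOR of the survivors (block contents here are arbitrary examples).

```python
from functools import reduce

def xor_blocks(blocks):
    # bytewise XOR across equal-length blocks
    return bytes(reduce(lambda a, b: a ^ b, t) for t in zip(*blocks))

d0, d1, d2 = b"\x0f\x0f", b"\xf0\x00", b"\x33\x3c"
parity = xor_blocks([d0, d1, d2])

# suppose the disk holding d1 fails: rebuild it from survivors + parity
recovered = xor_blocks([d0, d2, parity])
assert recovered == d1
print(recovered.hex())  # f000
```

In RAID5 the parity blocks are additionally distributed across all disks, so no single "parity disk" becomes a write bottleneck.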
Optical Disks
[Figure: (a) cross-section of an optical disk showing pits and lands; the source's light is reflected back to the detector over flat regions, but not at pit edges; (b) each transition from pit to land (or land to pit) is read as a 1, with 0s in between; (c) example stored binary pattern: 0 1 0 0 1 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0.]
Fig. Optical disk.
Optical Disks
 CD-ROM
 CD-Recordable (CD-R)
 CD-ReWritable (CD-RW)
 DVD
 DVD-RAM
Magnetic Tape Systems
[Figure: data on magnetic tape is organized as records separated by record gaps; groups of records form files, delimited by file marks and file gaps; each frame across the tape width holds 7 or 9 bits.]
Fig. Organization of data on magnetic tape.