COMPUTER ARCHITECTURE
MEMORY HIERARCHY DESIGN
Falguni Sinhababu
Government College of Engineering and Leather Technology
HIERARCHICAL MEMORY TECHNOLOGY
Memory Hierarchy Design 9-Apr-24 2
Memory is one of the important subsystems of a
computer and largely determines overall performance.
Conceptual view of memory:
Array of storage locations, with each storage location having a
unique address
Each storage location can hold a fixed amount of information
(multiple of bits, which is the basic unit of data storage).
A memory system with M locations and N bits per
location is referred to as an M x N memory.
Both M and N are typically some power of 2.
Example: 1024 x 8, 65536 x 32, etc.
[Figure: a 1024 x 8 memory: 1024 locations, addresses 0000000000 through 1111111111, each location 8 bits wide (bits 0-7).]
Bit: a single binary digit (0 or 1)
Nibble: a collection of 4 bits
Byte: a collection of 8 bits
Word: "word size" refers to the number of bits
processed by a computer's CPU in one
go (these days, typically 32 bits or 64 bits).
Data bus size, instruction size, and address size are
usually multiples of the word size.
Memory is often byte organized
Every byte of the memory has a unique address
Multiple bytes of data can be accessed by an
instruction
Example: half-word (2 bytes), word (4 bytes), long word
(8 bytes)
For higher data transfer rate, memory is often
organized such that multiple bytes can be read or
written simultaneously.
Necessary to bridge the processor – memory speed
gap
n address lines: the maximum
number of memory locations that
can be accessed is 2^n.
m data lines: the number of bits
stored in every addressable
location is m.
The RD/WR' control line selects
the memory for reading or
writing (1: read, 0: write).
The chip select line (CS'), when
active (= 0), enables the chip;
otherwise, the data bus is in a
high-impedance state.
A computer has 64 MB (megabytes) of byte-addressable
memory. How many bits are needed in the memory
address?
Address space = 64 MB = 2^6 x 2^20 B = 2^26 B
Since the memory is byte addressable, we need 26 bits of address.
A computer has 1 GB of memory. Each word in this
computer is 32 bits. How many bits are needed to address any
single word in memory?
Address space = 1 GB = 2^30 B
1 word = 32 bits = 4 B
We have 2^30 / 2^2 = 2^28 words
Thus, we require 28 bits to address each word.
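The two worked examples above can be checked with a short Python sketch (the function name and signature are mine, not from the slides):

```python
import math

def address_bits(capacity_bytes, unit_bytes=1):
    # Number of address bits needed when memory is addressed
    # in units of unit_bytes (1 = byte addressable).
    locations = capacity_bytes // unit_bytes
    return int(math.log2(locations))

print(address_bits(64 * 2**20))            # 64 MB, byte addressable -> 26
print(address_bits(2**30, unit_bytes=4))   # 1 GB, 4-byte words -> 28
```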
Many data items require
multiple bytes for storage.
Different computers use
different data ordering
conventions.
Low-order-byte first (Little Endian)
High-order bytes first (Big Endian)
[Table: typical data sizes.]
Example: the 16-bit number 11001100 10101010 can be
stored as either 11001100 10101010 (high-order byte first)
or 10101010 11001100 (low-order byte first).
Little Endian
The least significant byte is stored at lower address
followed by the most significant byte.
Examples: Intel Processors, DEC Alpha, etc.
Same concept followed for arbitrary multi-byte data.
Big Endian
The most significant byte is stored at lower address
followed by the least significant byte
Examples: IBM’s 370 mainframes, Motorola
Microprocessors, TCP/IP, etc.
Same concept followed for arbitrary multi-byte data.
Represent the following 32-bit number in both Little-
Endian and Big-Endian in memory from address 2000
onwards
01010101 00110011 00001111 11000011
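One way to check your answer is with Python's struct module, which packs the same 32-bit value in either byte order (the addresses 2000-2003 are just the slide's starting address):

```python
import struct

value = 0b01010101_00110011_00001111_11000011   # the 32-bit number above

little = struct.pack('<I', value)   # least significant byte at lowest address
big    = struct.pack('>I', value)   # most significant byte at lowest address

for offset in range(4):
    print(f"address {2000 + offset}: "
          f"little={little[offset]:08b}  big={big[offset]:08b}")
```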
Program instructions and data are stored in memory.
In the Von Neumann architecture, they are stored in the same
memory.
In the Harvard architecture, they are stored in separate memories.
Executing a program requires two basic
operations:
Load: The content of a specific memory location is read into a
processor register. LOAD R1, 2000
Store: The contents of processor register is written into a
specific memory location. STORE 2020, R3
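The two operations can be sketched with a toy register/memory model (the dictionaries and the stored values are illustrative, not from the slides):

```python
# Toy model: memory and registers as dictionaries.
memory = {2000: 42, 2020: 0}
registers = {'R1': 0, 'R3': 7}

# LOAD R1, 2000 -- read memory location 2000 into register R1
registers['R1'] = memory[2000]

# STORE 2020, R3 -- write register R3 into memory location 2020
memory[2020] = registers['R3']

print(registers['R1'], memory[2020])   # 42 7
```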
Static RAM:
Very fast but expensive memory technology (requires 6 transistors/bit)
Packaging density is limited
Dynamic RAM:
Significantly slower than SRAM, but much less expensive (1 transistor/bit).
Requires periodic refreshing.
Flash memory:
Non-volatile memory technology that uses floating-gate MOS transistors.
Slower than DRAM, but higher packaging density and lower cost per bit.
Magnetic Disk:
Provides a large amount of storage at very low cost per bit.
Much slower than DRAM, and also slower than flash memory.
Requires mechanical moving parts, and uses magnetic recording technology.
Location
On chip - registers.
On board – main memory.
External – secondary/backup memory.
Capacity – number of bytes/words
Access Method:
Sequential access: serial access; access time depends on the location of the data
Random access: access time is the same for every location, irrespective of its position
Direct access: individual blocks or records have a unique address based on physical
location. Access is accomplished by jumping directly to a general vicinity plus sequential
searching, counting, or waiting to reach the final destination. This method is a combination
of the above two access methods.
Performance:
Access Time: the time from when the address is presented to the memory until the
data is available
Transfer Rate = 1 / (access time)
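A quick numerical check of the rate formula (the 10 ns access time is an assumed, illustrative value):

```python
access_time = 10e-9                 # assumed: 10 ns per access
transfer_rate = 1 / access_time     # accesses per second
print(transfer_rate)                # on the order of 1e8 accesses/s
```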
Physical Type:
Semiconductor
Magnetic
Optical
Physical Characteristics:
Volatile – SRAM, DRAM, Z-RAM, TTRAM, A-RAM, ETA RAM
non-volatile – ROM, FeRAM, CBRAM, NRAM, Millipede
Organization (for semiconductor memory)
bit organized (normally for DRAM)
Byte organized (normally for SRAM)
Nibble organized (normally for SRAM)
[Figure: storage capacity (in Mb) versus access time (in seconds) for different types of memory.]
Observation: the larger the
capacity, the slower the
device
Observation: the higher the
capacity, the lower the cost per bit
Cost has been decreasing over
the years
[Figure: processor versus memory performance over time.]
Observation: the processor-memory speed gap is increasing
Memory Latency Reduction Techniques:
Faster DRAM cells (depends on VLSI Technology)
Wider memory bus width (fewer memory accesses
needed)
Multiple memory banks
Integration of memory controller with processor
New emerging RAM technologies
Memory latency hiding techniques
Memory hierarchy (using SRAM-based cache memories)
Prefetching instructions and/or data from memory before
they are actually needed (used to hide long memory
access latency)
Basic Objectives:
Fast: approaching the speed of the
fastest memory available
Large: approaching the size of the
largest memory available
Optimum cost: close to the cost of
the cheapest memory available
[Figure: Level-0: registers; Level-1: cache memory; Level-2: main memory; Level-3: secondary memory.]
Five Parameters (comparing the (i-1)th level, closer to the CPU, with the ith level):
Access time (t_i): t_{i-1} < t_i
Cost per byte (c_i): c_{i-1} > c_i
Memory size (s_i): s_{i-1} < s_i
Transfer bandwidth (b_i): b_{i-1} > b_i
Unit of transfer (x_i): x_{i-1} < x_i
As we go from lower to higher levels:
Access time increases
Cost per byte decreases
Capacity increases
Frequency of access decreases
Typical hierarchy (starting with closest to the
processor):
Processor Registers
Level-1 Cache (typically divided into separate instruction and
data cache)
Level-2 Cache
Level-3 Cache
Main Memory
Secondary Memory (magnetic disk/ flash drive)
As we move away from the processor:
Size increases
Cost per bit decreases
Speed decreases
[Figure: the memory pyramid: size increases toward the bottom, while cost per bit and speed increase toward the top.]
Three Properties:
Inclusion Property: all
information is originally
stored in the outermost
level Mn. During
processing, subsets of
Mn are copied into Mn-1;
similarly, subsets of Mn-1
are copied into Mn-2, and so
on.
Thus M1 ⊆ M2 ⊆ M3 ⊆ … ⊆ Mn
Coherence
Property: It states
that copies of the
same information
item at the
successive memory
levels must be
consistent.
Locality of reference:
Hennessy and Patterson's (1990) 90-10 rule states that a typical program may spend
90% of its execution time on only 10% of the code.
For instance, a program may spend most of its time in a nested innermost loop.
So the memory addresses generated by the CPU are actually clustered in certain
regions in time, space, and ordering.
The three types of locality of reference are:
i. Temporal locality: accesses tend to cluster in recently used areas.
E.g., once a loop is entered or a subroutine is called, some part of the program is referenced
again and again.
ii. Spatial locality: accesses tend to cluster around items whose addresses are near one another.
E.g., operations on arrays; a[0] and a[1] are nearby, showing spatial locality.
iii. Sequential locality: accesses tend to follow sequential order.
E.g., executing instructions in straight-line order, or traversing a large data array.
Recently executed instructions are likely to be executed
again very soon
Example: computing factorial of a number
The four instructions in the loop are executed more
frequently than the others.
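A minimal sketch of the factorial example: the few statements inside the loop are fetched and executed over and over, which is exactly the temporal locality a cache exploits.

```python
def factorial(n):
    result = 1
    for i in range(2, n + 1):   # this small loop body is reused n-1 times
        result *= i
    return result

print(factorial(5))   # 120
```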
Spatial Locality: instructions residing close to a recently executed instruction are
likely to be executed soon.
Sequential Locality: instructions executed in sequential order, or sequential traversal
of an array, show sequential locality of reference.
Example: accessing elements of an array
Performance can be improved by copying the array into cache memory
Cost:
Consider a 2-level hierarchy consisting of two levels of memory, M1 and M2.
Let ci denote the cost per bit of memory Mi and Si denote the storage capacity in bits
of Mi.
The average cost per bit of the memory hierarchy is given by:
c = (c1*S1 + c2*S2) / (S1 + S2)
In order to have c ≈ c2, we must ensure S1 << S2.
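A quick numerical check of the cost formula (the costs and sizes below are assumed, purely illustrative):

```python
def avg_cost_per_bit(c1, s1, c2, s2):
    # c = (c1*S1 + c2*S2) / (S1 + S2)
    return (c1 * s1 + c2 * s2) / (s1 + s2)

# Assume M1 costs 100x more per bit than M2 but is 1/1024 of its size.
c = avg_cost_per_bit(c1=100e-6, s1=2**20, c2=1e-6, s2=1024 * 2**20)
print(c)   # close to c2 = 1e-6, because S1 << S2
```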
Hit Ratio/Hit Rate:
The hit ratio H is defined as the probability that a logical address
generated by the CPU refers to information stored in M1, i.e., the
fraction of accesses for which the data is found in M1.
We can determine H experimentally as follows:
A set of representative programs is executed or simulated.
The numbers of references to M1 and M2, denoted by N1 and N2 respectively,
are recorded. Then the hit ratio is
H = N1 / (N1 + N2)
The quantity (1 - H) is called the miss ratio.
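Computing H from measured reference counts (the counts below are made up for illustration):

```python
def hit_ratio(n1, n2):
    # H = N1 / (N1 + N2)
    return n1 / (n1 + n2)

h = hit_ratio(950, 50)
print(h, 1 - h)   # hit ratio 0.95, miss ratio 0.05
```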
Let tA1 and tA2 denote the access times of M1 and M2
respectively, relative to the CPU.
The average access time required by the CPU to access
a word in memory can be expressed as:
tA = H*tA1 + (1 - H)*tMISS (in the simplest case, tMISS = tA2)
where tMISS denotes the time required to handle a
miss, called the miss penalty.
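Evaluating the average access time formula with assumed numbers (1 ns for M1, 100 ns miss penalty, 95% hit ratio):

```python
def avg_access_time(h, t_a1, t_miss):
    # tA = H*tA1 + (1 - H)*tMISS
    return h * t_a1 + (1 - h) * t_miss

t_a = avg_access_time(h=0.95, t_a1=1.0, t_miss=100.0)   # times in ns
print(t_a)   # ~5.95 ns
```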
The miss penalty tMISS can be estimated in the following ways:
a) The simplest approach is to set tMISS = tA2; that is, when there is a
miss the data is accessed directly from M2.
b) A request for a word not in M1 typically causes a block containing
the requested word to be transferred from M2 to M1. After
completion of the block transfer, the word can be accessed in M1.
If tB denotes the block transfer time, we can write
tMISS = tB + tA1 [since tB >> tA1, effectively tA2 = tB]
Thus tA = H*tA1 + (1 - H)*(tB + tA1)
c) If tHIT denotes the time required to check whether there is a hit, we
can write
tMISS = tHIT + tB + tA1
Let r = tA2/tA1 denote the access time ratio of the two
levels of memory.
We define the access efficiency as e = tA1/tA, which
is the factor by which tA (the average access time)
differs from its minimum possible value.
e = tA1 / (H*tA1 + (1 - H)*tA2) = 1 / (H + (1 - H)*r)
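The efficiency formula, checked numerically (H = 0.95 and r = 100 are assumed values):

```python
def access_efficiency(h, r):
    # e = 1 / (H + (1 - H)*r), where r = tA2/tA1
    return 1 / (h + (1 - h) * r)

print(access_efficiency(h=0.95, r=100))   # ~0.168: tA is about 6x tA1
```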
The speedup gained by using the memory hierarchy is
defined as S = tA2/tA.
We can write:
S = tA2 / (H*tA1 + (1 - H)*tA2) = 1 / (H/r + (1 - H))
The same result follows from Amdahl's law.
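The speedup formula with assumed values H = 0.95 and r = 100:

```python
def speedup(h, r):
    # S = 1 / (H/r + (1 - H)), where r = tA2/tA1
    return 1 / (h / r + (1 - h))

print(speedup(h=0.95, r=100))   # ~16.8x faster than always accessing M2
```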
Block: the smallest unit of information transferred between two
levels.
Hit Rate: the fraction of memory accesses found in the lower
level.
Hit Time: time to access the lower level
= lower level access time + time to determine hit or miss
Miss: the data item needs to be retrieved from a block in the upper
level.
Miss Rate: the fraction of memory accesses not found in the lower
level.
Miss Penalty: the overhead incurred whenever a miss occurs
= time to replace a block in the lower level + time to transfer the missed
block.