Memory Organization and Cache Mapping Study 13

The document discusses memory organization in computer systems, focusing on the types and characteristics of memory, including DRAM and SRAM, and the importance of memory speed for CPU performance. It explains the memory hierarchy, cache operations, and the principles of locality that enhance performance while managing costs. Various mapping techniques for cache memory, such as fully associative, direct-mapped, and set-associative, are also detailed to illustrate how memory blocks are managed within cache lines.


Memory Organization

for better performance and cost effective systems

Computer Organization and Architecture, William Stallings


The Big Picture: Where are We Now?
• The Five Classic Components of a Computer
• Memory is usually implemented as:
• Dynamic Random Access Memory (DRAM) - for main memory
• Static Random Access Memory (SRAM) - for cache

[Figure: the five classic components: Processor (Control and Datapath), Memory, Input, Output.]
Characteristics of memory systems
• Location
• CPU: registers
• Internal (on board): cache and RAM
• External: HDD
• Capacity
• Word size and number of words (i.e. number of memory locations)
• Access method
• Sequential: tape
• Random: RAM or ROM
• Direct or semi-random: HDD
Characteristics of memory systems
• Physical type
• Semiconductor: RAM, ROM
• Magnetic: HDD, FDD
• Optical: CDs, DVDs
• Physical characteristics
• Volatile and non-volatile
• Erasable and non-erasable
Why should we bother about Memory speed?
Processor-DRAM Memory Gap (latency)

[Figure: performance (log scale) vs. time, 1980 to 2000. Processor performance ("Moore's Law") grows at about 60%/yr (2X every 1.5 years), while DRAM performance grows at about 9%/yr (2X every 10 years), so the processor-memory performance gap grows about 50% per year.]
Characteristics of memory systems
• Performance
• Access time
• For RAM access time is the time between presenting an
address to memory and getting the data on the bus
• Cycle time
• Primarily applied to RAM; access time + additional time before
a second access can start
— memory system needs to recover for the next access
— Function of memory components and system bus, not the processor
Semiconductor RAM
• Static RAM
• Flip-flop based – data is in flip-flops
• Fast
• Bulky
• expensive
• Dynamic RAM
• Capacitor based - data is in form of charge on the capacitor
• Needs refreshing of data
• Compact
• Cheap
• Slower
Memory Hierarchy - Diagram

[Figure: memory hierarchy pyramid; both speed and cost per bit increase toward the top.]
Hierarchy List
• Registers – semiconductor memory on the CPU; fastest
• Cache – static RAM placed close to the CPU
• Main memory – dynamic RAM placed on the board
• Disk – for bulk data
• Optical – for bulk data
• Tape – for bulk data
(Moving down the list, speed decreases and so does cost per bit.)
The Bottom Line
• Performance of CPU depends on access time of memory.
• How do we build a system that gives the best performance and, at the
same time, low cost?
• How much?
• Capacity KB , MB, GB, TB ?
• How fast?
• Access / Transfer Rate
• How expensive?
• $$$$$
fast and cost effective system
• Is it possible to build a computer which uses only static RAM (a large
capacity of fast memory)?
• This would be a very fast computer
• but also a very costly one

• To reduce cost, can we use the configuration below instead?


• Kilobytes of cache (SRAM)
• Megabytes of DRAM
• Gigabytes of secondary memory
• This will definitely cost less
• but will it give the best performance?
Locality of reference
• We know that code should be loaded in main memory for it to execute
• One good observation is that during the course of the execution of a
program, memory references tend to cluster
• e.g. programs - loops, nesting, …
data – strings, lists, arrays, …

• Therefore, the answer to the previous question on performance is YES!


• WE GET THE BEST PERFORMANCE AT LESS PRICE
• It is all possible due to 'locality of reference'
Principle of locality
• Temporal locality (locality in time): if an item is referenced, it will tend to
be referenced again soon.

• Spatial locality (locality in space): if an item is referenced, items whose


addresses are close by will tend to be referenced soon.

a for-loop exhibits temporal locality

an array traversal exhibits spatial locality


The Principle of Locality of reference
• How can we use very fast memory effectively?
• Temporal Locality (close in time)
• Memory that has been accessed recently is most likely to be accessed
again soon
• Spatial Locality (close in location)
• Memory that is close to memory that has been accessed recently is most
likely to be accessed soon
• Organize memory in blocks
• Keep blocks likely to be used soon in very fast memory
• Keep the next most likely blocks in medium fast memory
• Keep those not likely to be used soon in slower memory
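To make the two kinds of locality concrete, here is a small illustrative sketch (Python itself hides the hardware cache, but the access pattern is what matters): a row-by-row traversal touches neighbouring elements, and the loop bookkeeping is reused on every iteration.

```python
# Illustrative sketch of locality; the traversal-order idea carries over
# to any language, though Python hides the actual cache behaviour.
N = 4
matrix = [[r * N + c for c in range(N)] for r in range(N)]

# Row-major traversal: consecutive accesses touch neighbouring elements
# (spatial locality); 'total' and the loop variables are reused on every
# iteration (temporal locality).
total = 0
for row in matrix:
    for value in row:
        total += value

print(total)  # 120
```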
Cache memory
• Small amount of fast memory
• Sits between main memory (RAM) and CPU
• May be located on CPU chip or in system
• Objective is to make slower memory system look like fast memory.
Cache memory

[Figure: registers ↔ cache ↔ RAM. The processor exchanges bytes or words with the cache; the cache exchanges lines with main memory. Typical access times: cache t_c ≈ 1 or 2 cycles, main memory t_m ≈ 50 to 100 cycles.]
Memory Hierarchy: The Big Picture
[Figure: data movement in a memory hierarchy: registers ↔ cache ↔ main memory ↔ virtual memory. Words are transferred between registers and cache explicitly via load/store; lines are transferred between cache and main memory automatically upon a cache miss; pages are transferred between main memory and virtual memory automatically upon a page fault.]
Cache operation – overview
• CPU requests contents of memory location
• Check if this requested data is in cache
• If found in cache, it's a hit
• Else, it's a miss
• Cache "misses" are unavoidable, i.e., every piece of data and code must be loaded at
least once
• What does a processor do during a miss?
• It waits for the data to be loaded from RAM to cache.
• Loading is done in two possible ways:
1. read required block from main memory to cache then deliver from cache to CPU (cache
physically between CPU and bus)
2. read required block from main memory to cache and simultaneously deliver to CPU (CPU
and cache both receive data from the same data bus buffer)
Cache Operation
Few terms
• Cache hit rate, h
• fraction of accesses for which the CPU found the required data in cache
• expressed as a percentage
• Cache miss rate, (1 − h)
• fraction of accesses for which the CPU did not find the required data in cache
• expressed as a percentage; typical numbers are 3%, 10%, etc.
• Average access time: t̄ = h·t_c + (1 − h)·(t_m + t_c)
• Cache efficiency: Λ = t_c / t̄
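As a quick sketch, the two formulas above can be evaluated directly; the timings and hit rate below are assumed, illustrative numbers, not figures from the text.

```python
# Average access time and cache efficiency (illustrative numbers assumed).
t_c = 2      # cache access time in cycles (assumed)
t_m = 60     # main memory access time in cycles (assumed)
h = 0.95     # hit rate (assumed)

t_avg = h * t_c + (1 - h) * (t_m + t_c)  # average access time, t-bar
efficiency = t_c / t_avg                 # cache efficiency, Lambda

print(round(t_avg, 2))       # 5.0
print(round(efficiency, 2))  # 0.4
```

Note how even a 5% miss rate more than doubles the average access time relative to t_c; this is why high hit rates are essential.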
Cache Structure
• Cache is organized as ‘lines’ of k number of consecutive memory
locations
• Memory is divided into ‘blocks’ of k number of consecutive memory
locations
• 'tag' bits for each line of cache;
• each tag identifies which memory block currently occupies that line
Cache Structure
[Figure: the cache is organized as C lines (numbered 0 to C−1), each holding a tag and a data field of block length K words; main memory of 2^n addresses is divided into blocks of K words each.]
Cache Structure
[Figure: the same cache of C lines shown alongside main memory divided into blocks of K words.]

How are these lines and blocks linked and tracked?

Block Placement in cache line - Mapping
Where can a block be placed in cache?
We have 3 options as below.
1. Anywhere in cache - fully associative (or associative mapped)
2. In one predetermined place - direct-mapped
3. In a limited set of places - set-associative mapped
• Hybrid of direct mapped and fully associative
Block Placement in cache line - Mapping
• Where can block be placed in cache?
1. Anywhere in cache - fully associative
• Compare tag to every block in cache
2. In one predetermined place - direct-mapped
• Use part of address to calculate block location in cache
• Compare cache block with tag to check if block present
3. In a limited set of places - set-associative
• Hybrid of direct mapped and fully associative
• Use portion of address to calculate set (like direct-mapped)
• Place in any block in the set
• Compare tag to every block in set
Fully-associative Mapping
• A main memory block can load into any line of cache
• Every line's tag must be examined for a match
• Cache searching gets expensive and slower
• Memory address is interpreted as:
• Least significant w bits = word position within block
• Most significant s bits = tag used to identify which block is stored in a
particular line of cache

Tag – s bits Word – w bits


Fully Associative Block Placement
[Figure: with fully associative mapping, any memory block may be placed in any cache line; each block holds 4 bytes (32 bits). Memory block addresses shown: 00, 04, 08, 0C, 10, 14, 18, 1C, 20, 24, 28, 2C, 30, 34, 38, 3C, 40, 44, 48, 4C.]
Associative Mapping Address Structure Example

Tag – s bits (22 in example) | Word – w bits (2 in example)

• Let the address bits be 24 bits (address space?)


• 22 bit tag stored with each 32 bit block of data
• How many blocks are there in this example?
• Compare tag field with tag entry in cache to check for hit
• Least significant 2 bits of address identify which of the four 8 bit
words is required from 32 bit data block
Fully Associative Cache Organization
Associative Mapping Summary

• Address length = (s + w) bits


• Number of addressable locations = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
• Number of lines in cache = undetermined (can be of any capacity)
• Size of tag = s bits
• A low cache miss rate is expected
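The tag/word split summarized above amounts to a shift and a mask; the helper name and sample address below are illustrative, not from the text.

```python
# Split an address into (tag, word) for fully associative mapping,
# with w low-order word bits (sketch; names are illustrative).
def split_fully_associative(addr, w_bits=2):
    word = addr & ((1 << w_bits) - 1)  # least significant w bits
    tag = addr >> w_bits               # remaining s bits form the tag
    return tag, word

tag, word = split_fully_associative(0x16339C)
print(hex(tag), word)  # 0x58ce7 0
```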
Direct Mapping Address Structure
Each main memory address (s + w bits) can be divided into three fields
• Least significant w bits identify a unique word within a block
• Remaining s bits specify which block in memory; these are divided into two
fields
• Least significant r bits of these s bits identify which line in the cache
• Most significant s − r bits form the tag, which identifies which block
currently occupies that line of the cache

Tag: s − r bits | Line (row in cache): r bits | Word (offset into block): w bits
Direct Mapping Cache Line Table

Number of lines = m
Number of blocks in main memory = 2^s

(s is the number of bits required to identify a block in main memory.)

Cache line | Main memory blocks held
0          | 0, m, 2m, 3m, …, 2^s − m
1          | 1, m+1, 2m+1, …, 2^s − m + 1
…          | …
m−1        | m−1, 2m−1, 3m−1, …, 2^s − 1

Direct Mapping
• Each block of main memory maps to only one cache line – i.e. if a
block is in cache, it will always be found in the same place.
• Line number is calculated using the following function
i = j modulo m
where
i = cache line number
j = main memory block number
m = number of lines in the cache
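The mapping function above is a single modulo operation; the sketch below restates it, taking m = 8 lines as an example.

```python
# Direct mapping: cache line i = block number j modulo number of lines m.
def cache_line(j, m):
    return j % m

# With m = 8 lines, blocks 2, 10 and 18 all compete for line 2,
# while block 63 maps to line 7.
print([cache_line(j, 8) for j in (2, 10, 18, 63)])  # [2, 2, 2, 7]
```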
Direct Mapping illustration
[Figure: blocks whose addresses end in *0, *4, *8, *C each map to one fixed cache line. Memory block addresses shown: 00, 04, 08, 0C, 10, 14, 18, 1C, 20, 24, 28, 2C, 30, 34, 38, 3C, 40, 44, 48, 4C.]
Direct mapping - illustration
Consider, main memory address of 8 bits and block size of 4 bytes
• 256 bytes of main memory
• 64 blocks in main memory

Consider, 8 lines cache, each has a length of 4 bytes


• 32 bytes of cache
• Need 3 bits to identify a line in the cache
• Need 2 bits to identify a byte in a line
tag: 3 bits | line: 3 bits | w: 2 bits
A look at cache

Line number | Tag (3 bits) | Cache line (4 bytes of data) | Blocks that can sit in this line
0           | xyz          | …                            | 8 predetermined blocks
1           | pqr          | …                            | 8 predetermined blocks
2           | abc          | …                            | 8 predetermined blocks
…           | …            | …                            | …
7           |              | …                            | 8 predetermined blocks
Direct mapping - illustration
Memory address (s + w bits): tag (3 bits) | line (3 bits) | w (2 bits)
The s bits specify which block in memory; the least significant r bits of
these s bits identify which line in the cache.

Address (binary) | Hex | Block, line
0000 0000        | 00h | block 0, line 0
0000 0100        | 04h | block 1, line 1
0000 1000        | 08h | block 2, line 2
0000 1110        | 0Eh | block 3, line 3
0010 1000        | 28h | block 10, line 2
0011 0000        | 30h | block 12, line 4
0111 1000        | 78h | block 30, line 6
0111 1100        | 7Ch | block 31, line 7
1111 1111        | FFh | block 63, line 7
Direct Mapping Address Structure Example
Tag (s − r): 8 bits | Line or slot (r): 14 bits | Word (w): 2 bits

• 24 bit address
• 2 bit word identifier (4 byte block)
• 22 bit block identifier
• 8 bit tag
• 14 bit slot or line
• No two blocks in the same line have the same tag
• Check contents of cache by finding line and comparing tag
Direct mapping
Direct Mapping Cache Organization
Direct Mapping Summary

• Address length = (s + w) bits


• Number of addressable units = 2^(s+w) words or bytes
• Block size = line width = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
• Number of lines in cache = m = 2^r
• Size of tag = (s − r) bits
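Under the field widths summarized above, and using the 8-bit tag / 14-bit line / 2-bit word split from the earlier example, an address decomposes with shifts and masks; the helper name and sample address are illustrative.

```python
# Split a direct-mapped address into (tag, line, word); the default
# widths follow the 8-bit tag / 14-bit line / 2-bit word example.
def split_direct(addr, tag_bits=8, line_bits=14, word_bits=2):
    word = addr & ((1 << word_bits) - 1)
    line = (addr >> word_bits) & ((1 << line_bits) - 1)
    tag = addr >> (word_bits + line_bits)
    return tag, line, word

tag, line, word = split_direct(0x9ABCDE)
print(hex(tag), hex(line), word)  # 0x9a 0x2f37 2
```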
Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for given block – If a program accesses 2 blocks that
map to the same line repeatedly, cache misses are very high
(thrashing)
Set Associative Mapping

• Address length is (s + w) bits


• Cache is divided into a number of sets, v = 2^d
• Each set contains k lines
• A cache with k lines per set is called k-way set associative
• Number of lines in cache = v·k = k·2^d
• Size of tag = (s − d) bits
• It can be viewed as v small fully associative caches, one per set
Set Associative Mapping

• Hybrid of Direct and Associative


k = 1, this is direct mapping
v = 1, this is associative mapping
• Each set contains a number of lines, basically the total number of lines in cache
divided by the number of sets
• A given block maps to any line within its specified set – e.g. Block B can be in any
line of set i.
• 2 lines per set is the most common organization.
• Called 2-way set-associative mapping
• A given block can be in one of 2 lines in only one specific set
• Significant improvement over direct mapping
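The set-choice rule above mirrors direct mapping, but at set granularity; a minimal sketch (names illustrative):

```python
# Set-associative mapping: a block maps to exactly one set
# (set = block number mod number of sets), but may occupy any
# of that set's k lines.
def cache_set(block_number, num_sets):
    return block_number % num_sets

# With v = 4 sets, blocks 1, 5, 9 and 13 all fall in set 1.
print([cache_set(j, 4) for j in (1, 5, 9, 13)])  # [1, 1, 1, 1]
```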
Set-Associative Block Placement
[Figure: cache divided into Set 0 to Set 3, with two lines per set. Memory blocks at addresses ending in *0 map to Set 0, *4 to Set 1, *8 to Set 2, and *C to Set 3. Memory block addresses shown: 00, 04, 08, 0C, 10, 14, 18, 1C, 20, 24, 28, 2C, 30, 34, 38, 3C, 40, 44, 48, 4C.]
K-Way Set Associative Cache Organization
Previous direct mapping example (will be used for the next mapping
illustration)
• 24 bit address lines
• 2 bit word identifier (4 byte block)
• 22 bit block identifier
• 8 bit tag
• 14 bit slot or line
• No two blocks in the same line have the same tag
• Check contents of cache by finding line and comparing tag
2 way set associative mapping
• Divides the 16K lines into 8K sets, with 2 lines in each set
• This requires a 13 bit set number
• With 2 word bits, this leaves 9 bits for the tag
• Blocks beginning with the addresses 000000h, 008000h, 010000h,
018000h, 020000h, 028000h, etc. map to the same set, Set 0.
• Blocks beginning with the addresses 000004h, 008004h, 010004h,
018004h, 020004h, 028004h, etc. map to the same set, Set 1.

Tag: 9 bits | Set: 13 bits | Word: 2 bits
Set Associative Mapping Address Structure

Tag: 9 bits | Set: 13 bits | Word: 2 bits

• Note that the tag has one more bit, and the set field one less bit, than
the tag and line fields of this same example using direct mapping
• Halving the number of sets relative to the number of lines is what makes
the cache 2-way set associative
• Use the set field to determine which cache set to look into
• Compare the tag field to see if we have a hit
Direct Mapping Exercise
What cache line number will the following addresses be stored at, and
what will be the address range of the block each belongs to? Consider a
cache with 4K lines of 16 words to a block in a 256 Meg memory
space. (How many address bits?)
a) 9ABCDEFh
b) 1234567h

Tag (s − r): 12 bits | Line or slot (r): 12 bits | Word (w): 4 bits

Cache line number: a) CDEh b) 456h
Address range: a) 9ABCDE0h to 9ABCDEFh b) 1234560h to 123456Fh
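The answers above can be checked mechanically: 256 Meg = 2^28 addresses, split as 12 tag / 12 line / 4 word bits. The helper below is an illustrative sketch.

```python
# Check the direct-mapping exercise: 28-bit address,
# 12-bit tag, 12-bit line, 4-bit word.
def line_and_range(addr, line_bits=12, word_bits=4):
    line = (addr >> word_bits) & ((1 << line_bits) - 1)
    block_start = addr & ~((1 << word_bits) - 1)  # zero the word bits
    block_end = block_start | ((1 << word_bits) - 1)
    return line, (block_start, block_end)

for addr in (0x9ABCDEF, 0x1234567):
    line, (lo, hi) = line_and_range(addr)
    print(hex(line), hex(lo), hex(hi))
# 0xcde 0x9abcde0 0x9abcdef
# 0x456 0x1234560 0x123456f
```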
Direct Mapping Exercise
Show the cache memory, with tag bits and line number (in binary), occupied
by the following addresses. Consider a cache with 4K lines of 16 words
to a block in a 256 Meg memory space.
a) 8ABCDAFh
b) 81A3457h
(How many lines do we have in total? What is the size of the cache in
bytes? If the next address is 81A3458h, will it be a hit or a miss?)

Tag (s − r): 12 bits | Line or slot (r): 12 bits | Word (w): 4 bits

Ans: a) tag bits: 1000 1010 1011, corresponding to line no. 1100 1101 1010
b) tag bits: 1000 0001 1010, corresponding to line no. 0011 0100 0101
Direct Mapping Exercise
Assume that a portion of the cache, showing tags and line numbers, looks like the
table below. Which of the following addresses are contained in the cache?
a) 438EE8h: miss   b) F18EFFh: hit   c) 6B8EF3h: miss   d) AD8EF3h: hit

Tag (binary) | Line number (binary)   (addresses within block: word offsets 00, 01, 10, 11)
0101 0011    | 1000 1110 1110 10
1110 1101    | 1000 1110 1110 11
1010 1101    | 1000 1110 1111 00
0110 1011    | 1000 1110 1111 01
1011 0101    | 1000 1110 1111 10
1111 0001    | 1000 1110 1111 11
Fully Associative Mapping Exercise
Assume that a portion of the cache with the tag values looks like the table below.
Which of the following addresses are contained in the cache?

a) 438EE8h: miss   b) F18EFFh: hit   c) 6B8EF3h: miss   d) AD8EF3h: hit

Tag (binary)   (addresses within block: word offsets 00, 01, 10, 11)
0101 0011 1000 1110 1110 10
1110 1101 1100 1001 1011 01
1010 1101 1000 1110 1111 00
0110 1011 1000 1110 1111 11
1011 0101 0101 1001 0010 00
1111 0001 1000 1110 1111 11
Set Associative Mapping Exercise
For each of the following addresses, answer the questions below based on a 2-way set
associative cache with a total of 4K lines, each line containing 16 words, with a main
memory of size 256 Meg:
a) What is the cache set number the identified block will be associated with?
b) What will the tag be?
c) What will be the address range of the block in which the given address is located?
1. 9ABCDEFh
2. 1234567h

Tag (s − d): 13 bits | Set (d): 11 bits | Word (w): 4 bits

Ans.: for 9ABCDEFh
set number: 10011011110
tag: 1001101010111
address range: 9ABCDE0h to 9ABCDEFh
For 1234567h
set number: 10001010110
tag: 0001001000110
address range: 1234560h to 123456Fh
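These answers, too, can be verified with shifts and masks: 4K lines in 2 ways give 2K = 2^11 sets, so the split is 13 tag / 11 set / 4 word bits (helper name illustrative).

```python
# Check the set-associative exercise: 28-bit address,
# 13-bit tag, 11-bit set, 4-bit word.
def split_set_assoc(addr, set_bits=11, word_bits=4):
    word = addr & ((1 << word_bits) - 1)
    set_no = (addr >> word_bits) & ((1 << set_bits) - 1)
    tag = addr >> (word_bits + set_bits)
    return tag, set_no, word

for addr in (0x9ABCDEF, 0x1234567):
    tag, set_no, _ = split_set_assoc(addr)
    print(f"tag {tag:013b}  set {set_no:011b}")
# tag 1001101010111  set 10011011110
# tag 0001001000110  set 10001010110
```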
