02/03/2019
Chapter 3
COMPUTER MEMORY
Part 3
Internal memory
Basic Principles of Computers
Virtually all modern computer
designs are based on the von
Neumann architecture principles:
Data and instructions are stored in a
single read/write memory.
The contents of this memory are
addressable by location, without
regard to what is stored there.
Instructions are executed sequentially
(from one instruction to the next)
unless the order is explicitly modified
Many Different Technologies
Internal and External Memories
Main Memory Model
Byte-Oriented Memory Organization
Conceptually, memory is a single, large array of bytes,
each with a unique address (index)
The value of each byte in memory can be read and written
Programs refer to bytes in memory by their addresses
Domain of possible addresses = address space
But not all values fit in a single byte… (e.g. 410)
Many operations actually use multi-byte values
We can store addresses as data to “remember” where other data is
in memory
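The byte-array model above can be sketched in a few lines of Python. This is only an illustration: the 16-byte size, the helper names `write_bytes` and `read_bytes`, and the chosen addresses are assumptions made for the example, not part of the slides.

```python
# Toy model: memory as one array of bytes (16 bytes is an arbitrary size).
memory = bytearray(16)

def write_bytes(addr, data):
    """Write the bytes in `data` to consecutive addresses starting at `addr`."""
    memory[addr:addr + len(data)] = data

def read_bytes(addr, size):
    """Read `size` consecutive bytes starting at address `addr`."""
    return bytes(memory[addr:addr + size])

# 410 is larger than 255, so it needs two bytes.
# The "little" byte order is an arbitrary choice for this sketch.
write_bytes(4, (410).to_bytes(2, "little"))

# Store the address 4 itself as data at address 0 (a one-byte "pointer").
write_bytes(0, bytes([4]))

pointer = read_bytes(0, 1)[0]
value = int.from_bytes(read_bytes(pointer, 2), "little")
print(pointer, value)  # 4 410
```

The value stored at address 0 is itself an address, which is how a program "remembers" where other data lives.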
Word-Oriented Memory Organization
Addresses still specify locations of bytes in memory
Addresses of successive words differ by word size (in bytes):
e.g. 4 (32-bit) or 8 (64-bit)
Address of word 0, 1, … 10?
Address of word = address of first byte in word
The address of any chunk of memory is given by the address
of the first byte
Alignment
[Figure: columns "64-bit Words", "32-bit Words", "Bytes", "Addr. (hex)"; byte addresses 0x00 to 0x0E, 32-bit words at Addr = 0000, 0004, 0008, 0012, 64-bit words at Addr = 0000, 0008]
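A short sketch of the word-address rule, answering the "word 0, 1, … 10?" question from the slide. The function names are illustrative, and the alignment test (address is a multiple of the word size) is the usual definition, assumed here because the slide only names the topic.

```python
def word_address(index, word_size):
    """Byte address of the first byte of word number `index`."""
    return index * word_size

def is_aligned(address, word_size):
    """Assumed definition: an address is aligned if it is a multiple of the word size."""
    return address % word_size == 0

# 32-bit words are 4 bytes apart, 64-bit words are 8 bytes apart.
print([word_address(i, 4) for i in range(4)])  # [0, 4, 8, 12]
print([word_address(i, 8) for i in range(2)])  # [0, 8]

# Word 10 starts at byte 40 (0x28) with 4-byte words, byte 80 (0x50) with 8-byte words.
print(hex(word_address(10, 4)), hex(word_address(10, 8)))  # 0x28 0x50
```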
Byte Ordering
How should bytes within a word be ordered in memory?
Example: store the 4-byte (32-bit) int:
0x a1 b2 c3 d4
By convention, the ordering of bytes is called endianness
The two options are big-endian and little-endian
Based on Gulliver’s Travels: tribes cut eggs on different sides
(big, little)
Big-endian (SPARC, z/Architecture)
Least significant byte has highest address
Little-endian (x86, x86-64)
Least significant byte has lowest address
Bi-endian (ARM, PowerPC)
Endianness can be specified as big or little
Example: 4-byte data 0xa1b2c3d4 at address 0x100
                0x100  0x101  0x102  0x103
Big-Endian       a1     b2     c3     d4
Little-Endian    d4     c3     b2     a1
Byte Ordering Examples
Decimal: 12345
Binary: 0011 0000 0011 1001
Hex: 3 0 3 9
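The two byte orders can be checked with Python's built-in `int.to_bytes`. This is a small sketch; the helper `show` is only a formatting convenience introduced for the example.

```python
def show(data):
    """Format bytes as space-separated hex, lowest address first."""
    return " ".join(f"{b:02x}" for b in data)

value = 0xA1B2C3D4
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")
print(show(big))     # a1 b2 c3 d4
print(show(little))  # d4 c3 b2 a1

# The 12345 example (0x3039) stored in two bytes.
n = 12345
print(show(n.to_bytes(2, "big")))     # 30 39
print(show(n.to_bytes(2, "little")))  # 39 30

# Reading the bytes back with the wrong order gives a different number.
print(hex(int.from_bytes(big, "little")))  # 0xd4c3b2a1
```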
Traditional Bus Structure Connecting CPU and Memory
A bus is a collection of parallel wires that carry
address, data, and control signals.
Buses are typically shared by multiple devices.
[Figure: CPU chip (register file, ALU, bus interface) connected by the system bus to the I/O bridge, which is connected by the memory bus to main memory]
Memory Read Transaction (1)
CPU places address A on the memory bus.
Load operation: movq A, %rax
[Figure: address A is on the bus; main memory holds word x at address A]
Memory Read Transaction (2)
Main memory reads A from the memory bus,
retrieves word x, and places it on the bus.
Load operation: movq A, %rax
[Figure: word x is on the bus; main memory holds word x at address A]
Memory Read Transaction (3)
CPU reads word x from the bus and copies it into
register %rax.
Load operation: movq A, %rax
[Figure: %rax now holds x; main memory holds word x at address A]
Memory Write Transaction (1)
CPU places address A on bus. Main memory
reads it and waits for the corresponding data
word to arrive.
Store operation: movq %rax, A
[Figure: %rax holds y; address A is on the bus to main memory]
Memory Write Transaction (2)
CPU places data word y on the bus.
Store operation: movq %rax, A
[Figure: %rax holds y; data word y is on the bus to main memory]
Memory Write Transaction (3)
Main memory reads data word y from the bus
and stores it at address A.
Store operation: movq %rax, A
[Figure: main memory now holds word y at address A; %rax still holds y]
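The read and write transactions can be traced with a toy simulation. This is a rough sketch, not a hardware model: the classes `Bus` and `MainMemory`, the method names, and the example addresses are all invented for illustration, and the I/O bridge is left out.

```python
class Bus:
    """Shared wires; modeled as the single value currently being driven."""
    def __init__(self):
        self.value = None

class MainMemory:
    def __init__(self, words):
        self.words = dict(words)   # address -> stored word
        self.latched = None        # address remembered for a pending write

    def serve_read(self, bus):
        address = bus.value                 # read A from the bus
        bus.value = self.words[address]     # place word x on the bus

    def latch_address(self, bus):
        self.latched = bus.value            # read A, then wait for the data word

    def serve_write(self, bus):
        self.words[self.latched] = bus.value  # store y at address A

def load(registers, memory, bus, address):
    bus.value = address            # (1) CPU places address A on the bus
    memory.serve_read(bus)         # (2) memory retrieves x and places it on the bus
    registers["rax"] = bus.value   # (3) CPU copies x into %rax

def store(registers, memory, bus, address):
    bus.value = address            # (1) CPU places address A on the bus
    memory.latch_address(bus)
    bus.value = registers["rax"]   # (2) CPU places data word y on the bus
    memory.serve_write(bus)        # (3) memory stores y at address A

registers = {"rax": 0}
memory = MainMemory({0x40: 7})
bus = Bus()
load(registers, memory, bus, 0x40)
print(registers["rax"])            # 7
registers["rax"] = 99
store(registers, memory, bus, 0x48)
print(memory.words[0x48])          # 99
```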
Physical types
Semiconductor
RAM
Magnetic
Disk & Tape
Optical
CD & DVD
Others
Bubble
Hologram
Storing data in main memory
Possible types of memories:
ROM: read-only
Classical ROM: the content is stored during the manufacturing process
PROM: one-time programmable
EPROM: can be erased using ultraviolet light
Etc.
SRAM: Static Random Access Memory
Can be read and modified any time
It preserves data while power supply is present
DRAM: Dynamic Random Access Memory
Can be read and modified any time
It forgets its content! Needs to be refreshed periodically
Nonvolatile Memories
DRAM and SRAM are volatile memories
Lose information if powered off.
Nonvolatile memories retain value even if powered off
Read-only memory (ROM): programmed during production
Programmable ROM (PROM): can be programmed once
Erasable PROM (EPROM): can be bulk erased (UV, X-Ray)
Electrically erasable PROM (EEPROM): electronic erase capability
Flash memory: EEPROMs with partial (block-level) erase capability
Wears out after about 100,000 erasings
Uses for Nonvolatile Memories
Firmware programs stored in a ROM (BIOS, controllers for disks, network
cards, graphics accelerators, security subsystems,…)
Solid state disks (replace rotating disks in thumb drives, smart phones, mp3
players, tablets, laptops,…)
Disk caches
Random-Access Memory (RAM)
Key features
RAM is traditionally packaged as a chip
Basic storage unit is normally a cell (one bit per cell)
Multiple RAM chips form a memory
Static RAM (SRAM)
Each cell stores a bit with a four- or six-transistor circuit
Retains value indefinitely, as long as it is kept powered
Relatively insensitive to electrical noise (EMI), radiation, etc.
Faster and more expensive than DRAM
Dynamic RAM (DRAM)
Each cell stores a bit with a capacitor; a transistor is used for access
Value must be refreshed every 10-100 ms
More sensitive to disturbances (EMI, radiation,…) than SRAM
Slower and cheaper than SRAM
SRAM vs DRAM Summary
EDC = error detection and correction
To cope with noise, etc.
How to create a memory system for the computer
It must be: cheap, large, low latency, high throughput
DRAM
Bits stored as charge in capacitors
Charges leak
Needs refreshing even when powered
Simpler construction
Smaller per bit
Less expensive
Needs refresh circuits
Slower
Main memory
Essentially analogue
Level of charge determines value
Static RAM
Bits stored as on/off switches
No charges to leak
No refreshing needed when powered
More complex construction
Larger per bit
More expensive
Does not need refresh circuits
Faster
Cache
Digital
Uses flip-flops
SRAM v DRAM
Both volatile
Power needed to preserve data
Dynamic cell
Simpler to build, smaller
More dense
Less expensive
Needs refresh
Larger memory units
Static
Faster
Cache
Summary: DRAM vs. SRAM
DRAM (Dynamic RAM)
Used mostly in main mem.
Capacitor + 1 transistor/bit
Need refresh every 4-8 ms
5% of total time
Read is destructive (need for write-back)
Access time < cycle time (because of writing back)
Density (25-50):1 to SRAM
Address lines multiplexed
pins are scarce!
SRAM (Static RAM)
Used mostly in caches (I, D, TLB, BTB)
1 flip-flop (4-6 transistors) per bit
Read is not destructive
Access time = cycle time
Speed (8-16):1 to DRAM
Address lines not multiplexed
high speed of decoding imp.
Chip Organization
Chip capacity (= number of data bits)
tends to quadruple
1K, 4K, 16K, 64K, 256K, 1M, 4M, …
In early designs, each data bit belonged to a different
address (x1 organization)
Starting with 1Mbit chips, wider chips (4, 8, 16, 32 bits
wide) began to appear
Advantage: Higher bandwidth
Disadvantage: More pins, hence more expensive packaging
DRAM bank
Structure:
DRAM cells in a 2D grid
Each cell in a row shares
the same word line
Each cell in a column
shares the same bit line
Reading:
The row decoder selects (activates) a row
The sense amplifiers detect and store the bits of the row
The column multiplexer selects the desired column from the row
Two-phase operations:
To reduce the width of the address bus
Address bus: row address → wait → address bus: column address → data bus: the desired data
16 X 1 as 4 X 4 Array
Two decoders
Row
Column
Address just broken up
Not visible from outside
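A minimal sketch of how a 4-bit address for a 16 x 1 memory can be broken up for a 4 x 4 array. Which bits feed the row decoder and which feed the column decoder is an assumption here (upper two bits for the row, lower two for the column); the slide does not specify it.

```python
ROW_BITS = 2
COL_BITS = 2

def split_address(address):
    """Split a 4-bit address into (row, column) for a 4 x 4 cell array."""
    row = address >> COL_BITS                 # upper 2 bits go to the row decoder
    col = address & ((1 << COL_BITS) - 1)     # lower 2 bits go to the column decoder
    return row, col

print(split_address(0b1001))  # (2, 1)

# All 16 addresses map to 16 distinct cells, so the split is invisible from outside.
cells = {split_address(a) for a in range(16)}
print(len(cells))  # 16
```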
DRAM Logical Diagram
Conventional DRAM Organization
d x w DRAM:
dw total bits organized as d supercells of size w bits
Reading DRAM Supercell (2,1)
Step 1(a): row access strobe (RAS) selects row 2
Step 1(b): row 2 copied from DRAM array to row buffer
Reading DRAM Supercell (2,1)
Step 2(a): column access strobe (CAS) selects column 1
Step 2(b): supercell (2,1) copied from buffer to data
lines, and eventually back to the CPU
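The two-step read can be sketched as follows. The class `DramChip`, its method names, and the 4 x 4 array of supercells are assumptions chosen for illustration; real chips also write the row buffer back and refresh rows, which this sketch omits.

```python
class DramChip:
    """Toy DRAM: a 2D array of supercells plus one internal row buffer."""
    def __init__(self, rows, cols):
        self.array = [[0] * cols for _ in range(rows)]
        self.row_buffer = None

    def ras(self, row):
        """Step 1: the row address selects a row, which is copied to the row buffer."""
        self.row_buffer = list(self.array[row])

    def cas(self, col):
        """Step 2: the column address selects one supercell from the row buffer."""
        return self.row_buffer[col]

    def read(self, row, col):
        self.ras(row)
        return self.cas(col)

chip = DramChip(rows=4, cols=4)
chip.array[2][1] = 0x5A            # contents of supercell (2,1)
print(hex(chip.read(2, 1)))        # 0x5a

# Another column of the same row can be read from the buffer without a new RAS.
chip.array[2][3] = 0x7F
chip.ras(2)
print(hex(chip.cas(1)), hex(chip.cas(3)))  # 0x5a 0x7f
```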
Memory Modules
Combine several chips to create a memory module
Enhanced DRAMs
Basic DRAM cell has not changed since its invention in
1966
Commercialized by Intel in 1970
DRAMs with better interface logic and faster I/O:
Synchronous DRAM (SDRAM)
Uses a conventional clock signal instead of asynchronous control
Allows reuse of the row addresses (e.g., RAS, CAS, CAS, CAS)
Double data-rate synchronous DRAM (DDR SDRAM)
DDR1 : twice as fast
DDR2 : four times as fast
DDR3 : eight times as fast
Enhanced DRAMs
DRAM chips
Organisation in detail
A 16Mbit chip can be organised as 1M of 16 bit
words
A bit-per-chip system has 16 lots of 1 Mbit chips,
with bit 1 of each word in chip 1 and so on
A 16Mbit chip can be organised as a 2048 x 2048
x 4bit array
Reduces number of address pins
Multiplex row address and column address
11 pins to address (2^11 = 2048)
Adding one more pin doubles range of values so x4 capacity
Typical 16 Mb DRAM (4M x 4)
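The pin arithmetic above can be verified with a short calculation. The function name `capacity_bits` is made up for this sketch; it assumes a square array whose row and column addresses share the same multiplexed pins.

```python
def capacity_bits(address_pins, width):
    """Capacity of a square DRAM array with multiplexed row/column address pins."""
    rows = cols = 2 ** address_pins
    return rows * cols * width

# 2048 x 2048 x 4-bit array: 11 multiplexed pins, 16 Mbit in total.
print(2 ** 11)                              # 2048
print(capacity_bits(11, 4) == 16 * 2**20)   # True

# One more pin doubles both rows and columns, so capacity grows by a factor of 4.
print(capacity_bits(12, 4) // capacity_bits(11, 4))  # 4

# Without multiplexing, 2048 * 2048 = 4M locations would need 22 address pins.
print((2048 * 2048 - 1).bit_length())       # 22
```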
Packaging
DRAM MEMORY MODULE
A memory module consists of DRAM chips
Command lines, bank selection lines, address lines:
shared
Data lines: concatenated
Each chip receives all commands
Effect:
Throughput increases 8x
Delay: the same
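A sketch of how concatenated data lines multiply throughput. It assumes a module of 8 chips that each deliver 8 bits per access (the slide only states the 8x factor), models each chip as a nested list, and arbitrarily lets chip 0 supply the lowest byte.

```python
def read_module(chips, row, col):
    """All chips receive the same (row, col); their outputs are concatenated."""
    word = 0
    for k, chip in enumerate(chips):
        byte = chip[row][col]       # each chip returns 8 bits
        word |= byte << (8 * k)     # chip k fills bits 8k .. 8k+7 (assumed order)
    return word

# Eight 1 x 1 "chips" holding the bytes 0x01 .. 0x08.
chips = [[[k + 1]] for k in range(8)]
print(hex(read_module(chips, 0, 0)))  # 0x807060504030201
```

One access returns 64 bits instead of 8, while each chip still takes the same time to respond, so the delay does not change.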
Memory Interleaving
Goal: Try to take advantage of bandwidth of multiple
DRAMs in memory system
Memory address A is converted into (b,w) pair, where
b = bank index
w = word index within bank
Logically a wide memory
Accesses to B banks staged over time to share internal resources
such as memory bus
Interleaving can be on
Low-order bits of address (cyclic)
b = A mod B, w = A div B
High-order bits of address (block)
Combination of the two (block-cyclic)
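The address-to-(b, w) mappings can be sketched like this. The low-order rule is the one given above; the high-order rule and its `words_per_bank` parameter are the usual block mapping, included here as an assumption because the slide gives no formula for it.

```python
def low_order(address, num_banks):
    """Cyclic interleaving: b = A mod B, w = A div B."""
    return address % num_banks, address // num_banks

def high_order(address, words_per_bank):
    """Block interleaving (assumed form): the bank comes from the high-order bits."""
    return address // words_per_bank, address % words_per_bank

B = 4
# Consecutive addresses rotate through the banks with low-order interleaving...
print([low_order(a, B)[0] for a in range(8)])    # [0, 1, 2, 3, 0, 1, 2, 3]
# ...but stay in one bank with high-order interleaving (8 words per bank here).
print([high_order(a, 8)[0] for a in range(8)])   # [0, 0, 0, 0, 0, 0, 0, 0]
```

With low-order interleaving, a sequential access stream keeps all B banks busy, which is what lets the memory system use their combined bandwidth.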
Low-order Bit Interleaving
Mixed Interleaving
Memory address register is 6 bits wide
Most significant 2 bits give bank address
Next 3 bits give word address within bank
LSB gives (parity of) module within bank
6 = 000110₂ = (00, 011, 0) = (0, 3, 0)
41 = 101001₂ = (10, 100, 1) = (2, 4, 1)
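The field split for the 6-bit address can be checked with a few shifts and masks. The function name `decode` is only illustrative.

```python
def decode(address):
    """Split a 6-bit address into (bank, word, module)."""
    bank = (address >> 4) & 0b11     # most significant 2 bits
    word = (address >> 1) & 0b111    # next 3 bits
    module = address & 0b1           # least significant bit
    return bank, word, module

print(decode(6))    # (0, 3, 0)
print(decode(41))   # (2, 4, 1)
```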