Module 9
Semester 2/2024
Contents
1 Module 9 - Memory
1.0.1 Learning Outcomes
1.1 Memory Architecture and Terminology
1.1.1 Memory Map Model
1.1.2 Volatile Versus Non-volatile Memory
1.1.3 Read-Only Versus Read/Write Memory
1.1.4 Random Access Versus Sequential Access
1.2 Non-volatile Memory Technology
1.2.1 ROM Architecture
1.2.2 Mask Read-Only Memory (MROM)
1.2.3 Programmable Read-Only Memory (PROM)
1.2.4 Erasable Programmable Read-Only Memory (EPROM)
1.2.5 Electrically Erasable Programmable Read-Only Memory (EEPROM)
1.2.6 FLASH Memory
1.2.6.1 Flash Memory Cell: Floating-Gate MOSFET Transistor
1.2.6.2 NOR Flash Memory
1.2.6.3 NAND Flash Memory
1.2.6.4 NOR vs NAND Flash Memory
1.3 Volatile Memory Technology
1.3.1 Static Random-Access Memory (SRAM)
1.3.1.1 Differential amplifier [Will not be included in final exam]
1.3.2 Dynamic Random-Access Memory (DRAM)
1.3.2.1 Charge Pump [Will not be included in final exam]
1.4 Modeling Memory with Verilog
1.4.1 Read-Only Memory in Verilog
1.4.2 Read/Write Memory in Verilog
1.5 Assignments
Chapter 1
Module 9 - Memory
This week introduces the basic concepts, terminology, and roles of memory in digital systems. The
material presented here will not delve into the details of the device physics or low-level theory of
operation. Instead, the intent of this week's lecture notes is to give a general overview of memory
technology and its use in computer systems, in addition to how to model memory in Verilog. The goal is
to give an understanding of the basic principles of semiconductor-based memory systems.
has N = 8, this means that 8 bits of data are stored at each address. The number of address locations is
described using the variable M. The overall size of the memory is typically stated as "M × N". For
example, a 16 × 8 memory system has 16 address locations, each capable of storing a byte of data.
This memory would have a capacity of 16 × 8 = 128 bits. Since the address is implemented as a binary
code, the number of lines in the address bus (n) dictates the number of address locations that the
memory system will have (M = 2^n). Fig. 1.1 shows a graphical depiction of
how data resides in memory. This type of graphic is called a memory map model.
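The relationship between address lines, address locations, and capacity can also be written down in Verilog. The following is a minimal sketch, assuming the 16 × 8 example from the text; the module and parameter names are hypothetical and only serve the illustration.

```verilog
// Sketch only: module and parameter names are hypothetical.
module mem_size_sketch;
  localparam ADDR_BITS = 4;                 // n: number of address lines
  localparam N         = 8;                 // data width in bits
  localparam M         = 1 << ADDR_BITS;    // M = 2^n address locations
  localparam CAPACITY  = M * N;             // total capacity in bits

  reg [N-1:0] mem [0:M-1];                  // an M x N storage array

  initial
    $display("%0d x %0d memory, capacity = %0d bits", M, N, CAPACITY);
endmodule
```

With these values the simulation reports a 16 x 8 memory with a capacity of 128 bits, matching the worked example in the text.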
term to describe volatile memory. When describing modern memory systems, the terms RAM and ROM
are used most commonly to describe the characteristics of the memory being used; however, modern
memory systems can be both read/write and non-volatile, and the majority of memory is random access.
In-class Question 1: An 8-bit wide memory has eight address lines. What is its capacity in bits?
A) 64
B) 256
C) 1024
D) 2048
Figure 1.2: Basic architecture of read-only memory (ROM).
Fig. 1.3 shows the operation of a ROM when information is being read.
Figure 1.3: ROM operation during a read.
Figure 1.4: Asynchronous vs. synchronous ROM operation during a read cycle.
Figure 1.5: MROM overview.
Figure 1.6: PROM overview.
Fig. 1.7 shows a PROM programmer and also a PROM memory device.
Figure 1.7: Data IO 29B Universal PROM Programmer.
Figure 1.8: Floating-gate transistor - Programming.
In order to change the floating-gate transistor back into its normal state, the device is exposed to a
strong ultraviolet light source. When the UV light strikes the trapped charge in the secondary oxide, it
transfers enough energy to the charge particles that they can move back into the metal plates in the gate.
This, in effect, erases the device and restores it to a state with a low threshold voltage. EPROMs
contain a transparent window on the top of their package that allows the UV light to strike the devices.
The EPROM must be removed from its system to perform the erase procedure. When the UV light
erase procedure is performed, every device in the memory array is erased. EPROMs are a significant
improvement over PROMs because they can be programmed multiple times; however, the programming
and erase procedures are manually intensive and require an external programmer and an external eraser.
Fig. 1.9 shows the erase procedure for a floating-gate transistor using UV light.
Figure 1.9: Floating-gate transistor - Erasing with UV light.
An EPROM array is created in the exact same manner as in a PROM array with the exception that ad-
ditional programming circuitry is placed on the IC and a transparent window is included on the package
to facilitate erasing. An EPROM is non-volatile and read-only since the programming procedure takes
place outside of its destination system.
Fig. 1.10 shows an EPROM IC.
Figure 1.11: Floating-gate transistor - Erasing with electricity.
Early EEPROMs were very slow and had a limited number of program/erase cycles; thus they were
classified as non-volatile, read-only memory. Modern floating-gate transistors are now capable of
access times comparable to those of volatile memory systems; thus they have evolved into one of
the few non-volatile, read/write memory technologies used in computer systems today.
Figure 1.12: Flash memory packaging variations.
Flash memory technology has evolved in two major directions. In the early years of Flash, there
was only a single type of Flash, NOR Flash. NOR Flash has a random-access memory cell optimized
for high-speed applications such as the cell phone and other code storage applications. A serial-based
NAND Flash technology was created to meet the emerging needs of the low-cost file storage market
that can live with a slow serial read access time. The primary NAND Flash applications are the digital
camera and the portable MP3 music players.
To understand flash memory, we should first look at a single flash memory cell, which is basically
a floating-gate MOSFET.
Figure 1.13: A flash memory cell.
A flash memory cell in its default uncharged state holds binary value 1. To program the cell means to
store binary value 0 into it by charging it (placing electric charge on the floating gate, FG); to erase a
cell means to perform the opposite operation, removing the charge from the FG and setting the stored
value back to 1. Finally, to read a cell value we sense the current that can pass through it when we apply
a certain voltage level to the control gate (CG).
In order to read a value from the cell, an intermediate voltage (VI) between VT1 and VT2 is applied
to the CG. If the channel conducts at VI, the FG must be uncharged (if it were charged, there would not
be conduction because VI is less than VT2). If the channel does not conduct at VI, it indicates that
the FG is charged. The binary value of the cell is sensed by determining whether there is current flowing
through the transistor when VI is asserted on the CG. In a multi-level cell device, which stores more than
one bit per cell, the amount of current flow is sensed (rather than simply its presence or absence) in
order to determine more precisely the level of charge on the FG.
In order to write value 0 to the cell, we have to move electrons across the insulating oxide into the
floating gate. This process is called Fowler–Nordheim tunneling, and it fundamentally changes the
characteristics of the cell by increasing the MOSFET’s threshold voltage (VTH). This, in turn, changes the
drain-source current that flows through the transistor for a given gate voltage, which is ultimately used
to encode a binary value (0 or 1). The Fowler–Nordheim tunneling effect is reversible, so electrons can
be added to or removed from the floating gate, processes traditionally known as writing and erasing.
To recap: to write 0 we move electrons into the FG, to write 1 we remove the electrons from the FG
(erasing), and to read we sense the drain-source current.
Having seen how a flash memory cell operates, we can see that a basic cell stores just one bit of data,
0 or 1. Therefore, we need to put a group of these cells together to store a string of bits. We
will briefly look into the architecture of NAND and NOR flash memories, which are arrays of
floating-gate MOSFETs.
Fig. 1.14 shows the schematics of NMOS-and-resistor NAND and NOR gates, placed here as a
reminder of the transistor-level (NMOS + resistor) operation of these gates.
1. Read (Sensing): Apply 0 V to all word lines except Word Line 3. Apply VI (e.g., +4 V) to Word
Line 3. Connect the Source Line to ground. Apply VCC to the Bit Line (BL); the BL output will then
be a function of the state of the transistor connected to Word Line 3:
• If the FG is uncharged, then the channel conducts and ground will be connected to the BL.
(Ground appearing on the BL indicates a read binary value 1.)
• If the FG is charged (the memory cell was programmed to hold binary value 0 by applying an
elevated voltage to the CG), then the channel does not conduct and VCC will be connected to
the BL. (VCC appearing on the BL indicates a read binary value 0.)
2. Programming (Storing 0): Apply the programming voltage (a voltage higher than the logic 1 level,
e.g., +10 V) to the CG of Word Line 3. Connect the Source Line to ground. Apply the programming
drain voltage to the BL (e.g., +5 V).
3. Erasing (Storing 1): Apply the negative programming voltage (e.g., -10 V) to the CG of Word Line 3.
Connect the Source Line to the positive erase voltage (e.g., +3 V). Leave the BL floating.
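The NOR read step above can be captured in a small behavioral model. The following is a sketch only, assuming eight cells sharing one bit line; the module and signal names are hypothetical, and voltages are abstracted into one-bit flags rather than modeled electrically.

```verilog
// Behavioral sketch of a NOR flash bit-line read (all names are hypothetical).
module nor_bitline_read_sketch;
  reg  [7:0] charged;     // charged[i] = 1: FG of cell i holds charge (stored 0)
  reg  [7:0] word_line;   // one-hot: 1 = word line at V_I, 0 = word line at 0 V
  wire [7:0] conducts;
  wire       bit_line_low;
  wire       read_value;

  // A cell conducts only if its word line is selected and its FG is uncharged.
  assign conducts     = word_line & ~charged;
  // Cells sit in parallel, so any conducting cell grounds the bit line.
  assign bit_line_low = |conducts;
  // Ground on BL is read as 1; VCC on BL is read as 0.
  assign read_value   = bit_line_low;

  initial begin
    charged   = 8'b0000_1000;   // only the cell on Word Line 3 is programmed
    word_line = 8'b0000_1000;   // select Word Line 3
    #1 $display("WL3 read = %b", read_value);
    word_line = 8'b0000_0100;   // select Word Line 2 (erased cell)
    #1 $display("WL2 read = %b", read_value);
  end
endmodule
```

In this sketch the programmed cell on Word Line 3 reads as 0 and the erased cell on Word Line 2 reads as 1, in line with the two bullet cases of the read step.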
1. Read (Sensing): Apply VCC to all word lines except Word Line 3. Apply VI (e.g., +4 V) to Word
Line 3. Apply VCC to GST to turn it on. Apply VCC to BLT to turn it on. Apply VCC to the Bit Line
(BL); the BL output will then be a function of the state of the transistor connected to Word Line 3:
• If the FG is uncharged, then the channel conducts and ground will be connected to the BL.
(Ground appearing on the BL indicates a read binary value 1.)
• If the FG is charged (the memory cell was programmed to hold binary value 0 by applying an
elevated voltage to the CG), then the channel does not conduct and VCC will be connected to
the BL. (VCC appearing on the BL indicates a read binary value 0.)
2. Programming (Storing 0 or 1): Apply the programming voltage (a voltage higher than the logic 1
level or VCC, e.g., +19 V) to the CG of Word Line 3. Apply the programming deselect voltage (e.g.,
+12 V) to the CGs of the other word lines. Apply VCC (e.g., +5 V) to BLT to turn it on. Apply 0 V to
GST to turn it off. Apply VCC (e.g., +5 V) to the Source Line. Apply 0 V to the P-well. The BL is at
0 V if the data to be programmed is logic 0, or at VCC if the data is logic 1.
3. Erasing (Storing 1): Apply 0 V to all the word lines. Apply 0 V to BLS and set GST and BL to
floating. Apply the elevated erasing voltage to the P-well (e.g., +20 V). Note that, as we can see
here, a NAND string can be erased in bulk.
The structure in Fig. 1.16 can store 8 bits in eight separate floating gate transistors. This is a byte
array - it contains an eight bit byte spread across eight transistors. Basically, to read data, first the desired
group is selected (in the same way that a single transistor is selected from a NOR array). Next, most
of the word lines are pulled up above VT2 , while one of them is pulled up to VI . The series group will
conduct (and pull the bit line low) if the selected bit has not been programmed.
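The series-string read described above can be sketched behaviorally in Verilog. This is an illustration under simplifying assumptions (eight cells in one string, select transistors already on, voltages abstracted to one-bit flags); all names are hypothetical.

```verilog
// Behavioral sketch of reading one NAND string (all names are hypothetical).
module nand_string_read_sketch;
  reg  [7:0] charged;    // charged[i] = 1: FG of cell i holds charge (stored 0)
  reg  [7:0] selected;   // one-hot: 1 = word line at V_I, 0 = word line above V_T2
  wire [7:0] conducts;
  wire       string_conducts;
  wire       read_value;

  // An unselected cell is driven above V_T2 and conducts either way;
  // the selected cell conducts only if its FG is uncharged.
  assign conducts        = ~selected | ~charged;
  // Cells are in series, so current flows only if every cell conducts.
  assign string_conducts = &conducts;
  // A conducting string pulls the bit line low, which is read as 1.
  assign read_value      = string_conducts;

  initial begin
    charged  = 8'b0000_1000;   // only cell 3 is programmed
    selected = 8'b0000_1000;   // read cell 3
    #1 $display("cell 3 reads %b", read_value);
    selected = 8'b0000_0001;   // read cell 0 (erased)
    #1 $display("cell 0 reads %b", read_value);
  end
endmodule
```

The only cell that can block the string is the selected one, which is why the string current reveals exactly one stored bit per read.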
A NAND flash chip (or "package") is divided into the following entities as can be seen in Fig. 1.17:
the die, the plane, the block and the page. Blocks are the smallest unit that can be erased, and pages are
the smallest unit that can be programmed.
Figure 1.17: NAND Flash Die Layout.
A page is a collection of bit lines attached to a single word line. One word line can contain one or
more pages. One page typically consists of 256 or 512 bytes of memory (2048 or 4096 bits). Fig. 1.18
shows a group of bit lines that form a page. If a page has 32,768 NAND cells (bits), this equates to
4096 bytes, or a 4K page size.
Figure 1.18: NAND Page marked with yellow rectangle - The whole image is a NAND block.
1.2.6.4 NOR vs NAND Flash Memory
Table 1.1 shows a comparison of NOR versus NAND flash memories.
In both NOR and NAND Flash, the memory is organized into erase blocks. This architecture helps
maintain lower cost while maintaining performance. For example, a smaller block size enables faster
erase cycles. The downside of smaller blocks, however, is an increase in die area and memory cost.
NAND architecture enables placement of more cells in a smaller area compared to the NOR architec-
ture. For similar process technology, the physical design of NAND flash cells allows for approximately
40% less area coverage than NOR flash cells. The lower cost per bit also contributes to the higher density
of NAND memory devices.
NOR flash memory enables faster read operations because the cells are wired in parallel. Instead of
reading out an entire word line into a register and parsing it, an individual bit can be accessed as needed.
However, this makes NOR flash slower to write and erase than NAND because of its greater cell size; in
NOR flash, each bit must be written to a 0 before it can be erased.
The memory cells in a NAND are joined in series and arranged into pages, which are further divided
into blocks. NAND reads are slower, but NAND writes and erases are faster than NOR because large
groups of bits can be erased simultaneously.
NOR draws a higher current when turned on, but it uses less power when idle. While NAND has
a lower initial power consumption, its standby power consumption is higher. The instantaneous active
power of NOR and NAND flash memories is similar, so the total energy consumption will depend
on the amount of time the memory is actively being read or written.
The limitation of NAND FLASH was that reading and writing could only be accomplished on a
block-by-block basis. This characteristic precluded the use of NAND FLASH for run-time variables and
data storage but was well suited for streaming applications such as audio/video and program loading. As
NAND FLASH technology advanced, the block size began to shrink, and software adapted to
accommodate the block-by-block data access. This expanded the applications that NAND FLASH could be
deployed in. Today, NAND FLASH memory is used in nearly all portable devices (e.g., smartphones,
tablets, etc.), and its use in solid-state drives (SSDs) is on pace to replace hard disk drives and
optical disks as the primary non-volatile storage medium in modern computers.
NOR FLASH is considered random-access memory, while NAND FLASH is typically not; however,
as the block size of NAND FLASH is continually reduced, its use for variable storage is becoming more
attractive. All FLASH memory is non-volatile and read/write.
The chart in Fig. 1.19 shows various NOR and NAND density ranges to help identify the best solution
for your application requirements.
Figure 1.19: Various densities of NOR versus NAND Micron flash memories.
In-class Question 2: Which of the following is suitable for implementation in a read-only memory?
cell. In this configuration, two access transistors (M1 and M2) are used to read and write from the
storage cell. The cell has two complementary ports called bit line (BL) and bit line’ (BLn). Due to
the inverting functionality of the feedback loop, these two ports will always be the complement of each
other. This behavior is advantageous because the two lines can be compared to each other to determine
the data value. This allows the voltage levels used in the cell to be lowered while still being able to
detect the stored data value. Word lines are used to control the access transistors. This storage element
takes six CMOS transistors to implement and is often called a 6T configuration. The advantage of this
memory cell is that it has very fast performance compared to other sub-systems because of its underlying
technology being simple CMOS transistors. SRAM cells are commonly implemented on the same IC
substrate as the rest of the system, thus allowing a fully integrated system to be realized. SRAM cells
are used for cache memory in computer systems.
To build an SRAM memory system, cells are arranged in an array pattern. Fig. 1.21 shows a 4x4
SRAM array topology. In this configuration, word lines are shared horizontally across the array in order
to provide addressing capability. An address decoder is used to convert the binary-encoded address
into the appropriate word line assertions. N storage cells are attached to the word line to provide the
desired data word width. Bit lines are shared vertically across the array in order to provide data access
(either read or write). A data line controller handles whether data is read from or written to the cells
based on an external write enable (WE) signal. When WE is asserted (WE = 1), data will be written to the
cells. When WE is de-asserted (WE = 0), data will be read from the cells. The data line controller also
handles determining the correct logic value read from the cells by comparing BL to BLn. As more cells
are added to the bit lines, the signal magnitude being driven by the storage cells diminishes due to the
additional loading of the other cells. This is where having complementary data signals (BL and BLn) is
advantageous because this effectively doubles the magnitude of the storage cell outputs. The comparison
of BL to BLn is handled using a differential amplifier that produces a full logic level output even when
the incoming signals are very small.
Figure 1.21: 4x4 SRAM array topology.
Figure 1.22: Operational Amplifier Symbol.
We can use either the "inverting" or the "non-inverting" op-amp input terminals to amplify a single
input signal with the other input being connected to ground.
Differential amplifiers are useful in electrically noisy environments where a low amplitude electrical
signal can be easily corrupted by the effect of unwanted external noise. In this scenario, a single-ended
amplifier would be unsuitable since it would also amplify the unwanted noise signal as well as the
desired input signal. A differential amplifier works on the principle that unwanted electrical noise cou-
ples equally onto both input terminals of the amplifier and will therefore be rejected allowing only the
wanted signal to be amplified.
Fig. 1.23 shows the required resistors and how they should be connected to op-amp inputs and output.
The op-amp output equation is as follows:
$$V_{out} = \frac{R_3}{R_1}\,(V_2 - V_1) \qquad (1.2)$$
If all resistors in Fig. 1.23 are equal then we have a Unity Gain Differential Amplifier where
R1 = R2 = R3 = R4 and the op-amp equation will be the following:
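As a short worked step, assuming only Eq. (1.2) and equal resistor values, the gain factor R3/R1 becomes 1 and the output reduces to the plain difference of the two inputs:

```latex
V_{out} = \frac{R_3}{R_1}\,(V_2 - V_1) = 1 \cdot (V_2 - V_1) = V_2 - V_1
```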
Figure 1.24: Balanced microphone (has 3 wires in contrast to unbalanced mic that has 2).
If noise is added to the microphone output, it will be added to both the inverting and the
non-inverting inputs; therefore, the noise will be canceled out:
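A minimal sketch of the cancellation, assuming a unity-gain differential amplifier and writing the common noise voltage as V_n (a symbol introduced here only for illustration):

```latex
V_{out} = (V_2 + V_n) - (V_1 + V_n) = V_2 - V_1
```

The noise term appears with opposite signs and drops out, so only the difference of the wanted signals reaches the output.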
We can also have a single-input, two-output differential op-amp. Fully differential op-amps add a
second output. The output is fully differential: the two outputs are called the positive output and the
negative output, similar to the terminology used for the two inputs. Like the inputs, they are
differential. The output voltages are equal in magnitude but opposite in polarity.
Now that we understand the operation of op-amps, we return to the SRAM topic.
SRAM is volatile memory because when the power is removed, the cross-coupled inverters are not
able to drive the feedback loop and the data is lost. SRAM is also read/write memory because the storage
cells can be easily read from or written to during normal operation. Let’s look at the operation of the
SRAM array when writing the 4-bit word "0111" to address "01".
Fig. 1.25 shows a graphical depiction of this operation. In this write cycle, the row address decoder
observes the address input "01" and asserts WL1. Asserting this word line enables all of the access
transistors (i.e., M1 and M2 in Fig. 1.20) of the storage cells in this row. The line drivers are designed to
have a stronger drive strength than the inverters in the storage cells so that they can override their values
during a write. The information "0111" is present on the Data_In bus, and the write enable control
line is asserted (WE = 1) to indicate a write. The data line controller passes the information to be stored
to the line drivers, which in turn convert each input into complementary signals (via differential
two-output op-amps) and drive the bit lines. This overrides the information in each storage cell connected
to WL1. The address decoder then de-asserts WL1 and the information is stored.
Figure 1.25: SRAM operation during a write cycle – Storing "0111" to Address "01".
Now let’s look at the operation of the SRAM array when reading a 4-bit word from address "10".
Let’s assume that this row was storing the value "0101". Fig. 1.26 shows a graphical depiction of this
operation. In this read cycle, the row address decoder asserts WL2, which allows the SRAM cells to drive
their respective bit lines. Note that each cell drives a complementary version of its stored value. The
input control line is de-asserted (WE = 0), which indicates that the sense amplifiers will read the BL and
BLn lines in order to determine the full logic value stored in each cell. This logic value is then routed to
the Data_Out port of the array. In an SRAM array, reading from the cell does not impact the contents
of the cell. Once the read is complete, WL2 is de-asserted and the read cycle is complete.
Figure 1.26: SRAM operation during a read cycle – Reading "0101" from Address "10".
schematic for the basic DRAM storage cell. The capacitor is accessed through a transistor (M1). Since
this storage element takes one transistor and one capacitor, it is often referred to as a 1T1C configuration.
Just as in SRAM memory, word lines are used to access the storage elements. The term digit line is used
to describe the vertical connection to the storage cells. DRAM has an advantage over SRAM in that
the storage element requires less area to implement. This allows DRAM memory to have much higher
density compared to SRAM.
There are a variety of considerations that must be accounted for when using DRAM. First, the charge
in the capacitor will slowly dissipate over time due to the capacitors being non-ideal. If left unchecked,
eventually the data held in the capacitor will be lost. In order to overcome this issue, DRAM has a
dedicated circuit to refresh the contents of the storage cell. A refresh cycle involves periodically reading
the value stored on the capacitor and then writing the same value back again at full signal strength. This
behavior also means that DRAM is volatile because when the power is removed and the refresh
cycle cannot be performed, the stored data is lost. DRAM is also considered read/write memory because
the storage cells can be easily read from or written to during normal operation.
Another consideration when using DRAM is that the voltage of the word line must be larger than
VCC in order to turn on the access transistor. In order to turn on an NMOS transistor, the gate voltage
must exceed the source voltage by at least a threshold voltage (VT). In traditional CMOS circuit
design, the source terminal is typically connected to ground (0 V). This means the transistor can be easily
turned on by driving the gate with a logic 1 (i.e., VCC) since this creates a VGS voltage much larger than
VT. This is not always the case in DRAM. In DRAM, the source terminal is not connected to ground but
rather to the storage capacitor. In the worst-case situation, the capacitor could be storing a logic 1 (i.e.,
VCC). This means that in order for the word line to be able to turn on the access transistor, it must be
equal to or larger than (VCC + VT). This is an issue because the highest voltage that the DRAM device
has access to is VCC. In DRAM, a charge pump is used to create a voltage larger than VCC + VT that
is driven on the word lines. Once this voltage is used, the charge is lost, so the line must be pumped
up again before its next use. The process of "pumping up" takes time that must be considered when
calculating the maximum speed of DRAM. Fig. 1.28 shows a graphical depiction of this consideration.
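The word line requirement can be stated as a single inequality. The numeric values below are assumed purely for illustration and are not taken from any particular device; V_WL is a symbol introduced here for the word line voltage.

```latex
V_{WL} \ge V_{CC} + V_T
\qquad\text{e.g.}\qquad
V_{CC} = 3.3\,\text{V},\; V_T = 0.7\,\text{V}
\;\Rightarrow\; V_{WL} \ge 4.0\,\text{V}
```

Since 4.0 V exceeds the 3.3 V supply in this example, the word line voltage has to be generated on chip, which is the role of the charge pump.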
Figure 1.28: DRAM charge pumping of word lines.
The second state, the charge transfer state, has the two switches in their opposite configuration (left
switch up, right switch down). What happens in that case? Fig. 1.30 depicts the circuit. Assume that
C2 is initially at 0 V. C1, which had VIN volts across it prior to the switch, transfers some of its charge
to capacitor C2. As a result, the voltage across C2 rises while the voltage across C1 falls. The output
voltage (the C2 voltage) rises to a value between VIN and 2VIN.
The circuit is then commanded to revert to its charging state (Fig. 1.29) and the voltage across C1
is replenished to VIN (the voltage across C2 remains at its charged value, assuming no load). When it
is then commanded to the charge transfer state, C1 again transfers some of its charge to C2. After this
charge transfer, the voltage across C2 is higher than during the previous charge transfer state. As this
process continues, the output voltage gradually approaches its final value of 2VIN. Once the circuit gets
past the initial transient voltage buildup, this circuit maintains the 2VIN output value.
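The gradual approach to 2VIN can be made concrete with charge conservation. The sketch below assumes the common voltage-doubler arrangement in which, during the charge transfer state, the bottom plate of C1 is tied to VIN and its top plate is connected to C2; this topology and the ratio C1 = 3 C2 are assumptions made for illustration, and V_k denotes the no-load output voltage after the k-th transfer.

```latex
% Charge on the shared node is conserved across each transfer:
C_1 V_{IN} + C_2 V_k = C_1 (V_{k+1} - V_{IN}) + C_2 V_{k+1}
\;\Rightarrow\;
V_{k+1} = \frac{2 C_1 V_{IN} + C_2 V_k}{C_1 + C_2}

% Example with C_1 = 3 C_2 and V_0 = 0:
V_1 = 1.5\,V_{IN}, \quad V_2 = 1.875\,V_{IN}, \quad V_3 \approx 1.969\,V_{IN}, \quad \dots \;\to\; 2 V_{IN}
```

Setting V_{k+1} = V_k in the recurrence gives the steady-state value 2VIN regardless of the capacitor ratio; the ratio only affects how quickly the output gets there.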
Of course, we assumed no load on VOUT - not a very practical assumption for a circuit intended to
provide a supply voltage to some load. Nevertheless, as long as the load is relatively light, the circuit
will maintain very nearly double the input voltage.
Charge-pumped circuits similar to the voltage doubler can be built to provide higher voltages, usually
integer multiples of the input voltage (that is, 2x, 3x, etc.).
Another consideration when using DRAM is how the charge in the capacitor develops into an actual
voltage on the digit line when the access transistor is closed. Consider the simple 4x4 array of DRAM
cells shown in Fig. 1.31. In this topology, the DRAM cells are accessed using the same approach as in
the SRAM array from Fig. 1.21.
One of the limitations of this simple configuration is that the charge stored in the capacitors cannot
develop a full voltage level across the digit line when the access transistor is closed. This is because
the digit line itself has capacitance that impacts how much voltage will be developed. In practice, the
capacitance of the digit line (CDL) is much larger than the capacitance of the storage cell (CS) due to
having significantly more area and being connected to numerous other storage cells. This becomes an
issue because when the storage capacitor is connected to the digit line, the resulting voltage on the digit
line (VDL) is much less than the original voltage on the storage cell (VS). This behavior is known as
charge sharing because when the access transistor is closed, the charge on both capacitors is distributed
across both devices and results in a final voltage that depends on the initial charge in the system and the
values of the two capacitors. Fig. 1.32 shows an example of how to calculate the final digit line voltage
when the storage cell is connected.
Figure 1.32: Calculating the final digit line voltage in a DRAM based on charge sharing.
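As a worked sketch of charge sharing, assume the digit line starts fully discharged (0 V) and that CDL = 10 CS; both are assumptions chosen for illustration and may differ from the values used in the figure. Total charge is conserved when the access transistor connects the two capacitors:

```latex
V_{DL} = \frac{C_S V_S + C_{DL}\cdot 0}{C_S + C_{DL}}
       = \frac{C_S}{C_S + C_{DL}}\,V_S
\qquad\text{with } C_{DL} = 10\,C_S:\quad
V_{DL} = \frac{V_S}{11} \approx 0.09\,V_S
```

A stored VCC therefore appears on the digit line as less than a tenth of VCC under these assumptions.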
The issue with the charge sharing behavior of a DRAM cell is that the final voltage on the digit line
is not large enough to be detected by a standard logic gate or latch. In order to overcome this issue,
modern DRAM arrays use complementary storage cells and sense amplifiers. The complementary cells
store the original data and its complement. Two digit lines (DL and DLn) are used to read the contents
of the storage cells. DL and DLn are initially pre-charged to exactly VCC/2. When the access transistors
are closed, the storage cells will share their charge with the digit lines and move them slightly away
from VCC/2 in different directions. This allows twice the voltage difference to be developed during a
read. A sense amplifier is then used to boost this small voltage difference into a full logic level that can
be read by a standard logic gate or latch. Fig. 1.33 shows the modern DRAM array topology based on
complementary storage cells.
Figure 1.33: Modern DRAM array topology based on complementary storage cells.
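The doubling of the read signal follows from applying charge conservation to each pre-charged digit line. The sketch below assumes the cell on DL stores VCC, the complementary cell on DLn stores 0 V, and both digit lines have the same capacitance CDL:

```latex
V_{DL}  = \frac{C_S V_{CC} + C_{DL}\,\tfrac{V_{CC}}{2}}{C_S + C_{DL}}
        = \frac{V_{CC}}{2} + \frac{C_S}{C_S + C_{DL}}\cdot\frac{V_{CC}}{2}

V_{DLn} = \frac{C_S \cdot 0 + C_{DL}\,\tfrac{V_{CC}}{2}}{C_S + C_{DL}}
        = \frac{V_{CC}}{2} - \frac{C_S}{C_S + C_{DL}}\cdot\frac{V_{CC}}{2}

V_{DL} - V_{DLn} = \frac{C_S}{C_S + C_{DL}}\,V_{CC}
```

Each line moves away from VCC/2 by the same amount in opposite directions, so the difference seen by the sense amplifier is twice the deviation of a single line.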
The sense amplifier is designed to boost small voltage deviations from VCC/2 on DL and DLn to full
logic levels. The sense amplifier sits in-between DL and DLn and has two complementary networks, the
N-sense amplifier and the P-sense amplifier. The N-sense amplifier is used to pull a signal that is below
VCC/2 (either DL or DLn) down to GND. A control signal (N-Latch or NLATn) is used to turn on this
network. The P-sense amplifier is used to pull a signal that is above VCC/2 (either DL or DLn) up to
VCC. A control signal (Active Pull-Up or ACT) is used to turn on this network. The two networks are
activated in a sequence with the N-sense network activating first. Fig. 1.34 shows an overview of the
operation of a DRAM sense amplifier.
Let’s now put everything together and look at the operation of a DRAM system during a read op-
eration. Fig. 1.35 shows a simplified timing diagram of a DRAM read cycle. This diagram shows the
critical signals and their values when reading a logic 1. Notice that there is a sequence of steps that must
be accomplished before the information in the storage cells can be retrieved.
Figure 1.35: DRAM operation during a read cycle – Reading a 1 from a storage cell.
A DRAM write operation is accomplished by turning on the access transistors to the complementary
storage cells using WL, disabling the pre-charge drivers, and then writing full logic level signals to the
storage cells using the Data_In line driver.
In-class Question 3: Which of the following is suitable for implementation in a read/write memory?
1.4 Modeling Memory with Verilog
1.4.1 Read-Only Memory in Verilog
A read-only memory in Verilog can be defined in two ways. The first is to simply use a case statement
to define the contents of each location in memory based on the incoming address. A second approach
is to declare an array and then initialize its contents. When using an array, a separate procedural block
handles assigning the contents of the array to the output based on the incoming address. The array
can be initialized using either an initial block or through the file I/O system tasks $readmemb() or
$readmemh(). Fig. 1.36 shows the symbol of a 4x4 asynchronous read-only memory. Listing 1.1 and
Listing 1.2 show two approaches for modeling a 4x4 ROM.
In those listings the memory is asynchronous, meaning that as soon as the address changes, the data
from the ROM appears immediately. To model this asynchronous behavior, the procedural blocks are
sensitive to the incoming address. In the simulation (Fig. 1.37), each possible address is provided (i.e.,
"00", "01", "10", and "11") to verify that the ROM was initialized correctly. Fig. 1.37 shows that
data_out is updated immediately when the address is changed.
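// Header reconstructed by analogy with Listing 1.2; the module name is an assumption.
module rom_4x4_async1 (output reg  [3:0] data_out,
                       input  wire [1:0] address);

  // Asynchronous read: re-evaluate whenever the address changes.
  always @ (address)
    case (address)
      0       : data_out = 4'b1110;
      1       : data_out = 4'b0010;
      2       : data_out = 4'b1111;
      3       : data_out = 4'b0100;
      default : data_out = 4'bXXXX;
    endcase

endmodule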
Listing 1.2: Behavioral models of a 4x4 asynchronous read-only memory in Verilog (v2).
module rom_4x4_async2 (output reg  [3:0] data_out,
                       input  wire [1:0] address);

  reg [3:0] ROM[0:3];   // An MxN array is declared (4 words of 4 bits)

  // Contents are set once at the start of simulation.
  initial begin
    ROM[0] = 4'b1110;
    ROM[1] = 4'b0010;
    ROM[2] = 4'b1111;
    ROM[3] = 4'b0100;
  end

  // Asynchronous read: re-evaluate whenever the address changes.
  always @ (address)
    data_out = ROM[address];

endmodule
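The paragraph introducing this section also mentions $readmemb() and $readmemh() as a way to initialize the array. A minimal sketch of that option follows. The module name rom_4x4_file, the test bench tb_rom_4x4_file, and the file name rom_4x4.hex are assumptions made for illustration and do not come from the listings. So that the example is self-contained, the test bench writes the file itself and then loads it through a hierarchical reference; in a design, the $readmemh() call would more commonly sit in an initial block inside the ROM module, with the file already present.

```verilog
// Sketch only: module, test bench, and file names are hypothetical.
module rom_4x4_file (output reg  [3:0] data_out,
                     input  wire [1:0] address);

  reg [3:0] ROM [0:3];          // 4 words x 4 bits, loaded from a file

  always @ (address)            // asynchronous read
    data_out = ROM[address];

endmodule

module tb_rom_4x4_file;
  reg  [1:0] address;
  wire [3:0] data_out;
  integer    fd, i;

  rom_4x4_file dut (.data_out(data_out), .address(address));

  initial begin
    // Create the initialization file: one hex word per line.
    fd = $fopen("rom_4x4.hex", "w");
    $fdisplay(fd, "E");         // address 0 = 4'b1110
    $fdisplay(fd, "2");         // address 1 = 4'b0010
    $fdisplay(fd, "F");         // address 2 = 4'b1111
    $fdisplay(fd, "4");         // address 3 = 4'b0100
    $fclose(fd);

    // Load the array from the file just written.
    $readmemh("rom_4x4.hex", dut.ROM);

    #1;                         // let the always block reach its event wait
    for (i = 0; i < 4; i = i + 1) begin
      address = i;
      #10 $display("address=%0d data_out=%b", address, data_out);
    end
    $finish;
  end
endmodule
```

With Icarus Verilog, for example, the block can be saved to a file of your choice, compiled with iverilog, and run with vvp. $readmemb() works the same way but expects binary digits in the file.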
A synchronous ROM, as shown in Fig. 1.38, can be created in a similar manner to the asynchronous
approach. The only difference is that in a synchronous ROM, a clock edge is used to trigger the procedural
block that updates data_out, so the sensitivity list contains the clock rather than the address.
Listing 1.3 and Listing 1.4 show two Verilog models for a synchronous ROM. Notice in
the simulation waveform of Fig. 1.39 that prior to the first clock edge, the simulator does not know what to
assign to data_out, so it lists the value as unknown (X).
      2       : data_out = 4'b1111;
      3       : data_out = 4'b0100;
      default : data_out = 4'bXXXX;
    endcase

endmodule
Listing 1.4: Behavioral models of a 4x4 synchronous read-only memory in Verilog (v2).
module rom_4x4_sync (output reg  [3:0] data_out,
                     input  wire [1:0] address,
                     input  wire       Clock);

  reg [3:0] ROM[0:3];   // array declaration, as in Listing 1.2

  initial begin
    ROM[0] = 4'b1110;
    ROM[1] = 4'b0010;
    ROM[2] = 4'b1111;
    ROM[3] = 4'b0100;
  end

  // The procedural block is missing from the source; a rising-edge read is assumed.
  always @ (posedge Clock)
    data_out <= ROM[address];

endmodule
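The behavior described for Fig. 1.39, where data_out stays unknown until the first clock edge, can be checked with a small test bench. The following is a sketch under stated assumptions: the names rom_4x4_sync_demo and tb_rom_4x4_sync_demo, the rising-edge trigger, and the 10-time-unit clock period are all chosen for illustration.

```verilog
// Sketch only: names, clock period, and rising-edge trigger are assumptions.
module rom_4x4_sync_demo (output reg  [3:0] data_out,
                          input  wire [1:0] address,
                          input  wire       Clock);
  reg [3:0] ROM [0:3];

  initial begin
    ROM[0] = 4'b1110;
    ROM[1] = 4'b0010;
    ROM[2] = 4'b1111;
    ROM[3] = 4'b0100;
  end

  always @ (posedge Clock)      // data_out only changes on a clock edge
    data_out <= ROM[address];
endmodule

module tb_rom_4x4_sync_demo;
  reg        Clock   = 1'b0;
  reg  [1:0] address = 2'b00;
  wire [3:0] data_out;
  integer    i;

  rom_4x4_sync_demo dut (.data_out(data_out), .address(address),
                         .Clock(Clock));

  always #5 Clock = ~Clock;     // first rising edge occurs at t = 5

  initial begin
    // Before any clock edge, nothing has been assigned to data_out.
    #1 if (data_out !== 4'bxxxx)
         $display("FAIL: expected X before the first clock edge");

    // Present each address and sample shortly after the next rising edge.
    for (i = 0; i < 4; i = i + 1) begin
      address = i;
      @(posedge Clock);
      #1 $display("t=%0t address=%0d data_out=%b", $time, address, data_out);
    end
    $finish;
  end
endmodule
```

The check uses the case-inequality operator (!==) because an ordinary != comparison against X would itself evaluate to unknown.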
Figure 1.40: A 4x4 asynchronous read/write memory.
Listing 1.5 shows an asynchronous R/W 4x4 memory system and its functional simulation results in
Fig. 1.41. In the simulation, each address is initially read from to verify that it does not contain data.
The data_out port produces unknown (X) for the initial set of read operations. Each address in the
array is then written to. Finally, the array is read from verifying that the data that was written can be
successfully retrieved.
Listing 1.5: Behavioral models of a 4x4 asynchronous read-write memory in Verilog.
module rw_4x4_async (output reg  [3:0] data_out,
                     input  wire [1:0] address,
                     input  wire       WE,
                     input  wire [3:0] data_in);

  reg [3:0] RW[0:3];   // An MxN array is declared

  // The procedural block is missing from the source; reconstructed assuming an active-high WE.
  always @ (address or WE or data_in)
    if (WE)
      RW[address] = data_in;    // write
    else
      data_out = RW[address];   // read

endmodule
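The read, write, read-back sequence described for Fig. 1.41 can be sketched as a self-checking test bench. The names rw_4x4_async_demo and tb_rw_4x4_async_demo, the active-high WE, and the written values (A through D) are assumptions for illustration, not taken from the listing.

```verilog
// Sketch only: module names, WE polarity, and test data are assumptions.
module rw_4x4_async_demo (output reg  [3:0] data_out,
                          input  wire [1:0] address,
                          input  wire       WE,
                          input  wire [3:0] data_in);
  reg [3:0] RW [0:3];

  always @ (address or WE or data_in)
    if (WE) RW[address] = data_in;      // write when WE = 1
    else    data_out    = RW[address];  // read when WE = 0
endmodule

module tb_rw_4x4_async_demo;
  reg  [1:0] address;
  reg        WE;
  reg  [3:0] data_in, expected;
  wire [3:0] data_out;
  integer    i, errors;

  rw_4x4_async_demo dut (.data_out(data_out), .address(address),
                         .WE(WE), .data_in(data_in));

  initial begin
    errors = 0;
    #1 WE = 1'b0; data_in = 4'h0;

    // 1) Read every address: nothing has been written, so expect X.
    for (i = 0; i < 4; i = i + 1) begin
      address = i;
      #10 if (data_out !== 4'bxxxx) errors = errors + 1;
    end

    // 2) Write a unique value (A, B, C, D) to each address.
    WE = 1'b1;
    for (i = 0; i < 4; i = i + 1) begin
      address = i;
      data_in = 4'hA + i;
      #10;
    end

    // 3) Read every address back and compare with what was written.
    WE = 1'b0;
    for (i = 0; i < 4; i = i + 1) begin
      address  = i;
      expected = 4'hA + i;
      #10 if (data_out !== expected) errors = errors + 1;
    end

    $display("mismatches: %0d", errors);
    $finish;
  end
endmodule
```

Because the model is level-sensitive, any change on address or data_in while WE is high performs a write, so the test bench only raises WE during the write phase.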
A synchronous read/write memory (Fig. 1.42) is made in a similar manner with the exception that a
clock is used to trigger the procedural block managing the signal assignments. In this case, the WE signal
acts as a synchronous control signal indicating whether assignments are read from or written to the RW
array.
Listing 1.6 shows the Verilog model for a synchronous read/write memory and the simulation
waveform in Fig. 1.43 shows both read and write cycles.
Listing 1.6: Behavioral models of a 4x4 synchronous read-write memory in Verilog.
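module rw_4x4_sync (output reg  [3:0] data_out,
                    input  wire [1:0] address,
                    input  wire       WE,
                    input  wire [3:0] data_in,
                    input  wire       Clock);

  reg [3:0] RW[0:3];

  // The procedural block is missing from the source; reconstructed assuming
  // a rising-edge clock and an active-high WE.
  always @ (posedge Clock)
    if (WE)
      RW[address] <= data_in;    // synchronous write
    else
      data_out <= RW[address];   // synchronous read

endmodule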
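The read and write cycles of the synchronous memory can be exercised in the same three phases as the asynchronous case. This is a sketch, assuming a rising-edge clock, an active-high WE, and hypothetical names (rw_4x4_sync_demo, tb_rw_4x4_sync_demo); inputs are changed on the falling edge so that they are stable when the rising edge samples them.

```verilog
// Sketch only: names, clock edge, WE polarity, and test data are assumptions.
module rw_4x4_sync_demo (output reg  [3:0] data_out,
                         input  wire [1:0] address,
                         input  wire       WE,
                         input  wire [3:0] data_in,
                         input  wire       Clock);
  reg [3:0] RW [0:3];

  always @ (posedge Clock)
    if (WE) RW[address] <= data_in;      // synchronous write
    else    data_out    <= RW[address];  // synchronous read
endmodule

module tb_rw_4x4_sync_demo;
  reg        Clock = 1'b0;
  reg  [1:0] address;
  reg        WE;
  reg  [3:0] data_in, expected;
  wire [3:0] data_out;
  integer    i, errors;

  rw_4x4_sync_demo dut (.data_out(data_out), .address(address), .WE(WE),
                        .data_in(data_in), .Clock(Clock));

  always #5 Clock = ~Clock;

  initial begin
    errors = 0; WE = 1'b0; data_in = 4'h0; address = 2'b00;

    // 1) Read every address before any write: expect X.
    for (i = 0; i < 4; i = i + 1) begin
      @(negedge Clock) address = i;
      @(posedge Clock);
      #1 if (data_out !== 4'bxxxx) errors = errors + 1;
    end

    // 2) Write a unique value (A, B, C, D) to each address.
    for (i = 0; i < 4; i = i + 1) begin
      @(negedge Clock);
      WE = 1'b1; address = i; data_in = 4'hA + i;
      @(posedge Clock);
    end

    // 3) Read every address back and compare.
    for (i = 0; i < 4; i = i + 1) begin
      @(negedge Clock);
      WE = 1'b0; address = i; expected = 4'hA + i;
      @(posedge Clock);
      #1 if (data_out !== expected) errors = errors + 1;
    end

    $display("mismatches: %0d", errors);
    $finish;
  end
endmodule
```

Unlike the asynchronous model, changes on address or data_in between clock edges have no effect here; only the values present at the rising edge matter.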
Figure 1.43: Functional simulation waveform of a 4x4 synchronous read/write memory.
In-class Question 4: Explain the advantage of modeling memory in Verilog without going into the
details of the storage cell operation.
A) It allows the details of the storage cell to be abstracted from the functional operation of the memory
system.
C) There are too many cells to model so the simulation would take too long.
1.5 Assignments
1. For a 512k x 32 memory system, how many unique address locations are there? Give the exact
number.
2. For a 512k x 32 memory system, what is the data width at each address location?
5. For a 512k x 32 memory system, how wide does the incoming address bus need to be in order to
access every unique address location?
6. Name the type of memory with the following characteristic: when power is removed, the data is
lost.
7. Name the type of memory with the following characteristic: when power is removed, the memory
still holds its information.
8. Name the type of memory with the following characteristic: it can only be read from during normal
operation.
9. Name the type of memory with the following characteristic: during normal operation, it can be
read and written to.
10. Name the type of memory with the following characteristic: data can be accessed from any address
location at any time.
11. Name the type of memory with the following characteristic: data can only be accessed in consecutive
order, thus not every location of memory is available instantaneously.
12. Name the type of memory with the following characteristic: this memory is non-volatile, read/write,
and only provides data access in blocks.
13. Name the type of memory with the following characteristic: this memory uses a floating-gate
transistor, can be erased with electricity, and provides individual bit access.
14. Name the type of memory with the following characteristic: this memory is non-volatile, read/write,
and provides word-level data access.
15. Name the type of memory with the following characteristic: this memory uses a floating-gate
transistor that is erased with UV light.
16. Name the type of memory with the following characteristic: this memory is programmed by blowing
fuses or anti-fuses.
17. Name the type of memory with the following characteristic: this memory is partially fabricated
prior to knowing the information to be stored.
20. Design a Verilog model for the SRAM system shown in Fig. 1.44. Your storage cell should be designed
such that its contents can be overwritten by the line driver. Consider using signal strengths
for this behavior (e.g., strong1 will overwrite a weak0). You will need to create a system for
the differential line driver with enable. This driver will need to contain a high impedance state
when disabled. Both your line driver (Din) and receiver (Dout) are differential. These systems
can be modeled using simple if-else statements. Create a test bench for your system that will write
a 0 to the cell, then read it back to verify the 0 was stored, and then repeat the write/read cycles for
a 1.
22. Why is a charge pump necessary on the word lines of a DRAM array?
24. For the DRAM storage cell shown in Fig. 1.45, solve for the final voltage on the digit line after
the access transistor (M1) closes if initially VS = VCC (i.e., the cell is storing a 1). In this system,
CS = 5 pF, CDL = 10 pF, and VCC = +3.4 V. Prior to the access transistor closing, the digit line is
pre-charged to VCC/2.
25. For the DRAM storage cell shown in Fig. 1.45, solve for the final voltage on the digit line after
the access transistor (M1) closes if initially VS = GND (i.e., the cell is storing a 0). In this system,
CS = 5 pF, CDL = 10 pF, and VCC = +3.4 V. Prior to the access transistor closing, the digit line is
pre-charged to VCC/2.
26. Design a Verilog model for the 16x8, asynchronous, read-only memory system shown in Fig. 1.46.
The system should contain the information provided in the memory map. Create a test bench to
simulate your model by reading from each of the 16 unique addresses and observing data_out
to verify it contains the information in the memory map.
27. Design a Verilog model for the 16x8, synchronous, read-only memory system shown in Fig. 1.47.
The system should contain the information provided in the memory map. Create a test bench to
simulate your model by reading from each of the 16 unique addresses and observing data_out
to verify it contains the information in the memory map.
28. Design a Verilog model for the 16x8, asynchronous, read/write memory system shown in Fig. 1.48.
Create a test bench to simulate your model. Your test bench should first read from all of the address
locations to verify they are uninitialized. Next, your test bench should write unique information to
each of the address locations. Finally, your test bench should read from each address location to
verify that the information that was written was stored and can be successfully retrieved.
29. Design a Verilog model for the 16x8, synchronous, read/write memory system shown in Fig. 1.49.
Create a test bench to simulate your model. Your test bench should first read from all of the address
locations to verify they are uninitialized. Next, your test bench should write unique information to
each of the address locations. Finally, your test bench should read from each address location to
verify that the information that was written was stored and can be successfully retrieved.