A semiconductor memory system is subject to errors. These can be categorized as hard failures and soft
errors. A hard failure is a permanent physical defect so that the memory cell or cells affected cannot
reliably store data but become stuck at 0 or 1 or switch erratically between 0 and 1. Hard failures can be
caused by harsh environmental abuse, manufacturing defects, and wear. A soft error is a random,
nondestructive event that alters the contents of one or more memory cells without damaging the
memory. Soft errors can be caused by power supply problems or alpha particles. These particles result
from radioactive decay and are distressingly common because radioactive nuclei are found in small
quantities in nearly all materials. Both hard and soft errors are clearly undesirable, and most modern
main memory systems include logic for both detecting and correcting errors.
Figure 5.7 illustrates in general terms how the process is carried out. When data are to be written into
memory, a calculation, depicted as a function f, is performed on the data to produce a code. Both the
code and the data are stored. Thus, if an M-bit word of data is to be stored and the code is of length K
bits, then the actual size of the stored word is M + K bits.
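
The write path described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say what f computes, so the sketch uses a single even-parity bit (K = 1) over an 8-bit word (M = 8), and the names f and write_word are hypothetical. A lone parity bit can detect a single-bit error but cannot correct it.

```python
M = 8  # data bits per word (assumed for illustration)
K = 1  # code bits per word (assumed: one even-parity bit)


def f(data: int) -> int:
    """Hypothetical code function: even parity over the M data bits."""
    data &= (1 << M) - 1
    return bin(data).count("1") % 2


def write_word(data: int) -> int:
    """Return the stored word: M data bits followed by K code bits."""
    data &= (1 << M) - 1
    return (data << K) | f(data)


stored = write_word(0b10110010)
print(f"{stored:0{M + K}b}")  # 101100100 (M + K = 9 bits)
```

The stored word occupies M + K bits regardless of which function f is chosen; only the number of code bits K and the error-handling ability of the code change.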
When the previously stored word is read out, the code is used to detect and possibly correct errors. A
new set of K code bits is generated from the M data bits and compared with the fetched code bits. The
comparison yields one of three results:
• No errors are detected. The fetched data bits are sent out.
• An error is detected, and it is possible to correct the error. The data bits plus error correction bits are
fed into a corrector, which produces a corrected set of M bits to be sent out.
• An error is detected, but it is not possible to correct it. This condition is reported.
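
The three outcomes can be illustrated with a small sketch. Everything specific here is an assumption chosen for the example, not something the text prescribes: a 4-bit data word (M = 4), four parity-style check bits (K = 4) arranged so that any single-bit error can be corrected and any double-bit error can be detected, and the hypothetical names f, read_word, and Result. The read path regenerates the code from the fetched data bits and compares it with the fetched code bits, as described above.

```python
from enum import Enum

M, K = 4, 4  # assumed sizes: 4 data bits, 4 code bits


class Result(Enum):
    NO_ERROR = "no errors detected"
    CORRECTED = "error detected and corrected"
    UNCORRECTABLE = "error detected, not correctable"


def f(data: int) -> int:
    """Hypothetical code function: each check bit is the XOR of three data bits."""
    d = [(data >> i) & 1 for i in range(M)]
    c = [d[0] ^ d[1] ^ d[3],
         d[0] ^ d[2] ^ d[3],
         d[1] ^ d[2] ^ d[3],
         d[0] ^ d[1] ^ d[2]]
    return sum(bit << i for i, bit in enumerate(c))


# f uses only XOR, so flipping data bit i changes the code by exactly f(1 << i).
DATA_ERROR = {f(1 << i): i for i in range(M)}
# A flipped code bit shows up as a comparison result with a single 1 bit.
CHECK_ERROR = {1 << j for j in range(K)}


def read_word(data: int, code: int):
    """Compare the regenerated code with the fetched code and classify the result."""
    syndrome = f(data) ^ code  # all zeros means the two codes match
    if syndrome == 0:
        return Result.NO_ERROR, data
    if syndrome in CHECK_ERROR:
        # Only a code bit was hit; the data bits are already correct.
        return Result.CORRECTED, data
    if syndrome in DATA_ERROR:
        # Corrector: flip the single data bit identified by the comparison.
        return Result.CORRECTED, data ^ (1 << DATA_ERROR[syndrome])
    return Result.UNCORRECTABLE, None


data = 0b1010
code = f(data)
print(read_word(data, code))           # no bits flipped
print(read_word(data ^ 0b0100, code))  # one data bit flipped
print(read_word(data ^ 0b0011, code))  # two data bits flipped
```

With this particular choice of f, the three calls exercise the three cases in order: no error, a corrected single-bit error, and a detected but uncorrectable double-bit error. Three or more flipped bits are beyond what this sketch handles and may be miscorrected or missed.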
Codes that operate in this fashion are referred to as error-correcting codes. A code is characterized by
the number of bit errors in a word that it can detect and the number that it can correct.