Computer Architecture and Assembly Language
The input data travels from the input unit to the ALU. Similarly, the computed data travels from the ALU to the output unit. Data constantly moves between the storage unit and the ALU, because stored data is computed on before being stored again. The control unit controls all the other units as well as the movement of their data.
Details about each of the computer units are as follows −
Input Unit
The input unit provides data to the computer system from the outside, so it basically links the external environment with the computer. It takes data from the input devices, converts it into a form the machine can process, and then loads it into the computer system. The keyboard and mouse are the most commonly used input devices.
Output Unit
The output unit provides the results of computer processing to the users, i.e., it links the computer with the external environment. Most output data is in the form of audio or video. Common output devices are monitors, printers, speakers, and headphones.
Storage Unit
The storage unit contains many computer components that are used to store data. It is traditionally divided into primary storage and secondary storage. Primary storage is also known as main memory and is the memory directly accessible by the CPU.
Secondary or external storage is not directly accessible by the CPU. The data from
secondary storage needs to be brought into the primary storage before the CPU can use it.
Secondary storage contains a large amount of data permanently.
Control Unit
This unit controls all the other units of the computer system and so is known as its central
nervous system. It transfers data throughout the computer as required, including from the storage unit to the central processing unit and vice versa. The control unit also dictates how the memory, input/output devices, arithmetic logic unit, etc. should behave.
Address, Data, and Control Buses
A computer system comprises a processor, memory, and I/O devices. I/O is used for interfacing with the external world, while memory is the processor's internal world. The processor is the core in this picture and is responsible for performing operations. The operation of a computer can be fairly described with processor and memory only; I/O will be discussed in a later part of the course. The whole working of the computer, then, is the processor performing operations on data that resides in memory. The scenario that the processor executes operations and the memory contains data elements requires a mechanism for the processor to read that data from the memory. "That data" in the previous sentence must be rigorously explained to the memory, which is a dumb device, just like a postman, who must be told the precise address on a letter to know where the destination is located. Another significant point is that if we only want to read the data and not write it, then there must be a mechanism to inform the memory that we are interested in reading data and not writing it. Key points in the above discussion are:
• There must be a mechanism to inform the memory that we want to do the read operation
• There must be a mechanism to inform the memory precisely which element we want to read
• There must be a mechanism to transfer that data element from memory to the processor
The group of bits that the processor uses to inform the
memory about which element to read or write is collectively known as the address bus. Another
important bus called the data bus is used to move the data from the memory to the processor in a
read operation and from the processor to the memory in a write operation. The third group
consists of miscellaneous independent lines used for control purposes. For example, one line of
the bus is used to inform the memory about whether to do the read operation or the write
operation. These lines are collectively known as the control bus. These three buses are the eyes, nose, and ears of the processor; it uses them in a synchronized manner to perform every meaningful operation. Although the programmer specifies the meaningful operation, to fulfill it the processor needs the collaboration of other units and peripherals, and that collaboration is made available through the three buses. This is the very basic description of a computer; it can be extended along the same lines to I/O, but we leave that out for simplicity for the moment.
The address bus is unidirectional; the address always travels from the processor to the memory. This is because memory is a dumb device and cannot predict which element the processor needs at a particular instant of time. Data moves both from processor to memory and from memory to processor, so the data bus is bidirectional. The control bus is special and relatively complex, because the different
lines comprising it behave differently. Some take information from the processor to a peripheral, and some take information from the peripheral to the processor. There can be certain events outside the processor that are of interest to it. To bring
information about these events the data bus cannot be used as it is owned by the processor and
will only be used when the processor grants permission to use it. Therefore certain processors provide control lines in the control bus to bring such information to the processor's notice. Knowing these signals in detail is unnecessary, but the general idea of the control bus must be grasped in full.
Processors read and interpret instructions stored in memory
The processor initiates a read bus cycle by floating the address of the memory location on the
address lines.
Once the address lines are stable, the processor asserts the address strobe signal on the bus.
The processor then sets the Read/Write signal to high, i.e. read.
Two registers support this process: an address register, which keeps track of where a given instruction or piece of data is stored in memory (each storage location in memory is identified by an address, just as each house on a street has an address), and a storage register, which temporarily holds data taken from, or about to be sent to, memory.
Bits & Bytes
A bit is a "binary digit" and represents the smallest unit of data measurement. Bits are mostly grouped into bytes, with each byte containing exactly 8 bits. A byte is often referred to as the smallest amount of data and is used to store data and execute instructions. As each bit is a binary digit that can be either 0 or 1, one byte can represent 256 different combinations of 0s and 1s. Therefore, adding more bytes and bits reveals more and more possibilities to represent different instructions.
Whether you talk about bits or bytes depends on the context. Bits are mostly used when referring to network and download speeds. You've probably seen your telecom provider offering an internet speed of, e.g., 100 Mb/s. This means that 100 megabits (1 megabit = approx. 1,000,000 bits) can be uploaded or downloaded per second. Bytes, on the other hand, are used when talking about memory storage. A USB stick with 1 gigabyte of storage holds approx. 1 billion bytes.
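As a quick illustration of both points, a small C program (the numbers are just the examples above) can compute the 256 combinations of one byte and convert an advertised link speed from megabits to megabytes:

    #include <stdio.h>

    int main(void) {
        /* One byte = 8 bits, so it can hold 2^8 = 256 distinct values. */
        printf("Combinations per byte: %d\n", 1 << 8);

        /* Network speeds are quoted in bits, storage in bytes:
           a 100 Mb/s link moves 100 / 8 = 12.5 megabytes per second. */
        double megabits_per_second = 100.0;
        printf("100 Mb/s = %.1f MB/s\n", megabits_per_second / 8.0);
        return 0;
    }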
Information = Bits + Context
Up until now, we have talked about the first part of the information: the bits. We know by now
that everything inside a machine goes back to long strings of 0 and 1, which are somehow
encoded to give them a meaning. But how does this work? It depends on the context. Given a
specific context, bits can be encoded respectively. The following two examples illustrate this.
Example 1: The clock at the train station in St. Gallen
Let’s look at it from the angle of “information”. What we see in the picture below are lights that
are turned on or off and some geometric figures. Without any context, we would not know that
this represents a time. But given this context, we can encode these lights turned on and off as
binary digits and encode them as the hour, minute, and second. Therefore, each geometric figure
that is turned on represents a 1 and each figure turned off a 0. Let's look at the picture below and
convert the first row. In binary, this yields 00111, which is 7 in decimal (4 + 2 + 1). The second row shows 010100, which is 20 in the decimal number system. The last row displays 101000, which represents 40. Therefore, given the context of a clock, we know that these strange lights on the wall of the train station actually represent the time 07:20:40.
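The same decoding can be written down in a few lines of C; the sketch below simply re-does the conversion of the three rows from the example:

    #include <stdio.h>

    /* Convert a row of lights ('1' = on, '0' = off) to a decimal value. */
    int row_to_decimal(const char *row) {
        int value = 0;
        while (*row)
            value = value * 2 + (*row++ - '0');
        return value;
    }

    int main(void) {
        printf("%02d:%02d:%02d\n",
               row_to_decimal("00111"),    /* hours   ->  7 */
               row_to_decimal("010100"),   /* minutes -> 20 */
               row_to_decimal("101000"));  /* seconds -> 40 */
        return 0;                          /* prints 07:20:40 */
    }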
Translators, compilers, interpreters, and assemblers all work in some way towards getting a high-level programming language translated into machine code that the central processing unit (CPU) can understand. Examples of CPUs include those made by Intel (e.g., x86), AMD (e.g., Athlon APU), NXP (e.g., PowerPC), and many others. It's important to note that translators, compilers, interpreters, and assemblers are all programs themselves.
Translators
The most general term for a software code converting tool is “translator.” A translator, in
software programming terms, is a generic term that could refer to a compiler, assembler, or
interpreter; anything that converts higher-level code into another high-level language (e.g., Basic, C++, Fortran, Java) or a lower-level language (i.e., a language that the processor can understand), such as
assembly language or machine code. If you don’t know what the tool actually does other than
that it accomplishes some level of code conversion to a specific target language, then you can
safely call it a translator.
Compilers
Compilers convert high-level language code to machine (object) code in one session. Compilers
can take a while, because they have to translate high-level code to lower-level machine language
all at once and then save the executable object code to memory. A compiler creates machine
code that runs on a processor with a specific Instruction Set Architecture (ISA), which is
processor-dependent. For example, you cannot compile code for an x86 and run it on a MIPS
architecture without a special compiler. Compilers are also platform-dependent. That is, a
compiler can convert C++, for example, to machine code that’s targeted at a platform that is
running the Linux OS. A cross-compiler, however, can generate code for a platform other than
the one it runs on itself.
Interpreters
Another way to get code to run on your processor is to use an interpreter, which is not the same
as a compiler. An interpreter translates code like a compiler but reads the code and immediately executes it, and is therefore initially faster than a compiler. Thus, interpreters are often used in software development tools as debugging tools, as they can execute a single line of code at a time. Compilers translate code all at once and the processor then executes the machine
language that the compiler produced. If changes are made to the code after compilation, the
changed code will need to be compiled and added to the compiled code (or perhaps the entire
program will need to be re-compiled.) But an interpreter, although skipping the step of
compilation of the entire program to start, is much slower to execute than the same program
that’s been completely compiled.
Assemblers
An assembler translates a program written in assembly language into machine language and is
effectively a compiler for the assembly language, but can also be used interactively like an
interpreter. Assembly language is a low-level programming language. Low-level programming
languages are less like human language in that they are more difficult to understand at a glance;
you have to study assembly code carefully in order to follow the intent of execution and in most
cases, assembly code has many more lines of code to represent the same functions being
executed as a higher-level language. An assembler converts assembly language code into
machine code (also known as object code), an even lower-level language that the processor can
directly understand.
What is Memory Hierarchy?
The memory in a computer can be divided into five hierarchies based on speed as well as use. The processor can move from one level to another based on its requirements. The five hierarchies in the memory are registers, cache, main memory, magnetic discs, and magnetic tapes. The first three are volatile memories, which means they lose their stored data when there is no power, whereas the last two are non-volatile, which means they store data permanently.
A memory element is a set of storage devices that stores binary data in the form of bits. In general, memory storage can be classified into two categories: volatile and non-volatile.
An operating system is the most important software that runs on a computer. It manages the
computer's memory and processes, as well as all of its software and hardware. It also allows you
to communicate with the computer without knowing how to speak the computer's language.
Without an operating system, a computer is useless. Your computer's operating system (OS)
manages all of the software and hardware on the computer. Most of the time, there are several
different computer programs running at the same time, and they all need to access your
computer's central processing unit (CPU), memory, and storage. The operating system
coordinates all of this to make sure each program gets what it needs.
Data Representation
Data refers to the symbols that represent people, events, things, and ideas. Data can be a name, a
number, the colors in a photograph, or the notes in a musical composition. Data Representation
refers to the form in which data is stored, processed, and transmitted. Devices such as
smartphones, iPods, and computers store data in digital formats that can be handled by electronic
circuitry.
Digitization is the process of converting information, such as text, numbers, photos, or music,
into digital data that can be manipulated by electronic devices. The Digital Revolution has
evolved through four phases, beginning with big, expensive, standalone computers, and
progressing to today’s digital world in which small, inexpensive digital devices are everywhere.
The 0s and 1s used to represent digital data are referred to as binary digits; from this term we get the word bit, which stands for binary digit. A bit is a 0 or 1 used in the digital representation of data. A digital file, usually referred to simply as a file, is a named collection of data that exists on a storage medium, such as a hard disk, CD, DVD, or flash drive.
Representing Numbers
Numeric data consists of numbers that can be used in arithmetic operations. Digital devices
represent numeric data using the binary number system, also called base 2. The binary number
system only has two digits: 0 and 1. No numeral like 2 exists in the system, so the number “two”
is represented in binary as 10 (pronounced “one zero”).
Representing Text
Character data is composed of letters, symbols, and numerals that are not used in calculations.
Examples of character data include your name, address, and hair color. Character data is
commonly referred to as “text.”
Integer Representation
Representing integer numbers refers to how the computer stores or represents a number in
memory. The computer represents numbers in binary (1's and 0's). However, the computer has a
limited amount of space that can be used for each number or variable. This directly impacts the
size, or range, of the number that can be represented. For example, a byte (8 bits) can be used to represent 2^8 = 256 different numbers. Those 256 different numbers can be unsigned (all positive), in which case we can represent any number between 0 and 255 (inclusive). If we
choose signed (positive and negative values), then we can represent any number between -128
and +127 (inclusive).
If that range is not large enough to handle the intended values, a larger size must be used. For
example, a word (16 bits) can be used to represent 2^16 = 65,536 different values, and a double-word (32 bits) can be used to represent 2^32 = 4,294,967,296 different numbers. So, if you wanted to store a value of 100,000 then a double-word would be required.
As you may recall from C, C++, or Java, an integer declaration (e.g., int <variable>) is a single double-word that can be used to represent values between -2^31 (-2,147,483,648) and +2^31 - 1 (+2,147,483,647).
The following table shows the ranges associated with typical sizes:

Size         Bits  Unsigned Range        Signed Range
Byte         8     0 to 255              -128 to +127
Word         16    0 to 65,535           -32,768 to +32,767
Double-word  32    0 to 4,294,967,295    -2,147,483,648 to +2,147,483,647
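On platforms where int is 32 bits wide (the usual case the table above assumes), the C header limits.h exposes exactly these ranges, as this short sketch shows:

    #include <stdio.h>
    #include <limits.h>

    int main(void) {
        printf("unsigned char : 0 to %u\n", (unsigned)UCHAR_MAX);   /* byte        */
        printf("signed char   : %d to %d\n", SCHAR_MIN, SCHAR_MAX);
        printf("unsigned short: 0 to %u\n", (unsigned)USHRT_MAX);   /* word        */
        printf("short         : %d to %d\n", SHRT_MIN, SHRT_MAX);
        printf("unsigned int  : 0 to %u\n", UINT_MAX);              /* double-word */
        printf("int           : %d to %d\n", INT_MIN, INT_MAX);
        return 0;
    }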
8086
(1978, 29 K transistors). One of the first single-chip, 16-bit microprocessors. The 8088, a variant of the 8086 with an 8-
bit external bus, formed the heart of the original IBM personal computers. IBM contracted with
then-tiny Microsoft to develop the MS-DOS operating system. The original models came with
32,768 bytes of memory and two floppy drives (no hard drive). Architecturally, the machines
were limited to a 655,360-byte address space—addresses were only 20 bits long (1,048,576 bytes
addressable), and the operating system reserved 393,216 bytes for its use. In 1980, Intel
introduced the 8087 floating-point coprocessor (45 K transistors) to operate alongside an 8086 or
8088 processor, executing the floating-point instructions. The 8087 established the floating-point
model for the x86 line, often referred to as “x87.”
80286
(1982, 134 K transistors). Added more (and now obsolete) addressing modes. Formed the basis
of the IBM PC-AT personal computer, the original platform for MS Windows.
i386
(1985, 275 K transistors). Expanded the architecture to 32 bits. Added the flat addressing model
used by Linux and recent versions of the Windows family of operating systems. This was the
first machine in the series that could support a Unix operating system.
i486
(1989, 1.2 M transistors). Improved performance and integrated the floating-point unit onto the
processor chip but did not significantly change the instruction set.
Pentium
(1993, 3.1 M transistors). Improved performance, but only added minor extensions to the
instruction set.
PentiumPro
(1995, 5.5 M transistors). Introduced a radically new processor design, internally known as the
P6 microarchitecture. Added a class of “conditional move” instructions to the instruction set.
Pentium II
(1997, 7 M transistors). Continuation of the P6 microarchitecture.
Pentium III
(1999, 8.2 M transistors). Introduced SSE, a class of instructions for manipulating vectors of
integer or floating-point data. Each datum can be 1, 2, or 4 bytes, packed into vectors of 128 bits.
Later versions of this chip went up to 24 M transistors, due to the incorporation of the level-2
cache on the chip.
Pentium 4
(2000, 42 M transistors). Extended SSE to SSE2, adding new data types (including double-
precision floating point), along with 144 new instructions for these formats. With these
extensions, compilers can use SSE instructions, rather than x87 instructions, to compile floating-
point code. Introduced the NetBurst microarchitecture, which could operate at very high clock
speeds, but at the cost of high power consumption.
Pentium 4E
(2004, 125 M transistors). Added hyper-threading, a method to run two programs simultaneously
on a single processor, as well as EM64T, Intel’s implementation of a 64-bit extension to IA32
developed by Advanced Micro Devices (AMD), which we refer to as x86-64.
Core 2
(2006, 291 M transistors). Returned to a microarchitecture similar to P6. First multi-core Intel
microprocessor, where multiple processors are implemented on a single chip. Did not support
hyperthreading.
Core i7
(2008, 781 M transistors). Incorporated both hyperthreading and multi-core, with the initial
version supporting two executing programs on each core and up to four cores on each chip.
Moore's Law
Moore's Law refers to Gordon Moore's observation that the number of transistors on a microchip doubles about every two years, while the cost of computers is halved. Moore's Law implies that we can expect the speed and capability of our computers to increase every couple of years while we pay less for them. Another tenet of Moore's Law asserts that this growth is exponential.
ASCII
ASCII, an abbreviation of the American Standard Code for Information Interchange, is a
standard data-transmission code that is used by smaller and less powerful computers to represent
both textual data (letters, numbers, and punctuation marks) and noninput-device commands
(control characters). Like other coding systems, it converts information into standardized digital
formats that allow computers to communicate with each other and to efficiently process and store
data. The ASCII code was originally developed for teletypewriters but eventually found wide
application in personal computers. The standard ASCII code uses seven-digit binary numbers;
i.e., numbers consisting of various sequences of 0’s and 1’s. The code can represent 128 different
characters since there are 128 different possible combinations of seven 0’s and 1’s. The binary
sequence 1010000, for example, represents an uppercase “P,” while the sequence 1110000
represents a lowercase “p.”
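These two codes are easy to verify with a short C program that prints a character's ASCII value in decimal and as a 7-bit binary number:

    #include <stdio.h>

    /* Print a character's ASCII code in decimal and in 7-bit binary. */
    static void show(char c) {
        printf("'%c' = %3d = ", c, c);
        for (int bit = 6; bit >= 0; bit--)
            putchar(((c >> bit) & 1) ? '1' : '0');
        putchar('\n');
    }

    int main(void) {
        show('P');   /* 80  = 1010000 */
        show('p');   /* 112 = 1110000 */
        return 0;
    }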
Digital computers use a binary code that is arranged in groups of eight rather than of seven
digits, or bits. Each such eight-digit group is called a byte. Because digital computers use eight-
bit bytes, the ASCII code is commonly embedded in an eight-bit field consisting of the seven
information bits and a parity bit that is used for error-checking purposes or to represent special
symbols. The use of an eight-bit system increased the number of characters the code could
represent to 256. The eight-bit system, which is known as the extended ASCII code, was
introduced in 1981 by the International Business Machines Corporation (IBM) for use with its
first model of personal computer. This extended ASCII code soon became the industry-wide
standard for personal computers. In it, 32 code combinations are used for machine and control
commands, such as “start of text,” “carriage return,” and “form feed.” The next group of 32
combinations is used for numbers and various punctuation symbols. Another group of 32
combinations is used for uppercase letters and a few other punctuation marks, and the last 32 are
used for lowercase letters.
ASCII control characters (character code 0-31)
The first 32 characters in the ASCII table are unprintable control codes and are used to control
peripherals such as printers.
DEC OCT HEX BIN Symbol Description
0 000 00 00000000 NUL Null char
1 001 01 00000001 SOH Start of Heading
2 002 02 00000010 STX Start of Text
3 003 03 00000011 ETX End of Text
4 004 04 00000100 EOT End of Transmission
5 005 05 00000101 ENQ Enquiry
6 006 06 00000110 ACK Acknowledgment
7 007 07 00000111 BEL Bell
8 010 08 00001000 BS Back Space
9 011 09 00001001 HT Horizontal Tab
10 012 0A 00001010 LF Line Feed
11 013 0B 00001011 VT Vertical Tab
12 014 0C 00001100 FF Form Feed
13 015 0D 00001101 CR Carriage Return
14 016 0E 00001110 SO Shift Out / X-On
15 017 0F 00001111 SI Shift In / X-Off
16 020 10 00010000 DLE Data Line Escape
17 021 11 00010001 DC1 Device Control 1 (oft. XON)
18 022 12 00010010 DC2 Device Control 2
19 023 13 00010011 DC3 Device Control 3 (oft. XOFF)
20 024 14 00010100 DC4 Device Control 4
21 025 15 00010101 NAK Negative Acknowledgement
22 026 16 00010110 SYN Synchronous Idle
23 027 17 00010111 ETB End of Transmit Block
24 030 18 00011000 CAN Cancel
25 031 19 00011001 EM End of Medium
26 032 1A 00011010 SUB Substitute
27 033 1B 00011011 ESC Escape
28 034 1C 00011100 FS File Separator
DEC OCT HEX BIN Symbol Description
29 035 1D 00011101 GS Group Separator
30 036 1E 00011110 RS Record Separator
31 037 1F 00011111 US Unit Separator
Codes 32-127 are common to all the different variations of the ASCII table; they are called printable characters and represent letters, digits, punctuation marks, and a few miscellaneous symbols. You will find almost every character on your keyboard. Character 127 represents the command DEL.
DEC OCT HEX BIN Symbol Description
32 040 20 00100000   Space
33 041 21 00100001 ! Exclamation mark
34 042 22 00100010 " Double quotes (or speech marks)
35 043 23 00100011 # Number
36 044 24 00100100 $ Dollar
37 045 25 00100101 % Per cent sign
38 046 26 00100110 & Ampersand
39 047 27 00100111 ' Single quote
40 050 28 00101000 ( Open parenthesis (or open bracket)
41 051 29 00101001 ) Close parenthesis (or close bracket)
42 052 2A 00101010 * Asterisk
43 053 2B 00101011 + Plus
44 054 2C 00101100 , Comma
45 055 2D 00101101 - Hyphen
46 056 2E 00101110 . Period, dot, or full stop
47 057 2F 00101111 / Slash or divide
48 060 30 00110000 0 Zero
49 061 31 00110001 1 One
50 062 32 00110010 2 Two
51 063 33 00110011 3 Three
52 064 34 00110100 4 Four
53 065 35 00110101 5 Five
54 066 36 00110110 6 Six
55 067 37 00110111 7 Seven
56 070 38 00111000 8 Eight
DEC OCT HEX BIN Symbol Description
57 071 39 00111001 9 Nine
58 072 3A 00111010 : Colon
59 073 3B 00111011 ; Semicolon
60 074 3C 00111100 < Less than (or open angled bracket)
61 075 3D 00111101 = Equals
62 076 3E 00111110 > Greater than (or close angled bracket)
63 077 3F 00111111 ? Question mark
64 100 40 01000000 @ At symbol
65 101 41 01000001 A Uppercase A
66 102 42 01000010 B Uppercase B
67 103 43 01000011 C Uppercase C
68 104 44 01000100 D Uppercase D
69 105 45 01000101 E Uppercase E
70 106 46 01000110 F Uppercase F
71 107 47 01000111 G Uppercase G
72 110 48 01001000 H Uppercase H
73 111 49 01001001 I Uppercase I
74 112 4A 01001010 J Uppercase J
75 113 4B 01001011 K Uppercase K
76 114 4C 01001100 L Uppercase L
77 115 4D 01001101 M Uppercase M
78 116 4E 01001110 N Uppercase N
79 117 4F 01001111 O Uppercase O
80 120 50 01010000 P Uppercase P
81 121 51 01010001 Q Uppercase Q
82 122 52 01010010 R Uppercase R
83 123 53 01010011 S Uppercase S
84 124 54 01010100 T Uppercase T
85 125 55 01010101 U Uppercase U
86 126 56 01010110 V Uppercase V
87 127 57 01010111 W Uppercase W
88 130 58 01011000 X Uppercase X
89 131 59 01011001 Y Uppercase Y
90 132 5A 01011010 Z Uppercase Z
91 133 5B 01011011 [ Opening bracket
92 134 5C 01011100 \ Backslash
93 135 5D 01011101 ] Closing bracket
DEC OCT HEX BIN Symbol Description
94 136 5E 01011110 ^ Caret - circumflex
95 137 5F 01011111 _ Underscore
96 140 60 01100000 ` Grave accent
97 141 61 01100001 a Lowercase a
98 142 62 01100010 b Lowercase b
99 143 63 01100011 c Lowercase c
100 144 64 01100100 d Lowercase d
101 145 65 01100101 e Lowercase e
102 146 66 01100110 f Lowercase f
103 147 67 01100111 g Lowercase g
104 150 68 01101000 h Lowercase h
105 151 69 01101001 i Lowercase i
106 152 6A 01101010 j Lowercase j
107 153 6B 01101011 k Lowercase k
108 154 6C 01101100 l Lowercase l
109 155 6D 01101101 m Lowercase m
110 156 6E 01101110 n Lowercase n
111 157 6F 01101111 o Lowercase o
112 160 70 01110000 p Lowercase p
113 161 71 01110001 q Lowercase q
114 162 72 01110010 r Lowercase r
115 163 73 01110011 s Lowercase s
116 164 74 01110100 t Lowercase t
117 165 75 01110101 u Lowercase u
118 166 76 01110110 v Lowercase v
119 167 77 01110111 w Lowercase w
120 170 78 01111000 x Lowercase x
121 171 79 01111001 y Lowercase y
122 172 7A 01111010 z Lowercase z
123 173 7B 01111011 { Opening brace
124 174 7C 01111100 | Vertical bar
125 175 7D 01111101 } Closing brace
126 176 7E 01111110 ~ Equivalency sign - tilde
127 177 7F 01111111 DEL Delete
Unicode
Unicode is a universal character encoding standard. It defines the way individual characters are
represented in text files, web pages, and other types of documents.
Unlike ASCII, which was designed to represent only Basic English characters, Unicode was
designed to support characters from all languages around the world. The standard ASCII
character set only supports 128 characters, while Unicode can support roughly 1,000,000
characters. While ASCII only uses one byte to represent each character, Unicode supports up to
4 bytes for each character.
There are several different types of Unicode encodings, though UTF-8 and UTF-16 are the most
common. UTF-8 has become the standard character encoding used on the Web and is also the
default encoding used by many software programs. While UTF-8 supports up to four bytes per
character, it would be inefficient to use four bytes to represent frequently used characters.
Therefore, UTF-8 uses only one byte to represent common English characters. European (Latin),
Hebrew, and Arabic characters are represented with two bytes, while three bytes are used for
Chinese, Japanese, Korean, and other Asian characters. Additional Unicode characters can be
represented with four bytes.
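These byte counts can be observed directly. The following C sketch (assuming the source file itself is saved as UTF-8, so the string literals hold UTF-8 bytes) prints how many bytes each character occupies:

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        /* strlen counts bytes, not characters, which is exactly
           what we want to see here. */
        printf("A  -> %zu byte(s)\n", strlen("A"));    /* 1: common English letter */
        printf("é  -> %zu byte(s)\n", strlen("é"));    /* 2: European (Latin) */
        printf("中 -> %zu byte(s)\n", strlen("中"));   /* 3: Chinese character */
        printf("😀 -> %zu byte(s)\n", strlen("😀"));   /* 4: additional Unicode character */
        return 0;
    }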
Encoding and Decoding
Encoding is the process of putting a sequence of characters (letters, numbers, punctuation, and
certain symbols) into a specialized format for efficient transmission or storage. Decoding is the
opposite process -- the conversion of an encoded format back into the original sequence of
characters.
These terms should not be confused with encryption and decryption, which focus on hiding and
securing data. (We can encrypt data without changing the code or encode data without
deliberately concealing the content.)
Assembly Language
In computer programming, assembly language (or assembler language), sometimes abbreviated
asm, is any low-level programming language in which there is a very strong correspondence
between the instructions in the language and the architecture's machine code instructions.
Assembly language usually has one statement per machine instruction, but constants, comments,
assembler directives, and symbolic labels of, e.g., memory locations, registers, and macros are
generally also supported.
Assembly code is converted into executable machine code by a utility program referred to as an
assembler. The term "assembler" is generally attributed to Wilkes, Wheeler, and Gill in their
1951 book The Preparation of Programs for an Electronic Digital Computer, who, however, used
the term to mean "a program that assembles another program consisting of several sections into a
single program". The conversion process is referred to as assembly, as in assembling the source
code. The computational step when an assembler is processing a program is called assembly
time. Assembly language may also be called symbolic machine code.
Because assembly depends on the machine code instructions, each assembly language is specific
to a particular computer architecture.
Sometimes there is more than one assembler for the same architecture, and sometimes an
assembler is specific to an operating system or particular operating systems.
Any given instruction is encoded in some format. Formats are listed in the yellow card. The first
two formats we see in the course are RR and RX. More formats will be seen later.
Each instruction has an opcode (1 byte long) identifying it. This is always the first byte in the
encoded instruction.
Notice that when we write assembly-language instructions, we normally write numbers (such as
displacements) in base 10, but when these are encoded, we end up with everything in base 16.
RR format: (instructions involving 2 registers)
8 bits of opcode
4 bits to identify the first register
4 bits to identify the second register
RX format: (instructions involving a register R and a D(X, B) address)
8 bits of opcode
4 bits to identify the register R
4 bits to identify the index register, X
4 bits to identify the base register B
12 bits giving the value of the displacement D
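Packing those fields is plain bit shifting. A small C sketch shows the idea (0x5A is the opcode of the RX-format Add instruction on the S/360/370; the register and displacement values are made-up examples), including how a displacement written in base 10 ends up as base-16 digits in the encoding:

    #include <stdio.h>
    #include <stdint.h>

    /* Pack an RX-format instruction: 8-bit opcode, 4-bit R, 4-bit X,
       4-bit B, 12-bit displacement D. */
    uint32_t encode_rx(uint8_t opcode, unsigned r, unsigned x,
                       unsigned b, unsigned d) {
        return ((uint32_t)opcode << 24) |
               ((r & 0xF) << 20) |
               ((x & 0xF) << 16) |
               ((b & 0xF) << 12) |
               (d & 0xFFF);
    }

    int main(void) {
        /* Opcode 0x5A (Add), R=3, X=0, B=12, displacement 100 decimal. */
        printf("%08X\n", encode_rx(0x5A, 3, 0, 12, 100));  /* prints 5A30C064 */
        return 0;
    }

Note how the displacement written as 100 in base 10 appears as 064 in the base-16 encoding, exactly as described above.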
Arithmetic instructions in 8086 microprocessor
Arithmetic Instructions are the instructions that perform basic arithmetic operations such as
addition, subtraction, and a few more. Unlike in the 8085 microprocessor, in the 8086
microprocessor, the destination operand need not be the accumulator.
Following is the table showing the list of arithmetic instructions:

Instruction   Operation                            Example
ADD D, S      D = D + S                            ADD AX, [2050]
SBB D, S      D = D - S - prev. carry (borrow)     SBB [2050], 0050
IMUL reg      signed multiplication (8- or 16-bit register)   IMUL CX
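The "prev. carry" (borrow) used by SBB is what lets the processor subtract values wider than 16 bits. The following C sketch mimics that idiom with made-up values: subtract the low words first (like SUB, which sets the borrow), then subtract the high words together with the borrow (like SBB):

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t a = 0x00030000, b = 0x00000001;

        /* Low 16 bits: like SUB, which sets the carry (borrow) flag. */
        uint16_t lo = (uint16_t)((a & 0xFFFF) - (b & 0xFFFF));
        unsigned borrow = (a & 0xFFFF) < (b & 0xFFFF);

        /* High 16 bits: like SBB, which also subtracts the previous borrow. */
        uint16_t hi = (uint16_t)((a >> 16) - (b >> 16) - borrow);

        uint32_t result = ((uint32_t)hi << 16) | lo;
        printf("%08X\n", result);   /* prints 0002FFFF */
        return 0;
    }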
Processor Architecture
The word "architecture" typically refers to building design and construction. In the computing
world, "architecture" also refers to design, but instead of buildings, it describes the design of
computer systems. Computer architecture is a broad topic that includes everything from the
relationship between multiple computers (such as a "client-server" model) to specific
components inside a computer.
The most important type of hardware design is a computer's processor architecture. The design
of the processor determines what software can run on the computer and what other hardware
components are supported. For example, Intel's x86 processor architecture is the standard
architecture used by most PCs. By using this design, computer manufacturers can create
machines that include different hardware components but run the same software. Several years
ago, Apple switched from the PowerPC architecture to the x86 architecture to make the
Macintosh platform more compatible with Windows PCs.
The architecture of the motherboard is also important in determining what hardware and
software a computer system will support. The motherboard design is often called the "chipset"
and defines what processor models and other components will work with the motherboard. For
example, while two motherboards may both support x86 processors, one may only work with
newer processor models. A newer chipset may also require faster RAM and a different type of
video card than an older model.
A processor is made of transistors. The transistors are arranged in a sort of hardware-based
computer program that is designed to accept inputs and process them into outputs. The inputs are
machine code. Various tools make the task of producing machine code more user-friendly
(assemblers and compilers).
A certain class of CPUs is designed to process certain machine code - the x86 class. There are
many other hardware instruction sets like Itanium NEXT and Motorola 68xxx. They're not
compatible in the sense that a 68000 CPU can't understand x86 machine code.
Other architecture descriptions categorize how data is moved around inside the CPU. This includes things like pre-fetch queues, parallel execution paths, stack operations, and caching. As
CPUs got faster, engineers had to design better ways to have them always have work available to
do and not simply sit around waiting. Different chips (even from the same maker) use different
techniques. It's like having TVs from five different makers. They all show the same picture, but
the circuit boards are very different.
Finally, architecture can also be used to describe the manufacturing process: 10 nm architecture versus 15 nm architecture, for example. As CPU speeds increase, the time it takes an electrical
signal to propagate across the face of the silicon, for example, becomes significant. You have to
design chips to account for that time delay. If two parts of a calculation propagate at different
speeds (due to more or fewer gates being involved), your architecture also has to account for
that. Making things smaller means less delay, but also more heat.
If you think about how different system boards are - and why - then you'll have some idea of
how the inside of a CPU (really, a collection of different functions tied together with glue logic)
can be the same way.
Von Neumann (or stored-program computer) architecture
The Von Neumann architecture was one of the earliest architectures. At the time of its invention, computer programs were very small and simple, and memory cost was very high. Under the Von Neumann architecture, the program and data are stored in the same memory and are accessed over the same bus. Each instruction is fetched (read from the memory), decoded, and executed. During the decode stage, any operands (if needed) are fetched from the same memory. Von Neumann computers are also called stored-program computers because instructions (or programs) are stored in a ROM (Read-Only Memory), which cannot be changed during run-time.
Harvard architecture
Harvard architecture is a modification of the Von Neumann architecture. In Harvard architecture, separate data paths (address and data buses) exist to access code (program) and data (data operands). This makes it possible to fetch instructions and data at the same time (on different buses). Since instructions have a separate data path, the next instruction can be fetched while decoding and executing the current instruction.
Harvard Architecture Derivatives
Some derivatives of Harvard architecture (e.g. modified Harvard and Super Harvard) have
multiple data paths for data access - such architectures are more suited for data-intensive
applications (such as digital signal processing) which require multiple data operands for each
instruction execution. Since these data operands can be fetched in parallel, a significant
performance improvement is achieved.
CISC (Complex Instruction Set Computer)
In the earlier days of computers, people coded their applications in Machine code or Assembly
code. There was no concept of high (or middle) level language. Writing codes in machine
language was a tedious process. To make programming easier and faster, computers supported a
large number of instructions. These instructions could do complex operations - a single
instruction could fetch one or more operands and do one or more operations on those operands.
This made programming much easier as the programmer had to write less code (fewer instructions) to achieve a given task. Another favorable factor for the advance of complex
instruction sets was the memory cost. Since the memories were very costly, designers wanted a
dense instruction set (to reduce the memory requirements).
RISC (Reduced Instruction Set Computer)
Most complex instructions in CISC processors take many processor cycles to execute. In a
pipelined processor, the overall speed of the processor depends on the slowest operation being
performed. This means that the relatively complex instructions even slow down the execution of
simpler instructions. Thus complex instructions were a major performance bottleneck.
With the advent of compiler technology, programmers started using High (and middle) level
languages. It was the compiler's task to translate the high-level code to the assembly (or machine)
language code. Compilers generally use a combination of simple instructions to achieve complex
operations. It was observed that breaking the complex operations into a combination of simple
operations was much more efficient (took fewer processor cycles to execute) than doing the same
operations using a single complex instruction. Hence most of the complex instructions were not
being used by compiler-generated programs. Most of the addressing modes (offered by CISC)
were also not being used by compilers. This led to a shift in processor design philosophy.
Processor designers started focusing on reducing the size and complexity of instruction sets
(since most complex instructions were not being used) and making a small and simple instruction
set (which could be used by compilers) - this could help in two ways. First, simpler instructions
could speed up the pipeline and thus provide a performance improvement. Second, a simple
instruction set implies less computer hardware and thus reduced cost. Therefore the design goal
was to provide basic simple instructions, that could execute faster. Compilers could use these
instructions to construct complex operations.
Another interesting trend in this era (the early eighties) was a sharp increase in the speed of
processors (whereas the memory speeds remained comparatively low). This meant that memory
accesses were becoming a bottleneck. This led to the design of processors with a large number of
internal registers (which could be used for temporary storage rather than depending on external
and slower memory) and cache memories.
DSPs
Digital Signal Processors are special-purpose processors with processing units and instruction
sets tailored to suit the Signal Processing Applications. MAC (Multiply and Accumulate) and
Shifter (Arithmetic and Logical shift) units are added to the DSP cores since Signal Processing
Algorithms heavily depend on such operations. Circular Buffers, Bit Reversal Addressing,
Hardware Loops, and DAGs (Data Address Generators) are some other common features of a
DSP Architecture. Since Signal Processing Applications are data-intensive, the data I/O
bandwidth of these processors is designed to be high. In modern days, a lot of embedded systems
run signal-processing applications (cell phones, portable media players, etc).
VLIW architecture
"Very Long Instruction Word" architecture consists of multiple ALUs in parallel. These architectures have been designed to exploit the "Instruction Level Parallelism" in an application.
Programmers can break their code such that each ALU can be loaded in parallel. The operation
to be done on each ALU (in a given cycle) forms the instruction word (for that cycle). It is
completely up to the programmer to take care of partitioning the application across the different ALUs. Also, it is the programmer's burden (and the compiler's burden if the code is being written in a high-level language) to make sure that there is no interdependency between the instructions which are part of the "instruction word". The processor does not have any hardware to
ascertain (and reschedule) the order of instructions (this is called static scheduling).
VLIW vs superscalar
Superscalar architectures are similar to VLIW architectures in the sense that they have multiple
ALUs. However, superscalar processors employ dynamic scheduling of instructions. These
architectures have special hardware to determine the interdependency of the instruction and
schedule their execution on different ALUs. The multiple ALUs are hidden from the
programmer. Programmers write (and compilers generate) their codes as if only one ALU is
available. The processor's internal hardware reschedules the instructions to exploit the ILP.
Programming for Super Scalar architectures in this way is much simpler than that on VLIW
architectures. However, it comes at an added cost of Hardware Complexity. The additional
hardware required for dynamic scheduling adds to both the cost and power consumption.
SIMD
SIMD stands for “Single Instruction Multiple Data”. SIMD architectures have multiple ALUs.
However, in a single processor cycle, the same instruction (operation) needs to be executed on
all the ALUs. The data inputs to the ALU can be different. The SIMD processor thus executes
Same (Single) Instruction on different (Multiple) Data Inputs, in a given cycle. SIMD processors
exploit the “Data Level Parallelism” of an application.
Multi-core architectures
A recent trend in processor design is Multi-core architectures. This architecture contains multiple
CPU cores (not multiple ALUs) on a single chip. This processor can exploit the “Thread-level
parallelism” of an application.
X86 Architecture
The x86 architecture is an instruction set architecture (ISA) series for computer processors.
Developed by Intel Corporation, x86 architecture defines how a processor handles and executes
different instructions passed from the operating system (OS) and software programs.
The “x” in x86 denotes the ISA version.
Designed in 1978, x86 architecture was one of the first ISAs for microprocessor-based
computing. Key features include:
Provides a logical framework for executing instructions through a processor
Allows software programs and instructions to run on any processor in the Intel 8086 family
Provides procedures for utilizing and managing the hardware components of a central processing
unit (CPU)
The x86 architecture primarily handles programmatic functions and provides services, such as
memory addressing, software and hardware interrupt handling, data type, registers, and
input/output (I/O) management.
An ISA is an abstract model of a computer that is also referred to as computer architecture. It is the part of a computer that pertains to programming and specifies the behavior of machine code. The instruction set is the language that a computer's brain is designed to understand; it provides commands to the computer processor and tells it what to do.
The x86 is based on the Intel 8086 microprocessor and its 8088 variant. It started as a 16-bit instruction set for 16-bit processors, and many additions and extensions have been added over the years, growing it into a 32-bit instruction set with almost full backward compatibility.
The bit in both 32-bit and 16-bit is shorthand for the size of a number. For example, a 32-bit number contains 32 bits, binary digits that are each either 0 or 1. A 32-bit number looks something like this: 10101010101010101010101010101010.
Today, the term x86 is used generally to refer to any 32-bit processor compatible with the x86
instruction set. An x86 microprocessor is capable of running almost any type of computer, from laptops, servers, desktops, and notebooks to supercomputers.
What is x64?
Similar to the x86, the x64 is also a family of instruction set architectures (ISA) for computer
processors. However, x64 refers to a 64-bit CPU and operating system instead of the 32-bit
system which the x86 stands for.
When the processor was first created, it was called the 8086. The 8086 was well designed and popular; at first it understood 16-bit machine language. It was later improved, and the 8086 instruction set was expanded to a 32-bit machine language. As Intel improved the architecture, it kept 86 at the end of the model number (80186, 80286, 80386, and so on). This line of processors became known as the x86 architecture.
On the other hand, x64 is the architecture name for the extension to the x86 instruction set that enables 64-bit code. When it was initially developed, it was named x86-64. However, the name was thought to be too lengthy, so it was later shortened to the current x64.
What is the difference between x86 and x64?
As you can already tell, the obvious difference is the number of bits of each operating system: x86 refers to a 32-bit CPU and operating system while x64 refers to a 64-bit CPU and operating system.
As mentioned above, bits are shorthand for digits that can only be 1 or 0. This limits how much RAM a 32-bit CPU can use: with 32 bits of 1s and 0s, the total number of combinations is only 2^32, which equals 4,294,967,296. This means a 32-bit processor has about 4.29 billion memory locations, each storing one byte of data, which equates to approx. 4 GB of memory that the 32-bit processor can access without workarounds in software to address more.
Today, 4 GB is enough for basic tasks, but if you wish to run multiple programs and other heavier workloads, 4 GB is not sufficient. In addition, a 64-bit system is more efficient as it can process data in 64-bit chunks compared to 32-bit chunks. A 64-bit system can also run 32-bit programs, as they are backward compatible. But it does not work the other way around: a 32-bit computer cannot run 64-bit programs.
ASSEMBLY LANGUAGE PROGRAMMING
What is Assembly Language?
Each personal computer has a microprocessor that manages the computer's arithmetical, logical,
and control activities.
Each family of processors has its own set of instructions for handling various operations such as
getting input from the keyboard, displaying information on screen, and performing various other
jobs. These sets of instructions are called 'machine language instructions'.
A processor understands only machine language instructions, which are strings of 1's and 0's.
However, machine language is too obscure and complex for use in software development. So,
the low-level assembly language is designed for a specific family of processors; it represents the various instructions in symbolic code, a more understandable form.
Advantages of Assembly Language
Having an understanding of assembly language makes one aware of −
How programs interface with OS, processor, and BIOS;
How data is represented in memory and other external devices;
How the processor accesses and executes instructions;
How instructions access and process data;
How a program accesses external devices.
Other advantages of using assembly language are −
It requires less memory and execution time;
It allows hardware-specific complex jobs to be done in an easier way;
It is suitable for time-critical jobs;
It is most suitable for writing interrupt service routines and other memory resident
programs.
Basic Features of PC Hardware
The main internal hardware of a PC consists of a processor, memory, and registers. Registers are
processor components that hold data and addresses. To execute a program, the system copies it
from the external device into the internal memory. The processor executes the program
instructions.
The fundamental unit of computer storage is a bit; it could be ON (1) or OFF (0) and a group of
8 related bits makes a byte on most of the modern computers.
A parity bit can be used to make the number of bits in a byte odd. If the parity becomes even, the system assumes that there has been a parity error (though rare), which might have been caused by a hardware fault or electrical disturbance.
The processor supports the following data sizes −
Word: a 2-byte data item
Double word: a 4-byte (32-bit) data item
Quad word: an 8-byte (64-bit) data item
Paragraph: a 16-byte (128-bit) area
Kilobyte: 1,024 bytes
Binary Number System

Bit value    1  1  1  1  1  1  1  1
Bit number   7  6  5  4  3  2  1  0
The value of a binary number is based on the presence of 1 bits and their positional value. So, the
value of a given binary number is −
1 + 2 + 4 + 8 +16 + 32 + 64 + 128 = 255
which is the same as 2^8 - 1.

Hexadecimal Number System

The hexadecimal number system uses base 16. The digits in this system range from 0 to 15. By convention, the letters A through F are used to represent the hexadecimal digits corresponding to decimal values 10 through 15.

Hexadecimal numbers in computing are used for abbreviating lengthy binary representations. The hexadecimal number system represents binary data by dividing each byte in half and expressing the value of each half-byte. The following table provides the decimal, binary, and hexadecimal equivalents −

Decimal Binary Hexadecimal
0 0 0
1 1 1
2 10 2
3 11 3
4 100 4
5 101 5
6 110 6
7 111 7
8 1000 8
9 1001 9
10 1010 A
11 1011 B
12 1100 C
13 1101 D
14 1110 E
15 1111 F
To convert a binary number to its hexadecimal equivalent, break it into groups of 4 consecutive bits each, starting from the right, and write those groups as the corresponding digits of the hexadecimal number.
Example − Binary number 1000 1100 1101 0001 is equivalent to hexadecimal - 8CD1
To convert a hexadecimal number to binary, just write each hexadecimal digit into its 4-digit
binary equivalent.
Example − Hexadecimal number FAD8 is equivalent to binary - 1111 1010 1101 1000
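Both conversions can be checked mechanically; this small C program prints the value 8CD1 from the example above in hexadecimal and in the grouped binary form:

    #include <stdio.h>

    int main(void) {
        unsigned value = 0x8CD1;   /* 1000 1100 1101 0001 in binary */

        printf("Hex: %X\nBinary: ", value);
        for (int bit = 15; bit >= 0; bit--) {
            putchar(((value >> bit) & 1) ? '1' : '0');
            if (bit % 4 == 0 && bit != 0)
                putchar(' ');      /* space between each group of 4 bits */
        }
        putchar('\n');
        return 0;
    }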
Binary Arithmetic
The following table illustrates four simple rules for binary addition −
0 1 1 1
+0 +0 +1 +1
=0 =1 =10 =11
Rules (iii) and (iv) show a carry of a 1-bit into the next left position.
Example
Decimal Binary
60 00111100
+42 00101010
102 01100110
A negative binary value is expressed in two's complement notation. According to this rule, the way to convert a binary number to its negative value is to reverse its bit values and add 1.
Example

Number 53:        00110101
Reverse the bits: 11001010
Add 1:            00000001
Number -53:       11001011
To subtract one value from another, convert the number being subtracted to two's complement
format and add the numbers.
Example
Subtract 42 from 53

Number 53:              00110101
Number 42:              00101010
Reverse the bits of 42: 11010101
Add 1:                  00000001
Two's complement of 42: 11010110

Adding 53 and the two's complement of 42 gives 00110101 + 11010110 = 1 00001011. Dropping the carry out of the leftmost bit leaves the result: 53 - 42 = 11, i.e. 00001011.
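In C, unsigned arithmetic behaves exactly like this: the ~ operator reverses the bits, and the carry out of the leftmost bit is discarded automatically by the fixed-width type, as this sketch of the 53 - 42 example shows:

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint8_t a = 53;   /* 00110101 */
        uint8_t b = 42;   /* 00101010 */

        /* Two's complement of 42: reverse the bits and add 1. */
        uint8_t neg_b = (uint8_t)(~b + 1);     /* 11010110 */

        /* Subtraction becomes addition; the carry out of bit 7
           is dropped by the 8-bit type. */
        uint8_t result = (uint8_t)(a + neg_b);
        printf("53 - 42 = %u\n", result);      /* prints 11 */
        return 0;
    }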
The processor stores multibyte numeric data in memory in reverse byte order, low-order byte first (little-endian). When the processor gets the numeric data from memory to a register, it again reverses the bytes. There are two kinds of memory addresses −

Absolute address - a direct reference of a specific location.

Segment address (or offset) - starting address of a memory segment with the offset value.
Pipelined execution can be visualized with a space-time diagram. For example, consider a processor having 4 stages and let there be 2 instructions to be executed. We can visualize the execution sequence through the following space-time diagrams:
Non-overlapped execution:

Stage/Cycle | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8
S1          | I1 |    |    |    | I2 |    |    |
S2          |    | I1 |    |    |    | I2 |    |
S3          |    |    | I1 |    |    |    | I2 |
S4          |    |    |    | I1 |    |    |    | I2

Overlapped execution:

Stage/Cycle | 1  | 2  | 3  | 4  | 5
S1          | I1 | I2 |    |    |
S2          |    | I1 | I2 |    |
S3          |    |    | I1 | I2 |
S4          |    |    |    | I1 | I2
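The diagrams suggest the general cycle counts: without overlap, n instructions on a k-stage pipeline need n * k cycles; with overlap they need k + (n - 1). A tiny C check of the 4-stage, 2-instruction case:

    #include <stdio.h>

    int main(void) {
        int k = 4;   /* pipeline stages */
        int n = 2;   /* instructions   */

        int non_overlapped = n * k;         /* 8 cycles, as in the first diagram  */
        int overlapped     = k + (n - 1);   /* 5 cycles, as in the second diagram */

        printf("non-overlapped: %d cycles\n", non_overlapped);
        printf("overlapped    : %d cycles\n", overlapped);
        printf("speedup       : %.2f\n", (double)non_overlapped / overlapped);
        return 0;
    }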
1. Arithmetic Pipeline
An arithmetic pipeline divides an arithmetic problem into various subproblems for
execution in various pipeline segments. It is used for floating point operations,
multiplication, and various other computations.

[Figure: flowchart of the arithmetic pipeline for floating-point addition]

The following sub-operations are performed in this case: first the exponents are compared and the mantissas get aligned. Then the addition of both numbers takes place, followed by normalization of the result in the last segment.
Example:
Let us consider two numbers,
X=0.3214*10^3 and Y=0.4500*10^2
Explanation:
First the two exponents are subtracted to give 3 - 2 = 1. The larger exponent, 3, becomes the exponent of the result, and the mantissa of the smaller number is shifted 1 place to the right to give
Y=0.0450*10^3
Finally, the two numbers are added to produce
Z=0.3664*10^3
As the result is already normalized, it remains the same.
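The four segments can be traced in a few lines of C. This sketch works with decimal mantissa/exponent pairs and assumes the first operand has the larger exponent, as in the example; it walks the numbers above through compare, align, add, and normalize:

    #include <stdio.h>

    int main(void) {
        /* X = 0.3214 * 10^3, Y = 0.4500 * 10^2 */
        double xm = 0.3214; int xe = 3;
        double ym = 0.4500; int ye = 2;

        /* Segment 1: compare (subtract) the exponents: 3 - 2 = 1. */
        int diff = xe - ye;

        /* Segment 2: align the smaller mantissa by shifting it right. */
        for (int i = 0; i < diff; i++)
            ym /= 10.0;                       /* Y becomes 0.0450 * 10^3 */

        /* Segment 3: add the mantissas. */
        double zm = xm + ym;                  /* 0.3664 */

        /* Segment 4: normalize (already normalized here). */
        printf("Z = %.4f * 10^%d\n", zm, xe);
        return 0;
    }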
2. Instruction Pipeline:
In this, a stream of instructions can be executed by overlapping the fetch, decode, and
execute phases of an instruction cycle. This type of technique is used to increase the
throughput of the computer system. An instruction pipeline reads instructions from the
memory while previous instructions are being executed in other segments of the pipeline.
Thus we can execute multiple instructions simultaneously. The pipeline will be more
efficient if the instruction cycle is divided into segments of equal duration.
In the most general case, the computer needs to process each instruction in the following sequence of steps:
1. Fetch the instruction from memory (FI)
2. Decode the instruction (DA)
3. Calculate the effective address
4. Fetch the operands from memory (FO)
5. Execute the instruction (EX)
6. Store the result in the proper place
Let us see an example of an instruction pipeline.
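As one way to see it, the following C sketch simulates a simplified four-segment pipeline (FI, DA, FO, EX; a toy model, not any particular processor) and prints the overlapped space-time diagram for six instructions:

    #include <stdio.h>

    int main(void) {
        const char *stages[] = { "FI", "DA", "FO", "EX" };
        int k = 4, n = 6;   /* 4 segments, 6 instructions */

        printf("Cycle:");
        for (int c = 1; c <= n + k - 1; c++)
            printf("  %2d", c);
        printf("\n");

        /* Instruction i enters stage s during cycle i + s + 1. */
        for (int s = 0; s < k; s++) {
            printf("  %s :", stages[s]);
            for (int c = 1; c <= n + k - 1; c++) {
                int i = c - 1 - s;   /* instruction in this stage this cycle */
                if (i >= 0 && i < n)
                    printf("  I%d", i + 1);
                else
                    printf("    ");
            }
            printf("\n");
        }
        return 0;
    }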
Computer Organization and Assembly Language
Assembly Language
Each personal computer has a microprocessor that manages the computer's arithmetical, logical,
and control activities.
Each family of processors has its own set of instructions for handling various operations such as
getting input from the keyboard, displaying information on screen, and performing various other
jobs. These sets of instructions are called 'machine language instructions'.
A processor understands only machine language instructions, which are strings of 1's and 0's.
However, machine language is too obscure and complex for use in software development. So,
the low-level assembly language is designed for a specific family of processors that represents
various instructions in symbolic code and a more understandable form.
Advantages of Assembly Language
Having an understanding of assembly language makes one aware of −
How programs interface with OS, processor, and BIOS;
How data is represented in memory and other external devices;
How the processor accesses and executes instructions;
How instructions access and process data;
How a program accesses external devices.
Other advantages of using assembly language are −
Computer Organization and Assembly Language
Bit value 1 1 1 1 1 1 1 1
Computer Organization and Assembly Language
Bit number 7 6 5 4 3 2 1 0
The value of a binary number is based on the presence of 1 bit and its positional value. So, the
value of a given binary number is −
1 + 2 + 4 + 8 +16 + 32 + 64 + 128 = 255
which is the same as 28 - 1.
Hexadecimal Number System
The hexadecimal number system uses base 16. The digits in this system range from 0 to 15. By
convention, the letters A through F are used to represent the hexadecimal digits corresponding to
decimal values 10 through 15.
Hexadecimal numbers in computing are used for abbreviating lengthy binary representations.
The hexadecimal number system represents binary data by dividing each byte in half and
expressing the value of each half-byte. The following table provides the decimal, binary, and
hexadecimal equivalents −
Decimal  Binary  Hexadecimal
0        0000    0
1        0001    1
2        0010    2
3        0011    3
4        0100    4
5        0101    5
6        0110    6
7        0111    7
8        1000    8
9        1001    9
10       1010    A
11       1011    B
12       1100    C
13       1101    D
14       1110    E
15       1111    F
To convert a binary number to its hexadecimal equivalent, break it into groups of 4 consecutive
bits each, starting from the right, and write each group as the corresponding hexadecimal digit.
Example − Binary number 1000 1100 1101 0001 is equivalent to hexadecimal - 8CD1
To convert a hexadecimal number to binary, just write each hexadecimal digit into its 4-digit
binary equivalent.
Example − Hexadecimal number FAD8 is equivalent to binary - 1111 1010 1101 1000
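The two notations are interchangeable in source code as well. A minimal sketch in NASM syntax,
writing the same bit pattern once as a hexadecimal literal and once as a binary literal −
MOV AX, 0FAD8H              ; hexadecimal literal
MOV AX, 1111101011011000B   ; the same value as a binary literal
Both instructions load the identical 16-bit value into AX.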
Binary Arithmetic
The following four simple rules govern binary addition −
(i) 0 + 0 = 0
(ii) 1 + 0 = 1
(iii) 1 + 1 = 10
(iv) 1 + 1 + 1 = 11
Rules (iii) and (iv) show a carry of a 1-bit into the next left position.
Example
Decimal   Binary
 60       00111100
+42       00101010
102       01100110
A negative binary value is expressed in two's complement notation. According to this rule, to
convert a binary number to its negative value, reverse its bit values and add 1.
Example
Number 53          00110101
Reverse the bits   11001010
Add 1              00000001
Number -53         11001011
To subtract one value from another, convert the number being subtracted to two's complement
format and add the numbers.
Example
Subtract 42 from 53
Number 53                00110101
Number 42                00101010
Reverse the bits of 42   11010101
Add 1                    00000001
Two's complement of 42   11010110
Add 53 and -42           00110101 + 11010110 = 1 00001011
53 - 42 = 11             00001011 (the carry out of the leftmost bit is discarded)
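In assembly language, the processor can form a two's complement directly. A minimal sketch in
NASM syntax (the NEG instruction computes 0 minus its operand, i.e., the two's complement) −
MOV AL, 53    ; AL = 00110101
NEG AL        ; AL = 11001011, which is -53 in two's complement
ADD AL, 53    ; AL = 0; the carry out of the leftmost bit is discarded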
The processor stores numeric data in memory in reverse-byte sequence, i.e., the low-order byte is
kept at the lower memory address. For example, the 16-bit value 0725H is stored with the byte
25H at some memory address x and the byte 07H at address x + 1. When the processor gets the
numeric data from memory to register, it again reverses the bytes.
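A small sketch of this byte ordering in NASM syntax (the label name value is illustrative) −
section .data
value dw 0725h    ; stored in memory as the byte sequence 25h, 07h:
                  ; low-order byte 25h at address x, high-order byte 07h at x + 1
A later mov ax, [value] brings the bytes back into a register in the original order, so AX again
holds 0725h.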
There are two kinds of memory addresses −
Absolute address - a direct reference of a specific location.
Segment address (or offset) - starting address of a memory segment with the offset value.
Local Environment Setup
Assembly language is dependent upon the instruction set and the architecture of the processor. In
this tutorial, we focus on 32-bit Intel (IA-32) processors like the Pentium. To follow this tutorial, you will need
−
An IBM PC or any equivalent compatible computer
A copy of the Linux operating system
A copy of the NASM assembler program
There are many good assembler programs, such as −
Microsoft Assembler (MASM)
Borland Turbo Assembler (TASM)
The GNU assembler (GAS)
We will use the NASM assembler, as it is −
Free. You can download it from various web sources.
Well documented and you will get lots of information on the net.
It can be used on both Linux and Windows.
Installing NASM
If you select "Development Tools" while installing Linux, you may get NASM installed along
with the Linux operating system and you do not need to download and install it separately. To
check whether you already have NASM installed, take the following steps −
Open a Linux terminal.
Type whereis nasm and press ENTER.
If it is already installed, then a line like nasm: /usr/bin/nasm appears. Otherwise, you will
see just nasm:, in which case you need to install NASM.
To install NASM, take the following steps −
Check the Netwide Assembler (NASM) website for the latest version.
Download the Linux source archive nasm-X.XX.tar.gz, where X.XX is the NASM version
number in the archive.
Unpack the archive into a directory, which creates a subdirectory nasm-X.XX.
cd to nasm-X.XX and type ./configure. This shell script will find the best C compiler to
use and set up Makefiles accordingly.
Type make to build the nasm and ndisasm binaries.
Type make install to install nasm and ndisasm in /usr/local/bin and to install the man
pages.
This should install NASM on your system. Alternatively, you can use an RPM distribution for
Fedora Linux. This version is simpler to install: just double-click the RPM file.
Assembly - Basic Syntax
An assembly program can be divided into three sections −
The data section,
The bss section, and
The text section.
The data Section
The data section is used for declaring initialized data or constants. This data does not change at
runtime. You can declare various constant values, file names, buffer sizes, etc., in this section.
The syntax for declaring the data section is −
section .data
The bss Section
The bss section is used for declaring variables. The syntax for declaring the bss section is −
section .bss
The text Section
The text section is used for keeping the actual code. This section must begin with the
declaration global _start, which tells the kernel where the program execution begins.
The syntax for declaring the text section is −
section .text
global _start
_start:
Comments
An assembly language comment begins with a semicolon (;). It may contain any printable character
including blank. It can appear on a line by itself, like −
; This program displays a message on the screen
Or, on the same line along with an instruction, like −
add eax, ebx    ; adds ebx to eax
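Putting the three sections and the comment syntax together, the following is a minimal sketch of a
complete program for 32-bit Linux, assuming the conventional int 0x80 system call interface
(sys_write is call number 4 and sys_exit is call number 1) −

section .data
   msg db 'Hello, world!', 0xA     ; the string to display
   len equ $ - msg                 ; length of the string in bytes

section .text
   global _start                   ; program execution begins at _start

_start:
   mov eax, 4                      ; system call number for sys_write
   mov ebx, 1                      ; file descriptor 1 (stdout)
   mov ecx, msg                    ; address of the string
   mov edx, len                    ; number of bytes to write
   int 0x80                        ; call the kernel

   mov eax, 1                      ; system call number for sys_exit
   mov ebx, 0                      ; exit status 0
   int 0x80                        ; call the kernel

The program can be assembled with nasm -f elf hello.asm, linked with
ld -m elf_i386 -s -o hello hello.o, and run with ./hello.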
Assembly - Registers
Processor operations mostly involve processing data. This data can be stored in memory and
accessed from there. However, reading data from and storing data in memory slows down the
processor, as it involves complicated processes of sending the data request across the control bus
and into the memory storage unit and getting the data through the same channel.
To speed up the processor operations, the processor includes some internal memory storage
locations, called registers.
The registers store data elements for processing without having to access the memory. A limited
number of registers are built into the processor chip.
Processor Registers
There are ten 32-bit and six 16-bit processor registers in IA-32 architecture. The registers are
grouped into three categories −
General registers,
Control registers, and
Segment registers.
The general registers are further divided into the following groups −
Data registers
Pointer registers
Index registers.
Data Registers
Four 32-bit data registers are used for arithmetic, logical, and other operations. These 32-bit
registers can be used in three ways −
As complete 32-bit data registers: EAX, EBX, ECX, EDX.
Lower halves of the 32-bit registers can be used as four 16-bit data registers: AX, BX,
CX, and DX.
Lower and higher halves of the above-mentioned four 16-bit registers can be used as
eight 8-bit data registers: AH, AL, BH, BL, CH, CL, DH, and DL.
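A short sketch in NASM syntax of how the same physical register is accessed at all three widths −
mov eax, 0x12345678   ; the complete 32-bit register
mov ax, 0x9ABC        ; overwrites the lower 16 bits; EAX = 0x12349ABC
mov ah, 0xDE          ; overwrites bits 15-8; EAX = 0x1234DEBC
mov al, 0xF0          ; overwrites bits 7-0; EAX = 0x1234DEF0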
Pointer Registers
The pointer registers are the 32-bit EIP, ESP, and EBP registers and their 16-bit rightmost
portions IP, SP, and BP −
Instruction Pointer (IP) − The 16-bit IP register stores the offset address of the next
instruction to be executed.
Stack Pointer (SP) − The 16-bit SP register provides the offset value within the program stack.
Base Pointer (BP) − The 16-bit BP register mainly helps in referencing parameter variables
passed to a subroutine.
Index Registers
The 32-bit index registers, ESI and EDI, and their 16-bit rightmost portions, SI and DI, are used
for indexed addressing and sometimes used in addition and subtraction. There are two sets of
index registers −
Source Index (SI) − It is used as a source index for string operations.
Destination Index (DI) − It is used as a destination index for string operations.
Control Registers
The 32-bit instruction pointer register (EIP) and the 32-bit flags register (EFLAGS) combined are
considered the control registers.
Many instructions involve comparisons and mathematical calculations and change the status of
the flags, and some other conditional instructions test the values of these status flags to take
the control flow to other locations.
The common flag bits are:
Overflow Flag (OF) − It indicates the overflow of a high-order bit (leftmost bit) of data
after a signed arithmetic operation.
Direction Flag (DF) − It determines the left or right direction for moving or comparing
string data. When the DF value is 0, the string operation takes a left-to-right direction and
when the value is set to 1, the string operation takes a right-to-left direction.
Interrupt Flag (IF) − It determines whether the external interrupts like keyboard entry,
etc., are to be ignored or processed. It disables the external interrupt when the value is 0
and enables interrupts when set to 1.
Trap Flag (TF) − It allows setting the operation of the processor in single-step mode.
Debugging programs such as DEBUG set the trap flag, so that execution can be stepped
through one instruction at a time.
Sign Flag (SF) − It shows the sign of the result of an arithmetic operation. This flag is set
according to the sign of a data item following the arithmetic operation. The sign is
indicated by the high-order (leftmost) bit. A positive result clears the value of SF to
0 and a negative result sets it to 1.
Zero Flag (ZF) − It indicates the result of an arithmetic or comparison operation. A
nonzero result clears the zero flag to 0, and a zero result sets it to 1.
Auxiliary Carry Flag (AF) − It contains the carry from bit 3 to bit 4 following an
arithmetic operation; used for specialized arithmetic. The AF is set when a 1-byte
arithmetic operation causes a carry from bit 3 into bit 4.
Parity Flag (PF) − It indicates the parity of the result obtained from an arithmetic
operation, based on the number of 1-bits in its low-order byte. An even number of 1-bits
sets the parity flag to 1 and an odd number of 1-bits clears the parity flag to 0.
Carry Flag (CF) − It contains the carry of 0 or 1 from a high-order bit (leftmost) after an
arithmetic operation. It also stores the contents of the last bit of
a shift or rotation operation.
The following table indicates the position of flag bits in the 16-bit Flag register.
Bit no:  15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Flag:                 O  D  I  T  S  Z     A     P     C
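As a sketch of how a single arithmetic instruction can change several of these flags at once
(NASM syntax) −
mov al, 0xFF   ; AL = 11111111
add al, 1      ; AL = 00000000 with a carry out of the leftmost bit, so
               ; CF = 1, ZF = 1, AF = 1 (carry from bit 3 into bit 4),
               ; PF = 1 (zero 1-bits is an even count), SF = 0, OF = 0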
Segment Registers
Segments are specific areas defined in a program for containing data, code, and stack. There are
three main segments −
Code Segment − It contains all the instructions to be executed. A 16-bit Code Segment
register or CS register stores the starting address of the code segment.
Data Segment − It contains data, constants, and work areas. A 16-bit Data Segment
register or DS register stores the starting address of the data segment.
Stack Segment − It contains data and return addresses of procedures or subroutines. It is
implemented as a 'stack' data structure. The Stack Segment register or SS register stores
the starting address of the stack.
Apart from the DS, CS, and SS registers, there are other extra segment registers - ES (extra
segment), FS, and GS, which provide additional segments for storing data.
In assembly programming, a program needs to access the memory locations. All memory
locations within a segment are relative to the starting address of the segment. A segment begins
at an address evenly divisible by 16, or hexadecimal 10. So, the rightmost hex digit in all such
memory addresses is 0, which is not generally stored in the segment registers.
The segment registers store the starting addresses of a segment. To get the exact location of data
or instruction within a segment, an offset value (or displacement) is required. To reference any
memory location in a segment, the processor combines the segment address in the segment
register with the offset value of the location.
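As a worked sketch, assuming real-mode segment arithmetic in which the physical address is the
segment register value multiplied by 10H plus the offset: if the DS register holds 1100H and a
variable lies at offset 0025H, the referenced memory location is 1100H * 10H + 0025H = 11000H +
0025H = 11025H.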
Assembly - Addressing Modes
Most assembly language instructions require operands to be processed. An operand address
provides the location where the data to be processed is stored. Some instructions do not require
an operand, whereas some other instructions may require one, two, or three operands.
When an instruction requires two operands, the first operand is generally the destination, which
contains data in a register or memory location and the second operand is the source. The source
contains either the data to be delivered (immediate addressing) or the address (in register or
memory) of the data. Generally, the source data remains unaltered after the operation.
The three basic modes of addressing are −
Register Addressing
Immediate Addressing
Memory Addressing
Register Addressing
In this addressing mode, a register contains the operand. Depending upon the instruction, the
register may be the first operand, the second operand, or both.
For example,
MOV DX, TAX_RATE   ; Register in first operand
MOV COUNT, CX      ; Register in second operand
MOV EAX, EBX       ; Both the operands are in registers
As processing data between registers does not involve memory, it provides the fastest processing
of data.
Immediate Addressing
An immediate operand has a constant value or an expression. When an instruction with two
operands uses immediate addressing, the first operand may be a register or memory location, and
the second operand is an immediate constant. The first operand defines the length of the data.
For example,
BYTE_VALUE DB 150    ; A byte value is defined
WORD_VALUE DW 300    ; A word value is defined
ADD BYTE_VALUE, 65   ; An immediate operand 65 is added
MOV AX, 45H          ; Immediate constant 45H is transferred to AX
Direct Memory Addressing
When operands are specified in memory addressing mode, direct access to the main memory,
usually to the data segment, is required. This way of addressing results in slower processing of
data. To locate the exact location of data in memory, we need the segment start address, which is
typically found in the DS register, and an offset value. This offset value is also called the
effective address.
In direct addressing mode, the offset value is specified directly as part of the instruction, usually
indicated by the variable name. The assembler calculates the offset value and maintains a symbol
table, which stores the offset values of all the variables used in the program.
In direct memory addressing, one of the operands refers to a memory location and the other
operand references a register.
For example,
ADD BYTE_VALUE, DL   ; Adds the register DL to the memory location
MOV BX, WORD_VALUE   ; Operand from the memory is transferred to the register
Direct-Offset Addressing
This addressing mode uses arithmetic operators to modify an address. For example, look at the
following definitions that define tables of data −
BYTE_TABLE DB 14, 15, 22, 45       ; Table of bytes
WORD_TABLE DW 134, 345, 564, 123   ; Table of words
The following operations access data from the tables in the memory into registers. Note that the
offset counts bytes, so the 4th word element lies 6 bytes past the start of the word table −
MOV CL, BYTE_TABLE[2]    ; Gets the 3rd element of the BYTE_TABLE
MOV CL, BYTE_TABLE + 2   ; Gets the 3rd element of the BYTE_TABLE
MOV CX, WORD_TABLE[6]    ; Gets the 4th element of the WORD_TABLE
MOV CX, WORD_TABLE + 6   ; Gets the 4th element of the WORD_TABLE
Indirect Memory Addressing
This addressing mode utilizes the computer's Segment:Offset addressing capability. Generally,
the base registers EBX, EBP (or BX, BP) and the index registers (DI, SI), coded within square
brackets for memory references, are used for this purpose.
Indirect addressing is generally used for variables containing several elements like arrays. The
starting address of the array is stored in, say, the EBX register.
The following code snippet shows how to access different elements of the variable.
MY_TABLE TIMES 10 DW 0   ; Allocates 10 words (2 bytes each), all initialized to 0
MOV EBX, MY_TABLE        ; Effective address of MY_TABLE in EBX
MOV WORD [EBX], 110      ; MY_TABLE[0] = 110
ADD EBX, 2               ; EBX = EBX + 2
MOV WORD [EBX], 123      ; MY_TABLE[1] = 123
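Indirect addressing combines naturally with a loop. A short sketch in NASM syntax that sums the
four words of WORD_TABLE defined earlier (the label sum_loop is illustrative) −
mov ebx, WORD_TABLE   ; EBX holds the address of the first element
mov ecx, 4            ; ECX counts the four elements
xor ax, ax            ; clear the running sum
sum_loop:
add ax, [ebx]         ; add the word EBX currently points to
add ebx, 2            ; advance EBX by one word (2 bytes)
loop sum_loop         ; decrement ECX and repeat until it reaches zero
                      ; AX now holds 134 + 345 + 564 + 123 = 1166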
The following storage types and the number of bytes each occupies are used when defining data −
Storage type   Bytes
BYTE           1
WORD           2
DWORD          4
QWORD          8
TBYTE          10