Assembly Language
An assembly language is a low-level programming language designed for a specific
type of processor. It may be produced by compiling source code from a high-level
programming language (such as C/C++) but can also be written from scratch. Assembly
code can be converted to machine code using an assembler.
Since most compilers convert source code directly to machine code, software
developers often create programs without using assembly language. However, in some
cases, assembly code can be used to fine-tune a program. For example, a programmer
may write a performance-critical routine in assembly language so that it runs as
efficiently as possible.
While assembly languages differ between processor architectures, they often
include similar instructions and operators. Below are some examples of instructions
supported by x86 processors.
MOV - move data from one location to another
ADD - add two values
SUB - subtract a value from another value
PUSH - push data onto a stack
POP - pop data from a stack
JMP - jump to another location
INT - trigger a software interrupt
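As a rough illustration of the PUSH and POP entries above, the sketch below models a 16-bit stack in Python. The class name ToyStack, the memory size, and the absence of overflow checks are assumptions made for this example, not part of any real processor interface. It shows that the stack grows toward lower addresses: PUSH decrements the stack pointer before storing, and POP reads the value before incrementing it.

```python
class ToyStack:
    """Hypothetical model of a 16-bit x86 stack that grows toward lower addresses."""

    def __init__(self, size=16):
        self.memory = bytearray(size)
        self.sp = size  # empty stack: SP points just past the last byte

    def push(self, value):
        self.sp -= 2  # PUSH first decrements SP by the operand size (2 bytes)
        self.memory[self.sp:self.sp + 2] = value.to_bytes(2, "little")  # then stores

    def pop(self):
        value = int.from_bytes(self.memory[self.sp:self.sp + 2], "little")  # POP reads
        self.sp += 2  # then increments SP
        return value


stack = ToyStack()
stack.push(3)
stack.push(4)
print(stack.pop())  # 4 (last in, first out)
print(stack.pop())  # 3
```

The values come back in reverse order, which is why PUSH and POP are commonly paired to save and restore registers around a procedure call.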
The following assembly code adds the numbers 3 and 4:
mov eax, 3 - loads 3 into the register "eax"
mov ebx, 4 - loads 4 into the register "ebx"
add eax, ebx - adds "ebx" to "eax" and stores the result (7) in "eax"
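To make the register-level behavior concrete, here is a minimal Python sketch of a toy register machine. The function name run, the tuple-based program format, and the choice of four registers are assumptions for illustration only. It relies on the fact that the x86 ADD instruction takes two operands and writes the sum back into the first one.

```python
def run(program):
    """Execute a list of (mnemonic, dest, src) tuples on a toy register file."""
    regs = {"eax": 0, "ebx": 0, "ecx": 0, "edx": 0}
    mask = 0xFFFFFFFF  # registers are 32 bits wide, so results wrap around

    def value(operand):
        # An operand is either a register name or an immediate integer.
        return regs[operand] if isinstance(operand, str) else operand

    for op, dest, src in program:
        if op == "mov":
            regs[dest] = value(src) & mask
        elif op == "add":
            regs[dest] = (regs[dest] + value(src)) & mask
        elif op == "sub":
            regs[dest] = (regs[dest] - value(src)) & mask
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs


program = [
    ("mov", "eax", 3),      # load 3 into eax
    ("mov", "ebx", 4),      # load 4 into ebx
    ("add", "eax", "ebx"),  # eax = eax + ebx
]
print(run(program)["eax"])  # 7
```

Only MOV, ADD, and SUB are modeled; flags, memory operands, and the rest of the instruction set are left out to keep the sketch short.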
Writing assembly language is a tedious process since each operation must be
performed at a very basic level. While it may not be necessary to use assembly code to
create a computer program, learning assembly language is often part of a Computer
Science curriculum since it provides useful insight into the way processors work.
List of Useful and Frequently Used DOS Commands
This list of DOS commands is useful when repairing Windows after a system
crash, when Windows does not load and the only option you have is a DOS command
prompt. Use the “help” command to find the usage and details of any particular
command, e.g. C:\>help copy
CHDIR – Displays the name of or changes the current directory.
CHKDSK – Checks a disk and displays a status report.
CLS – Clears the screen.
COMP – Compares two groups of files to find information that does not match.
COPY – Copies and appends files.
DATE – Displays and/or sets the system date.
DEFRAG – Optimizes disk performance by reorganizing the files on the disk.
DEL – Deletes files from disk.
DELTREE – Deletes a directory including all files and subdirectories that are in it.
DIR – Displays a list of the files and directories stored on disk.
DISKCOMP – Compares the contents of two diskettes.
ECHO – Displays messages or turns on or off the display of commands in a batch file.
EDIT – Starts the MS-DOS editor, a text editor used to create and edit ASCII text files.
EXIT – Exits a secondary command processor.
EXPAND – Expands a compressed file.
FASTHELP – Displays a list of DOS commands with a brief explanation of each.
FIND – Finds and reports the location of a specific string of text characters in one or
more files.
FOR – Performs repeated execution of commands (for both batch processing and
interactive processing).
FORMAT – Formats a disk to accept DOS files.
GRAPHICS – Provides a way to print contents of a graphics screen display.
IF – Allows for conditional operations in batch processing.
LABEL – Creates, changes, or deletes a volume label for a disk.
MEM – Displays amount of installed and available memory, including extended,
expanded, and upper memory.
MKDIR – Creates a new subdirectory.
MORE – Sends output to console, one screen at a time.
MOVE – Moves one or more files to the location you specify. Can also be used to
rename directories.
PATH – Sets or displays directories that will be searched for programs not in the
current directory.
RENAME – Changes the filename under which a file is stored.
RMDIR – Removes a subdirectory.
SORT – Sorts input and sends it to the screen or to a file.
XCOPY – Copies directories, subdirectories, and files.
Assembler
An assembler is a program that converts assembly language into machine code. It takes
the basic commands and operations from assembly code and converts them
into binary code that can be recognized by a specific type of processor.
Assemblers are similar to compilers in that they produce executable code. However,
assemblers are simpler since they only convert low-level code (assembly
language) to machine code. Since each assembly language is designed for a specific
processor, assembling a program is largely a one-to-one mapping from
assembly instructions to machine instructions. Compilers, on the other hand, must
convert generic high-level source code into machine code for a specific processor.
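The one-to-one mapping described above can be sketched as a lookup table. The Python example below is a deliberately tiny, hypothetical assembler: the names OPCODES and assemble are invented for this illustration, and it handles only a few single-byte x86 instructions that take no operands. Real assemblers also deal with operands, labels, and addressing modes, which this sketch ignores.

```python
# Single-byte x86 opcodes for a few instructions that take no operands.
OPCODES = {
    "NOP": 0x90,
    "HLT": 0xF4,
    "CMC": 0xF5,
    "CLC": 0xF8,
    "STC": 0xF9,
    "CLI": 0xFA,
    "STI": 0xFB,
    "CLD": 0xFC,
    "STD": 0xFD,
}


def assemble(source):
    """Translate one mnemonic per line into machine code bytes."""
    machine_code = bytearray()
    for line in source.splitlines():
        text = line.split(";", 1)[0].strip()  # drop comments and whitespace
        if not text:
            continue  # skip blank lines
        mnemonic = text.upper()
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown instruction: {text}")
        machine_code.append(OPCODES[mnemonic])
    return bytes(machine_code)


code = assemble("cli\ncld ; clear direction flag\nnop\nhlt")
print(code.hex())  # fafc90f4
```

Each input line produces exactly one output byte here, which is the simplest possible case of the mapping; a compiler, by contrast, may emit many instructions for a single source statement.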
Most programs are written in high-level programming languages and are compiled
directly to machine code using a compiler. However, in some cases, assembly code may
be used to customize functions and ensure they perform in a specific way.
Therefore, IDEs often include assemblers so they can build programs from both
high-level and low-level languages.
Preprocessor Directives
Preprocessor directives are lines in a program that begin with the
character #, which distinguishes them from ordinary source code. The compiler
processes them before compiling the rest of the program. Preprocessor
directives change the text of the source code, and the result is new source code
without these directives.
Although preprocessing in C# is conceptually similar to that in C/C++, it differs
in two respects. First, preprocessing in C# does not involve a separate step for
preprocessor execution before compilation. It is processed as a part of the lexical
analysis phase. Second, it cannot be used to create macros. In addition, the new
directives #region and #endregion have been added in C#, and some directives used
in C/C++ have been dropped (#include is a notable example; its role is taken over by
"using" directives).
Java does not support preprocessor directives.
A preprocessor directive is usually placed at the top of the source code on a separate
line beginning with the character "#", followed by the directive name. Optional white
space may appear before the "#" and between the "#" and the directive name. A comment
on the same line as a preprocessor directive must end on that line and cannot continue
onto the following line, so delimited comments cannot be used there. A preprocessor
directive must not end with a semicolon (;). Conditional compilation symbols can be
defined in source code or on the command line as a compiler argument.
Examples of preprocessor directives that can be used in C# include:
#define and #undef: To define and undefine conditional compilation
symbols, respectively. These symbols can be tested during compilation
so that only the required sections of source code are compiled. The scope of a
symbol is the file in which it is defined.
#if, #elif, #else, and #endif: To skip part of source code based on conditions.
Conditional sections may be nested with directives forming complete sets.
#line: To control the line numbers reported for errors and warnings. This is
mostly used by meta-programming tools that generate C# source code from
some text input. It modifies the line numbers and source
file names reported by the compiler in its output.
#error and #warning: To generate errors and warnings, respectively.
#error stops compilation, while #warning lets compilation continue
and prints a message to the console.
#region and #endregion: To explicitly mark sections of source code. These
allow expansion and collapse inside Visual Studio for better readability and
reference.
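To show how conditional compilation symbols select which source lines survive, here is a minimal Python sketch of a toy preprocessor. It is not how the C# compiler is implemented; the function name preprocess, its symbols parameter, and the restriction to #define, #undef, #if with a single symbol, #else, and #endif are all assumptions made for this illustration.

```python
def preprocess(source, symbols=()):
    """Return the source lines that survive conditional compilation.

    Supports only #define, #undef, #if SYMBOL, #else and #endif.
    """
    defined = set(symbols)  # symbols supplied from outside, e.g. the command line
    stack = []              # one (parent_active, condition) pair per open #if
    active = True           # whether the current line should be kept
    output = []

    for line in source.splitlines():
        words = line.split()
        directive = words[0] if words and words[0].startswith("#") else None

        if directive == "#if":
            condition = words[1] in defined
            stack.append((active, condition))
            active = active and condition
        elif directive == "#else":
            parent_active, condition = stack[-1]
            active = parent_active and not condition
        elif directive == "#endif":
            active, _ = stack.pop()  # restore the state of the enclosing section
        elif directive == "#define":
            if active:
                defined.add(words[1])
        elif directive == "#undef":
            if active:
                defined.discard(words[1])
        elif active:
            output.append(line)  # ordinary line inside an active section
    return output


source = """\
#define DEBUG
#if DEBUG
log("debug build")
#else
log("release build")
#endif
#if TRACE
log("tracing")
#endif
run()"""
print(preprocess(source))  # ['log("debug build")', 'run()']
```

Calling preprocess(source, symbols=["TRACE"]) would also keep the tracing line, which mirrors defining a symbol on the command line instead of in the source file.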
Original 8086/8088 instruction set
Instruction | Meaning | Notes | Opcode
AAA | ASCII adjust AL after addition | Used with unpacked binary-coded decimal | 0x37
AAD | ASCII adjust AX before division | The 8086/8088 datasheet documents only the base-10 version of the AAD instruction (opcode 0xD5 0x0A), but any other base will work. Later Intel documentation has the generic form too. NEC V20 and V30 (and possibly other NEC V-series CPUs) always use base 10 and ignore the argument, causing a number of incompatibilities. | 0xD5
AAM | ASCII adjust AX after multiplication | Only the base-10 version (operand is 0xA) is documented; see notes for AAD | 0xD4
AAS | ASCII adjust AL after subtraction | | 0x3F
ADC | Add with carry | destination := destination + source + carry_flag | 0x10…0x15, 0x80/2…0x83/2
ADD | Add | (1) r/m += r/imm; (2) r += m/imm; | 0x00…0x05, 0x80/0…0x83/0
AND | Logical AND | (1) r/m &= r/imm; (2) r &= m/imm; | 0x20…0x25, 0x80/4…0x83/4
CALL | Call procedure | push eip; eip points to the instruction directly after the call | 0x9A, 0xE8, 0xFF/2, 0xFF/3
CBW | Convert byte to word | | 0x98
CLC | Clear carry flag | CF = 0; | 0xF8
CLD | Clear direction flag | DF = 0; | 0xFC
CLI | Clear interrupt flag | IF = 0; | 0xFA
CMC | Complement carry flag | | 0xF5
CMP | Compare operands | | 0x38…0x3D, 0x80/7…0x83/7
CMPSB | Compare bytes in memory | | 0xA6
CMPSW | Compare words | | 0xA7
CWD | Convert word to doubleword | | 0x99
DAA | Decimal adjust AL after addition | Used with packed binary-coded decimal | 0x27
DAS | Decimal adjust AL after subtraction | | 0x2F
DEC | Decrement by 1 | | 0x48…0x4F, 0xFE/1, 0xFF/1
DIV | Unsigned divide | DX:AX = DX:AX / r/m; resulting DX == remainder | 0xF6/6, 0xF7/6
ESC | Used with floating-point unit | | 0xD8…0xDF
HLT | Enter halt state | | 0xF4
IDIV | Signed divide | DX:AX = DX:AX / r/m; resulting DX == remainder | 0xF6/7, 0xF7/7
IMUL | Signed multiply | (1) DX:AX = AX * r/m; (2) AX = AL * r/m | 0x69, 0x6B (both since 80186), 0xF6/5, 0xF7/5, 0x0FAF (since 80386)
IN | Input from port | (1) AL = port[imm]; (2) AL = port[DX]; (3) AX = port[imm]; (4) AX = port[DX]; | 0xE4, 0xE5, 0xEC, 0xED
INC | Increment by 1 | | 0x40…0x47, 0xFE/0, 0xFF/0
INT | Call to interrupt | | 0xCC, 0xCD
INTO | Call to interrupt if overflow | | 0xCE
IRET | Return from interrupt | | 0xCF
Jcc | Jump if condition | (JA, JAE, JB, JBE, JC, JE, JG, JGE, JL, JLE, JNA, JNAE, JNB, JNBE, JNC, JNE, JNG, JNGE, JNL, JNLE, JNO, JNP, JNS, JNZ, JO, JP, JPE, JPO, JS, JZ) | 0x70…0x7F, 0x0F80…0x0F8F (since 80386)
JCXZ | Jump if CX is zero | | 0xE3
JMP | Jump | | 0xE9…0xEB, 0xFF/4, 0xFF/5
LAHF | Load FLAGS into AH register | | 0x9F
LDS | Load pointer using DS | | 0xC5
LEA | Load Effective Address | | 0x8D
LES | Load ES with pointer | | 0xC4
LOCK | Assert BUS LOCK# signal | (for multiprocessing) | 0xF0
LODSB | Load string byte | if (DF==0) AL = *SI++; else AL = *SI--; | 0xAC
LODSW | Load string word | if (DF==0) AX = *SI++; else AX = *SI--; | 0xAD
LOOP/LOOPx | Loop control | (LOOPE, LOOPNE, LOOPNZ, LOOPZ) if (x && --CX) goto lbl; | 0xE0…0xE2
MOV | Move | Copies data from one location to another, (1) r/m = r; (2) r = r/m; | 0xA0...0xA3
MOVSB | Move byte from string to string | if (DF==0) *(byte*)DI++ = *(byte*)SI++; else *(byte*)DI-- = *(byte*)SI--; | 0xA4
MOVSW | Move word from string to string | if (DF==0) *(word*)DI++ = *(word*)SI++; else *(word*)DI-- = *(word*)SI--; | 0xA5
MUL | Unsigned multiply | (1) DX:AX = AX * r/m; (2) AX = AL * r/m; | 0xF6/4…0xF7/4
NEG | Two's complement negation | r/m *= -1; | 0xF6/3…0xF7/3
NOP | No operation | Opcode equivalent to XCHG EAX, EAX | 0x90
NOT | Negate the operand, logical NOT | r/m ^= -1; | 0xF6/2…0xF7/2
OR | Logical OR | (1) r/m |= r/imm; (2) r |= m/imm; | 0x08…0x0D, 0x80…0x83/1
OUT | Output to port | (1) port[imm] = AL; (2) port[DX] = AL; (3) port[imm] = AX; (4) port[DX] = AX; | 0xE6, 0xE7, 0xEE, 0xEF
POP | Pop data from stack | r/m = *SP++; POP CS (opcode 0x0F) works only on 8086/8088. Later CPUs use 0x0F as a prefix for newer instructions. | 0x07, 0x0F (8086/8088 only), 0x17, 0x1F, 0x58…0x5F, 0x8F/0
POPF | Pop FLAGS register from stack | FLAGS = *SP++; | 0x9D
PUSH | Push data onto stack | *--SP = r/m; | 0x06, 0x0E, 0x16, 0x1E, 0x50…0x57, 0x68, 0x6A (both since 80186), 0xFF/6
PUSHF | Push FLAGS onto stack | *--SP = FLAGS; | 0x9C
RCL | Rotate left (with carry) | | 0xC0…0xC1/2 (since 80186), 0xD0…0xD3/2
RCR | Rotate right (with carry) | | 0xC0…0xC1/3 (since 80186), 0xD0…0xD3/3
REPxx | Repeat MOVS/STOS/CMPS/LODS/SCAS | (REP, REPE, REPNE, REPNZ, REPZ) | 0xF2, 0xF3
RET | Return from procedure | Not a real instruction. The assembler will translate these to a RETN or a RETF depending on the memory model of the target system. |
RETN | Return from near procedure | | 0xC2, 0xC3
RETF | Return from far procedure | | 0xCA, 0xCB
ROL | Rotate left | | 0xC0…0xC1/0 (since 80186), 0xD0…0xD3/0
ROR | Rotate right | | 0xC0…0xC1/1 (since 80186), 0xD0…0xD3/1
SAHF | Store AH into FLAGS | | 0x9E
SAL | Shift Arithmetically left (signed shift left) | (1) r/m <<= 1; (2) r/m <<= CL; | 0xC0…0xC1/4 (since 80186), 0xD0…0xD3/4
SAR | Shift Arithmetically right (signed shift right) | (1) (signed) r/m >>= 1; (2) (signed) r/m >>= CL; | 0xC0…0xC1/7 (since 80186), 0xD0…0xD3/7
SBB | Subtraction with borrow | An alternative 1-byte encoding of SBB AL, AL is available via the undocumented SALC instruction | 0x18…0x1D, 0x80…0x83/3
SCASB | Compare byte string | | 0xAE
SCASW | Compare word string | | 0xAF
SHL | Shift left (unsigned shift left) | | 0xC0…0xC1/4 (since 80186), 0xD0…0xD3/4
SHR | Shift right (unsigned shift right) | | 0xC0…0xC1/5 (since 80186), 0xD0…0xD3/5
STC | Set carry flag | CF = 1; | 0xF9
STD | Set direction flag | DF = 1; | 0xFD
STI | Set interrupt flag | IF = 1; | 0xFB
STOSB | Store byte in string | if (DF==0) *ES:DI++ = AL; else *ES:DI-- = AL; | 0xAA
STOSW | Store word in string | if (DF==0) *ES:DI++ = AX; else *ES:DI-- = AX; | 0xAB
SUB | Subtraction | (1) r/m -= r/imm; (2) r -= m/imm; | 0x28…0x2D, 0x80…0x83/5
TEST | Logical compare (AND) | (1) r/m & r/imm; (2) r & m/imm; | 0x84, 0x85, 0xA8, 0xA9, 0xF6/0, 0xF7/0
WAIT | Wait until not busy | Waits until BUSY# pin is inactive (used with floating-point unit) | 0x9B
XCHG | Exchange data | r :=: r/m; A spinlock typically uses xchg as an atomic operation. (coma bug). | 0x86, 0x87, 0x91…0x97
XLAT | Table look-up translation | Behaves like MOV AL, [BX+AL] | 0xD7
XOR | Exclusive OR | (1) r/m ^= r/imm; (2) r ^= m/imm; | 0x30…0x35, 0x80…0x83/6
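Two of the Notes entries in the table can be made concrete with a short Python sketch: the ADC rule "destination := destination + source + carry_flag" and the MOVSB rule that copies a byte and then steps SI and DI according to the direction flag. The function names add16, add32, and movsb are invented for this illustration, segment registers are ignored, and a plain bytearray stands in for memory, so treat this as a model under those assumptions and not as a description of the hardware.

```python
MASK16 = 0xFFFF


def add16(dest, source, carry_flag=0):
    """Model ADD (carry_flag=0) or ADC on 16-bit operands.

    Returns (result, carry_out).
    """
    total = dest + source + carry_flag
    return total & MASK16, total >> 16


def add32(a, b):
    """Add two 32-bit values in 16-bit halves: ADD the low words, ADC the high words."""
    low, carry = add16(a & MASK16, b & MASK16)
    high, carry = add16(a >> 16, b >> 16, carry)
    return (high << 16) | low, carry


def movsb(memory, si, di, df=0):
    """Model MOVSB: copy one byte from [SI] to [DI], then step both indexes by DF."""
    memory[di] = memory[si]
    step = 1 if df == 0 else -1
    return si + step, di + step


print(hex(add32(0x0001FFFF, 0x00000001)[0]))  # 0x20000

memory = bytearray(b"abc...")
si, di = 0, 3
for _ in range(3):  # stands in for a REP prefix with CX = 3
    si, di = movsb(memory, si, di)
print(memory)  # bytearray(b'abcabc')
```

The carry returned by the low-word addition is what lets a 16-bit processor add wider numbers, which is the main reason ADC exists alongside ADD.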