TUTORIAL IN ASSEMBLY
❖ ASSEMBLY LANGUAGE PROGRAM PARTS
In TASM (Turbo Assembler), every assembly program is usually divided into several parts
(sections). Each part has a specific purpose in organizing code and data. Here are the main parts
of a typical TASM program.
Model Directive (.model small)
• Defines the memory model of the program (how code and data are arranged in memory).
• Common memory models: SMALL, MEDIUM, COMPACT, LARGE, HUGE.
• In simple TASM programs, we often use SMALL (one code segment and one data segment).
Stack Segment (.stack 100h)
• Allocates memory for the stack (temporary storage for procedures, parameters, return
addresses).
.stack 100h ; reserves 256 bytes for the stack
Data Segment (.data)
• Used to declare variables, constants, and messages.
• Can store initialized data (values defined before execution) or uninitialized data.
.data
msg1 DB 'Enter a number: $'
num1 DB ?
num2 DB ?
result DB ?
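As a further sketch of initialized versus uninitialized data, the fragment below declares a few more items. The names prompt, count, buffer, and newline are illustrative assumptions and are not used elsewhere in this tutorial.

```asm
.data
prompt  db 'Enter a number: $'   ; initialized string, '$'-terminated for INT 21h function 09h
count   dw 0                     ; initialized 16-bit word
buffer  db 20 dup(?)             ; 20 uninitialized bytes
newline db 13, 10, '$'           ; carriage return + line feed, also '$'-terminated
```

DB reserves bytes and DW reserves 16-bit words; a ? leaves the initial value unspecified.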
Code Segment (.code)
• Contains the actual instructions to be executed.
• Usually written as a procedure (commonly named main), which serves as the entry point.
• Ends with a return to DOS (INT 21h with AH=4Ch).
.code
main proc
; initialize ds (data segment register)
mov ax, @data
mov ds, ax
; your instructions here
; example: display message
mov ah, 9
lea dx, msg1
int 21h
; exit program
mov ah, 4ch
int 21h
main endp
end main
End Directive (end main)
• Marks the end of the program and specifies the entry point (usually MAIN).
Summary of TASM Program Parts:
1. .MODEL – Defines memory model.
2. .STACK – Allocates stack space.
3. .DATA – Stores variables and constants.
4. .CODE – Contains instructions and procedures.
5. END – Marks end of program and entry point.
❖ USING mov ax
In assembly language, particularly in x86 assembly,
MOV AX, ...
means move (copy) data into the AX register.
Let’s break it down:
• MOV → This is the instruction (mnemonic) for moving (copying) data from one location to
another.
• AX → This is a 16-bit general-purpose register in the CPU.
The MOV instruction doesn’t move in the sense of removing it from the source — it just copies
the value. The source remains unchanged.
Examples:
mov ax, 5 ; ax = 5
mov ax, bx ; ax = contents of bx register
mov ax, ds:[1234h] ; ax = word stored at offset 1234h (ds: marks 1234h as an address)
Important:
• AX specifically is the Accumulator Register (16-bit).
• In 32-bit mode, you’d see EAX (Extended AX).
• In 64-bit mode, it’s RAX.
Step by step how the CPU executes internally
mov ax, 5
This means: load the constant value 5 into the AX register.
Step 1: Fetch the instruction
a) The Instruction Pointer (IP), which is the x86 name for the Program Counter (PC), holds the
offset of the instruction MOV AX, 5 within the code segment (CS).
b) The CPU fetches the instruction bytes from memory into the Instruction Register (IR).
Step 2: Decode the instruction
a) The Control Unit (CU) interprets the instruction.
b) It sees the opcode MOV and understands the operands: destination is register AX, source is
an immediate value (5).
Step 3: Prepare operands
a) The CU sees that the source operand is an immediate value (a constant, not from memory or
another register).
b) The constant 5 is already part of the instruction in memory, so the CPU reads it directly from
the instruction stream.
Step 4: Execute (perform the operation)
a) The CPU sends control signals to the register file to load the value 5 into the AX register.
b) The value 5 is copied into AX.
So now:
AX = 0005h
Step 5: Write-back
a) The result (5) is stored in the destination register AX.
b) The source (the literal 5) is unchanged, since it’s just part of the instruction.
Step 6: Update Instruction Pointer
a) The CPU advances the Instruction Pointer (IP) by the length of the instruction it just read, so
IP now points to the next instruction in memory, ready to be fetched.
End result:
Before: AX could hold any value.
After: AX = 5.
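To make the fetch and operand steps concrete, here is a sketch of how MOV AX, 5 is laid out in memory. The starting offset 0000h is an assumption chosen for illustration; the byte values follow the standard 8086 encoding of MOV AX with a 16-bit immediate.

```asm
; Assumed layout: the instruction sits at offset 0000h of the code segment.
;
; offset   bytes       meaning
; 0000h    B8          opcode for "MOV AX, imm16"
; 0001h    05 00       the immediate 0005h, low byte first
; 0003h    ...         next instruction (IP = 0003h once the MOV has been fetched)
mov ax, 5              ; assembles to B8 05 00; afterwards AX = 0005h
```

The constant travels inside the instruction itself, which is why no separate memory read is needed for the source operand.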
❖ USING mov al
mov al, ... is very similar to mov ax, ..., but instead of working with the full 16-bit AX register,
it only works with the lower 8 bits of AX.
Breaking it down:
• MOV → instruction to copy data.
• AL → the low 8-bit register inside AX.
In x86 architecture:
• AX is 16 bits.
• It’s split into two 8-bit parts:
o AH = high 8 bits (bits 8–15)
o AL = low 8 bits (bits 0–7)
ax = [ah | al]
[8b | 8b]
Example instructions:
mov al, 1 ; al = 1 (only the lower byte is affected, ah is unchanged)
mov al, bl ; al = contents of bl
mov al, ds:[1234h] ; al = byte stored at offset 1234h (ds: marks 1234h as an address)
So:
• mov ax, ... → affects the whole 16-bit register.
• mov al, ... → affects only the lower 8 bits of that register.
Example in action:
Suppose AX = 1234h (binary 0001001000110100).
• AH = 12h
• AL = 34h
If we do:
mov al, 0FFh
Then:
• ax = 12FFh (only the low byte changed).
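The same idea can be traced over a few instructions. This is a minimal sketch; the comments show the value of AX after each line.

```asm
mov ax, 1234h   ; ax = 1234h  (ah = 12h, al = 34h)
mov al, 0FFh    ; ax = 12FFh  (only al changed)
mov ah, 00h     ; ax = 00FFh  (only ah changed)
mov ax, 5       ; ax = 0005h  (both halves overwritten)
```

Note that TASM requires a hexadecimal constant to begin with a digit, which is why FFh is written as 0FFh.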
General Rule
• Use MOV AX when you need to move a 16-bit value.
• Use MOV AL when you only need to move an 8-bit value (the lower half of AX).
Situations for MOV AX
• Loading 16-bit immediate values:
mov ax, 1234h ; ax = 1234h
• Working with 16-bit memory or registers:
mov ax, bx ; copy 16 bits from bx into ax
mov ax, ds:[2000h] ; load a 16-bit word from memory offset 2000h
• Arithmetic needing the full accumulator:
add ax, 1000h ; add 16-bit value
Situations for MOV AL
• Loading 8-bit immediate values:
mov al, 5 ; al = 05h
• Working with bytes (characters, ASCII, I/O ports, etc.):
mov al, 'A' ; al = 41h (ASCII for A)
• Copying only part of a register:
mov al, bl ; copy lower 8 bits of BX into AL
• String & I/O instructions often use AL:
in al, 60h ; read a byte from port 60h (keyboard)
out 20h, al ; write a byte to port 20h
Why two versions?
Because sometimes you only need a byte (AL) instead of a word (AX):
• A byte takes half the storage of a word, which matters for character data and byte-sized variables.
• Some hardware and instructions only work with 8-bit registers.
• It gives you flexibility — you can manipulate just part of AX (AL or AH) without touching the
whole thing.
Quick analogy:
Think of AX as a 2-digit number box.
• AL is the ones place (right digit).
• AH is the tens place (left digit).
If you only want to change the right digit, use AL.
If you want to change the whole number, use AX.
❖ USING mov ax, @data
What is @data?
• In TASM, when you declare a .DATA segment like this:
.data
num1 db 10
num2 db 20
the assembler assigns a segment address to that block of memory.
• @data is a special assembler symbol (not an instruction) that represents the starting
segment address of the data segment.
So, @data = the base address of the .DATA segment.
Why move it to AX?
mov ax, @data
• The CPU doesn’t automatically know where your .DATA variables are stored.
• To access them, you need to tell the DS (Data Segment register) where the .DATA segment is
located.
• But you cannot directly write into DS like this:
mov ds, @data ; Invalid (assembler will not allow this)
Because segment registers (like DS) cannot be loaded with immediate values directly.
Loading DS with @DATA
So, the correct way is a two-step process:
mov ax, @data ; load AX with the segment address of DATA
mov ds, ax ; copy AX into DS register
Now, DS points to your .DATA segment.
This means that instructions that access variables by name (like MOV AL, num1) will use DS as the
default segment. (Stack accesses use SS instead.)
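As a sketch of variable access once DS is set, the program below adds two byte variables. It reuses num1 and num2 from the declaration above; the result variable is an assumption added for illustration.

```asm
.model small
.stack 100h
.data
num1   db 10
num2   db 20
result db ?            ; hypothetical variable that receives the sum
.code
main proc
    mov ax, @data      ; two-step load of DS
    mov ds, ax
    mov al, num1       ; al = 10, read from DS:num1
    add al, num2       ; al = 30
    mov result, al     ; store the sum back into the data segment
    mov ax, 4c00h      ; terminate with return code 0
    int 21h
main endp
end main
```

If the first two instructions were left out, the reads of num1 and num2 would come from whatever segment DS happened to hold.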
Example Program
Here’s a simple working TASM program:
.model small
.stack 100h
.data
msg db 'hello, world!$'
.code
main proc
mov ax, @data ; get address of data segment
mov ds, ax ; initialize ds with it
; display string
mov ah, 9
lea dx, msg
int 21h
; exit program
mov ah, 4ch
int 21h
main endp
end main
Summary
• @DATA - segment address of the .DATA segment.
• MOV AX, @DATA - loads that segment address into AX.
• MOV DS, AX - sets DS to point to .DATA.
This is necessary so the program can correctly access variables and strings stored in .DATA.
❖ USING main proc
• In TASM (Turbo Assembler), PROC stands for procedure.
• A procedure in assembly is like a function or subroutine in high-level languages (C, Java, etc.).
• MAIN PROC defines the starting point of the program — the main procedure.
When you write:
main proc
; instructions here
main endp
• MAIN is the name of the procedure.
• PROC marks the beginning of the procedure.
• ENDP marks the end of the procedure.
Why Do We Use MAIN PROC?
1. Organizes code
o Separates program logic into blocks (procedures).
o Just like functions in higher-level languages.
2. TASM requirement
o TASM expects a clear entry point for the program.
o Usually, we name it MAIN and then reference it at the bottom with END MAIN.
3. Supports modular programming
o You can define multiple procedures, for example:
▪ INPUT PROC … ENDP → for input.
▪ CALCULATE PROC … ENDP → for processing.
▪ DISPLAY PROC … ENDP → for output.
o Then, MAIN PROC calls them in order.
How Execution Works
1. When you run the program, DOS loads it into memory.
2. The CS (Code Segment) register points to the code segment.
3. The IP (Instruction Pointer) begins execution at the entry point named by the END directive.
4. END MAIN tells the assembler that MAIN is the program entry point.
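The role of END MAIN can be seen in the sketch below, where another procedure appears first in the file. The name helper and its message are assumptions made for illustration.

```asm
.model small
.stack 100h
.data
msg db 'started in main$'
.code
helper proc            ; appears first in the file, but is not the entry point
    mov ah, 9
    lea dx, msg
    int 21h
    ret
helper endp
main proc
    mov ax, @data
    mov ds, ax
    call helper        ; helper runs only because main calls it
    mov ah, 4ch
    int 21h
main endp
end main               ; execution begins at main
```

Execution starts at main because END names it, not because of where it sits in the source.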
Example: Using MAIN PROC
Here’s a simple TASM program:
.model small
.stack 100h
.data
msg db 'hello, world!$'
.code
main proc
; initialize ds
mov ax, @data
mov ds, ax
; display string
mov ah, 9
lea dx, msg
int 21h
; exit program
mov ah, 4ch
int 21h
main endp
end main
Explanation:
• MAIN PROC → start of the program.
• Code between MAIN PROC and MAIN ENDP executes.
• END MAIN → tells the assembler that execution starts at MAIN.
Multiple Procedures Example
.model small
.stack 100h
.data
msg1 db 'in main$'
msg2 db 'in sub$'
.code
main proc
mov ax, @data
mov ds, ax
; call another procedure
call subproc
mov ah, 9
lea dx, msg1
int 21h
mov ah, 4ch
int 21h
main endp
subproc proc
mov ah, 9
lea dx, msg2
int 21h
ret
subproc endp
end main
Here:
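• Execution starts in MAIN PROC.
• CALL SUBPROC pushes the return address onto the stack and jumps into SUBPROC, which prints 'in sub'.
• RET pops that address, so execution resumes in MAIN at the instruction after the CALL, and MAIN
then prints 'in main'.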
Summary
• MAIN PROC defines the main procedure in a TASM program.
• PROC is the start of procedure, ENDP is the end of procedure.
• END MAIN specifies MAIN as the program’s entry point.
• Organizes code into blocks, making programs modular and structured.
❖ USING mov ah, 4ch
In TASM (Turbo Assembler), this instruction is used in DOS programming (using INT 21h services).
• MOV = Move data from one place to another.
• AH = The high 8-bit register of the AX register (Accumulator).
• 4CH = A hexadecimal constant (equivalent to decimal 76).
So,
mov ah, 4ch
means: Load the value 4Ch (76) into the AH register.
Why 4Ch
DOS provides many system services through interrupt 21h (INT 21h).
Each service is selected by putting a specific value in the AH register.
• When AH = 4Ch, it selects the "Terminate Program" service.
• This tells DOS that your program has finished and should return control to the operating
system.
How It’s Used
Typically, you’ll see this instruction near the end of a TASM program, like this:
mov ah, 4ch ; DOS function: terminate program
int 21h ; Call DOS interrupt
Here’s what happens step by step:
1. mov ah, 4ch → Selects DOS service 4Ch (Terminate Program).
2. int 21h → Executes the interrupt, and DOS ends the program gracefully.
Why Is It Important?
• Without this instruction, your program might not return properly to DOS.
• It’s the assembly-language equivalent of return 0; in C or exit(0) in higher-level languages.
• The return code (exit status) is usually placed in AL (low byte of AX). For example:
mov ax, 4c00h ; AH=4Ch (terminate), AL=00h (return code = 0)
int 21h
This is a common shortcut: load the whole AX register at once.
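A non-zero return code works the same way. The sketch below exits with code 1; by convention, though DOS does not enforce it, a non-zero value signals an error.

```asm
mov ah, 4ch   ; DOS function: terminate program
mov al, 1     ; return code 1 (conventionally "something went wrong")
int 21h
```

A batch file that ran the program can then test the code with IF ERRORLEVEL 1.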
Summary:
• MOV AH, 4CH sets up DOS function 4Ch (program termination).
• It’s always paired with INT 21H at the end of a TASM program to properly exit and return
control to DOS, similar to ending a program in higher-level languages.
❖ USING mov ds, ax
• MOV = copy data from source to destination.
• DS = Data Segment Register (special register that points to the program’s data segment in
memory).
• AX = Accumulator Register (16-bit).
So,
mov ds, ax
means: Copy the contents of AX into the DS register.
Why Do We Need It?
In assembly (TASM), memory is divided into segments:
• CS → Code Segment (where instructions are stored).
• DS → Data Segment (where your variables/constants are stored).
• SS → Stack Segment (stack operations).
• ES → Extra Segment (optional; FS and GS exist only on the 80386 and later).
When a DOS .EXE program starts, DS points to the Program Segment Prefix (PSP), not to your .DATA
segment. If you try to access variables without setting DS properly, you will read or overwrite unrelated memory.
So, before using variables declared in the .DATA section, you must load the correct segment address
into DS.
How It’s Typically Used
You cannot directly load a segment register with an immediate value like:
mov ds, @data ; Illegal
Instead, you must first load the address of @DATA into a general-purpose register (like AX), then
move it into DS:
mov ax, @data ; AX ← address of data segment
mov ds, ax ; DS ← AX (now DS points to data segment)
Example:
.model small
.stack 100h
.data
msg db 'Hello, World!$'
.code
main proc
mov ax, @data ; Load data segment address into AX
mov ds, ax ; Initialize DS with data segment
; display the message
mov ah, 09h ; DOS service: display string
lea dx, msg ; Load address of string into DX
int 21h
; Exit program
mov ah, 4ch
int 21h
main endp
end main
Here’s what happens:
1. MOV AX, @DATA → AX gets the address of the .DATA segment.
2. MOV DS, AX → DS is initialized, so variables can be accessed.
3. DOS interrupt 21h (function 09h) uses DS:DX to find the string and print it.
Summary
• MOV DS, AX sets up the data segment register (DS) so your program can properly access
variables in the .DATA segment.
• Always done after MOV AX, @DATA.
• Essential in .MODEL SMALL (or similar models) because code and data segments are
separated.
❖ USING int 21h
What is INT 21h in TASM?
INT 21h is a DOS interrupt (software interrupt) that provides access to various operating system
services.
• INT - “Interrupt” instruction. It tells the CPU to pause the current program and execute a
predefined routine (an interrupt handler).
• 21h - The interrupt number (in hexadecimal). DOS uses interrupt 21h for its system
services.
So,
int 21h
means: Call DOS interrupt 21h handler → DOS provides the requested service.
How Does It Work?
Before calling INT 21h, you must tell DOS which service you want.
• The AH register holds the function number (service code).
• Other registers (like AL, BX, CX, DX, etc.) may hold parameters depending on the service.
• After the interrupt runs, some registers may return results (like success codes).
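The pattern of function number in AH, parameters in other registers, and results returned in registers can be sketched with two services chained together. This fragment reads one key and prints it again.

```asm
mov ah, 01h     ; service 01h: read a character; DOS returns it in AL
int 21h
mov dl, al      ; service 02h expects the character to print in DL
mov ah, 02h     ; service 02h: output a character
int 21h
```

AH has to be reloaded before each call, because each service is selected separately.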
Common INT 21h Functions in TASM
Here are the most common ones you’ll use in simple assembly programs:
AH Value   Function             Description
01h        Input a character    Waits for a keypress and returns it in AL
02h        Output a character   Prints the character in DL
09h        Display a string     Prints a $-terminated string (DS:DX points to it)
0Ah        Buffered input       Reads a string of characters into a buffer
4Ch        Terminate program    Ends the program and returns control to DOS
Example:
1. Display a character:
mov ah, 02h ; function: display character
mov dl, 'a' ; character to display
int 21h ; Call DOS
2. Read a character:
mov ah, 01h ; function: read character
int 21h ; Call DOS
; The character is now in AL
3. Print a string:
.data
msg db 'hello, world!$'
.code
mov ah, 09h ; Function: Display string
lea dx, msg ; DX points to string
int 21h ; Call DOS
4. Exit program:
mov ah, 4ch ; function: terminate program
mov al, 00h ; return code (0 = success)
int 21h ; Call DOS
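Function 0Ah from the table has no example above, so here is a minimal sketch that reads a line and prints it back. The buffer name inbuf and its size of 20 are assumptions. The layout (capacity byte, count byte, then the characters) is the one DOS expects for this service.

```asm
.model small
.stack 100h
.data
inbuf db 20            ; byte 0: buffer capacity, counting the final Enter (0Dh)
      db ?             ; byte 1: DOS fills in the number of characters typed
      db 20 dup(?)     ; bytes 2 onward: the characters, followed by 0Dh
.code
main proc
    mov ax, @data
    mov ds, ax

    mov ah, 0Ah        ; function: buffered input
    lea dx, inbuf      ; DS:DX points to the buffer
    int 21h

    mov bl, inbuf+1    ; bl = number of characters typed
    mov bh, 0          ; bx = the same count as a 16-bit index
    mov inbuf[bx+2], '$'   ; replace the trailing 0Dh so function 09h can print the text

    mov ah, 02h        ; function: display character
    mov dl, 0Ah        ; line feed, so the echo appears on the next line
    int 21h

    mov ah, 09h        ; function: display string
    lea dx, inbuf+2    ; the typed characters start at byte 2
    int 21h

    mov ah, 4ch        ; function: terminate program
    mov al, 00h
    int 21h
main endp
end main
```

The count in byte 1 does not include the Enter key, which is why the '$' lands exactly on the 0Dh that DOS stored after the text.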
Why Is INT 21h Important?
• It’s the main way to interact with DOS (input, output, file handling, memory allocation, etc.).
• In TASM (and other 16-bit DOS assembly), it acts like a system call interface.
• Without INT 21h, your program couldn’t talk to the outside world — no printing, no input, no
file access.
Summary:
INT 21h is the gateway to DOS services in assembly language. You load a service number into AH
(and parameters into other registers), then call INT 21h. DOS executes the service and returns
control to your program.