Module II
ASSEMBLER:
In addition to the mnemonic machine instructions, we have used the following assembler directives:
– START: Specify the name and starting address of the program.
– END: Indicate the end of the program and specify the first executable instruction in the program.
– BYTE: Generate a character or hexadecimal constant, occupying as many bytes as needed to represent the constant.
– WORD: Generate a one-word integer constant.
– RESB: Reserve the indicated number of bytes for a data area.
– RESW: Reserve the indicated number of words for a data area.
PROGRAM-COUNTER RELATIVE:
∙ The instruction contains the opcode followed by a 12-bit displacement value. The displacement is treated as a signed number, so its range is -2048 to 2047.
∙ This displacement (which must be small enough to fit in the 12-bit field) is added to the current contents of the program counter to get the target address of the operand required by the instruction: TA = (PC) + displacement. The operand address is thus calculated relative to the program counter, which holds the address of the next instruction while the current instruction executes.
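The displacement calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not an assembler: the function name and the two sample address pairs are assumptions chosen for the example.

```python
def pc_relative_disp(target_address: int, next_instruction_address: int) -> int:
    """Return the 12-bit displacement field for PC-relative addressing.

    When the instruction executes, PC already holds the address of the next
    instruction, so disp = TA - (PC).
    """
    disp = target_address - next_instruction_address
    if not -2048 <= disp <= 2047:
        raise ValueError(f"displacement {disp} does not fit in 12 signed bits")
    return disp & 0xFFF  # two's-complement encoding in 12 bits


# Assumed forward reference: 3-byte instruction at 0000, operand at 0030.
forward = pc_relative_disp(0x0030, 0x0003)
# Assumed backward reference: 3-byte instruction at 0017, target at 0006.
backward = pc_relative_disp(0x0006, 0x001A)
print(f"{forward:03X} {backward:03X}")
```

A negative displacement is stored in two's-complement form, which is why a backward reference produces a value with the high bits set.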
IMMEDIATE ADDRESSING
Convert the immediate operand to its internal representation and insert it into the instruction.
BASE RELATIVE ADDRESSING MODE:
∙ In this mode the displacement is taken relative to the contents of the base register. Therefore the target address is
TA = (base) + displacement value
∙ This addressing mode is used when the displacement required is too large for PC-relative addressing. For base-relative addressing the 12-bit displacement is an unsigned value from 0 to 4095.
∙ The programmer tells the assembler what the base register will contain by using the directive BASE. Once the assembler has encountered this directive, it can use base-relative addressing to calculate the displacement to an operand.
∙ The NOBASE directive indicates that the base register can no longer be used to calculate the target address of an operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it tries base-relative addressing.
Eg: 105F LDT LENGTH
The address of LENGTH is 0033.
The program contains the directive BASE LENGTH, so the base register also holds 0033.
TA = (B) + DISP
0033 = 0033 + DISP
DISP = 000
The opcode of LDT is 74.
OPCODE n i x b p e ADDRESS
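The fields listed above can be packed into object code as in the following sketch. It assumes the instruction uses simple addressing (n = 1, i = 1), no indexing (x = 0) and the 3-byte format (e = 0); these flag values are not stated in the text and are assumptions for illustration, as is the function name.

```python
def encode_format3(opcode: int, n: int, i: int, x: int,
                   b: int, p: int, e: int, disp: int) -> str:
    """Pack opcode, flag bits and a 12-bit displacement into 6 hex digits."""
    if not 0 <= disp <= 0xFFF:
        raise ValueError("displacement must fit in 12 bits")
    first_byte = (opcode & 0xFC) | (n << 1) | i   # 6 opcode bits, then n and i
    flags = (x << 3) | (b << 2) | (p << 1) | e    # x, b, p, e form one half-byte
    word = (first_byte << 16) | (flags << 12) | disp
    return f"{word:06X}"


base_register = 0x0033            # set by BASE LENGTH
target_address = 0x0033           # address of LENGTH
disp = target_address - base_register
# b=1, p=0 selects base-relative addressing; n=i=1 is the assumed simple mode.
object_code = encode_format3(0x74, n=1, i=1, x=0, b=1, p=0, e=0, disp=disp)
print(object_code)
```

Under these assumptions the instruction LDT LENGTH assembles to 774000.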
PROGRAM RELOCATION
Sometimes it is required to load and run several programs at the same time. The system must be able to load these programs wherever there is room in memory. Therefore the exact starting address is not known until load time.
Absolute program:
In an absolute program the operand addresses are fixed at assembly time. This is called absolute assembly. For example:
55 101B LDA THREE 00102D
This statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 2000 instead of location 1000. Then the value that needs to be loaded into register A is no longer at address 102D; its address has shifted along with the rest of the program. Hence the address portion of the instruction must be changed so that the program can be loaded and executed at location 2000.
Some instructions undergo a change in their operand address as the program load address changes, while other parts of the program remain the same regardless of where the program is loaded. Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler can identify for the loader those parts of the program that need modification. An object program that contains the information necessary to perform this kind of modification is called a relocatable program.
The above diagram shows the concept of relocation. Initially the program is loaded at location 0000. The JSUB instruction is at location 0006, and its address field contains 01036, which is the address of the instruction labelled RDREC. The second figure shows the program loaded at a new location, 5000; the address field of the JSUB instruction is modified to 06036. Likewise, the third figure shows that if the program is relocated to location 7420, the JSUB instruction would need to be changed to 4B108456, which corresponds to the new address of RDREC (08456).
The only parts of the program that require modification at load time are those that specify direct addresses. The rest of the instructions need not be modified. Instructions that do not require modification are those whose operand is not a memory address (immediate addressing) and those that use PC-relative or base-relative addressing. From the object program alone, it is not possible to distinguish an address from a constant, so the assembler must pass this information to the loader. An object program that contains Modification records is called a relocatable program.
For an address label, the address is assigned relative to the start of the program (START 0). The assembler produces a Modification record that stores the starting location and the length of the address field to be modified. This command for the loader must also be part of the object program. The Modification record has the following format:
Modification record
Col. 1 M
Col. 2-7 Starting location of the address field to be modified, relative to the beginning of the program (Hex)
Col. 8-9 Length of the address field to be modified, in half-bytes (Hex)
One Modification record is created for each address to be modified. The length is stored in half-bytes (4 bits each). The starting location is the location of the byte containing the leftmost bits of the address field to be modified. If the field contains an odd number of half-bytes, it is assumed to begin in the middle of the first byte at the starting location.
In the above object code the red boxes indicate the addresses that need modification. The object code lines at the end are the Modification records for those instructions that must change if relocation occurs. M00000705 is the Modification record for the address field that begins at location 0007 (inside the JSUB instruction at location 0006) and is 5 half-bytes long.
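How a loader might apply such a record can be sketched as follows. This is an illustrative model only: the function names are assumptions, the program image is reduced to the single JSUB instruction, and its unrelocated object code 4B101036 is inferred from the address 01036 and the relocated value 4B108456 given earlier.

```python
def apply_modification(memory: bytearray, record: str, load_address: int) -> None:
    """Add load_address to the address field named by a record like 'M00000705'."""
    if record[0] != "M":
        raise ValueError("not a Modification record")
    start = int(record[1:7], 16)        # cols 2-7: start of field, relative to program start
    half_bytes = int(record[7:9], 16)   # cols 8-9: field length in half-bytes
    n_bytes = (half_bytes + 1) // 2
    field = int.from_bytes(memory[start:start + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1  # an odd length leaves the first half-byte untouched
    patched = (field & ~mask) | ((field + load_address) & mask)
    memory[start:start + n_bytes] = patched.to_bytes(n_bytes, "big")


def relocated_jsub(load_address: int) -> str:
    """Relocate a 10-byte image holding JSUB RDREC at 0006 and return the instruction."""
    program = bytearray(10)
    program[6:10] = bytes.fromhex("4B101036")
    apply_modification(program, "M00000705", load_address)
    return program[6:10].hex().upper()


print(relocated_jsub(0x5000), relocated_jsub(0x7420))
```

Because the field is 5 half-bytes long, only the low 20 bits starting in the middle of byte 0007 change; the opcode and flag bits are preserved.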
Literals:
A literal lets the programmer write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal because the value is stated literally in the instruction.
A literal is identified with the prefix '='.
This avoids having to define the constant elsewhere in the program and make up a label for it.
Eg: This
is similar to
Difference between a constant defined as a literal and a constant defined as an immediate operand:
Immediate Operands
The operand value is assembled as part of the machine instruction.
e.g. 55 0020 LDA #3 010003
Literals
The assembler generates the specified value as a constant at some other memory location, and the address of that constant is used as the target address of the instruction.
e.g. 45 001A ENDFIL LDA =C'EOF' 032010
∙ All the literal operands used in a program are gathered together into one or more literal pools.
∙ Normally the literal pool is placed at the end of the program. In some cases it is placed at some other location in the object program.
∙ Whenever the assembler directive LTORG is encountered, the assembler creates a literal pool that contains all the literal operands used since the previous LTORG (or since the beginning of the program). This makes it possible to keep the literals close to the instructions that use them.
∙ A literal table (LITTAB) is created for the literals used in the program. The literal table contains the literal name, the operand value and length, and the address assigned to the operand.
IMPLEMENTATION OF LITERALS:
During Pass-1:
∙ Each literal encountered is searched for in the literal table (LITTAB). If the literal already exists, no action is taken; if it is not present, the literal is added to LITTAB. Its address is left unassigned until an LTORG statement (or the end of the program) is encountered.
∙ When Pass 1 encounters an LTORG statement, each literal currently in the table that has no address yet is assigned one.
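The Pass 1 handling described above can be sketched as a small table class. The class and method names are assumptions for illustration, only C'...' and X'...' literals are handled, and the pool address 002D is chosen so that it is consistent with the object code 032010 shown for ENDFIL earlier (PC = 001D, displacement 010).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Literal:
    name: str
    value: bytes
    address: Optional[int] = None   # unassigned until a pool is placed


def literal_value(name: str) -> bytes:
    """Convert =C'...' or =X'...' into the bytes it stands for."""
    kind, text = name[1], name[3:-1]
    if kind == "C":
        return text.encode("ascii")
    if kind == "X":
        return bytes.fromhex(text)
    raise ValueError(f"unsupported literal {name}")


class LiteralTable:
    def __init__(self):
        self.entries = {}

    def note_use(self, name: str) -> None:
        """Called for each literal operand; duplicates are ignored."""
        if name not in self.entries:
            self.entries[name] = Literal(name, literal_value(name))

    def place_pool(self, locctr: int) -> int:
        """Called at LTORG or END; returns the updated location counter."""
        for literal in self.entries.values():
            if literal.address is None:
                literal.address = locctr
                locctr += len(literal.value)
        return locctr


table = LiteralTable()
table.note_use("=C'EOF'")
table.note_use("=C'EOF'")      # second use adds nothing
table.note_use("=X'05'")       # hypothetical second literal
next_locctr = table.place_pool(0x002D)
```

Literals that already have an address are skipped, so a later pool only contains the literals used since the previous one.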
During Pass-2:
∙ The assembler searches LITTAB for each literal operand encountered and uses the address assigned to it as the operand address. The data values of the literals in each pool are inserted into the object program exactly as if they had been generated by BYTE or WORD statements.
∙ If a literal represents an address in the program, the assembler must generate the appropriate Modification record, because its value is affected by relocation.
The following figure shows the difference between the SYMTAB and LITTAB
These statements will cause the symbols A, X, L… to be entered into the symbol table with their respective
values.
∙ As another usage, a machine may have many general-purpose registers named R1, R2, …; some may be used as base registers and some as accumulators, and their usage may change from one program to another. In this case we can define these requirements using EQU statements:
BASE EQU R1
INDEX EQU R2
COUNT EQU R3
∙ One restriction on the usage of EQU is that every symbol occurring on the right-hand side of the EQU must already be defined. For example, the following sequence is not valid:
BETA EQU ALPHA
ALPHA RESW 1
∙ Here ALPHA is used to define BETA before ALPHA itself is defined, so the value of ALPHA is not known when the EQU statement is processed.
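The restriction can be illustrated with a short sketch of how an assembler might process EQU during Pass 1. The function name is an assumption, and the operand is limited to a decimal constant or a single symbol to keep the example small.

```python
def define_equ(symtab: dict, label: str, operand: str) -> None:
    """Enter label into symtab with the value of operand.

    operand is either a decimal constant or a symbol that must already
    be present in symtab.
    """
    if operand.isdigit():
        symtab[label] = int(operand)
    elif operand in symtab:
        symtab[label] = symtab[operand]
    else:
        raise ValueError(f"{operand} is not defined before its use in EQU for {label}")


symtab = {}
define_equ(symtab, "MAXLEN", "1000")    # constant on the right-hand side
define_equ(symtab, "LIMIT", "MAXLEN")   # hypothetical symbol defined from an earlier one

try:
    define_equ(symtab, "BETA", "ALPHA")  # ALPHA has not been defined yet
except ValueError as error:
    message = str(error)
```

Reversing the order of the two statements in the invalid example (ALPHA RESW 1 first, then BETA EQU ALPHA) would make the definition acceptable.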
ORG Statement:
This directive can be used to indirectly assign values to the symbols. The directive is usually called ORG (for
origin). Its general format is:
ORG value
Where value is a constant or an expression involving constants and previously defined symbols. When this
statement is encountered during assembly of a program, the assembler resets its location counter (LOCCTR) to
the specified value.
Since the values of symbols used as labels are taken from LOCCTR, the ORG statement affects the values of all labels defined until the next ORG is encountered. ORG is used to control the assignment of storage in the object program. Careless alteration of LOCCTR may result in an incorrect assembly.
Suppose we need to define a symbol table with the following structure:
SYMBOL 6 Bytes
VALUE 3 Bytes
FLAGS 2 Bytes
Eg: The space for the table (100 entries of 11 bytes each) can be reserved by the statement:
STAB RESB 1100
If we want to refer to the entries of the table using indexed addressing, we place the offset of the desired entry from the beginning of the table in the index register. To refer to the fields SYMBOL, VALUE, and FLAGS individually, we first need to assign their values as shown below:
SYMBOL EQU STAB
VALUE EQU STAB+6
FLAGS EQU STAB+9
The same thing can also be done using ORG statement in the following way:
STAB RESB 1100
ORG STAB
SYMBOL RESB 6
VALUE RESW 1
FLAGS RESB 2
ORG STAB+1100
The first statement allocates 1100 bytes of memory and assigns the label STAB to it. The first ORG statement resets the location counter to the value of STAB, so LOCCTR now points to the start of the table. The next three lines define SYMBOL, VALUE and FLAGS at the appropriate offsets within the first entry. The last ORG statement sets LOCCTR back to the address that follows the table (i.e., STAB+1100).
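The effect of the two ORG statements on LOCCTR can be checked with a tiny Pass 1 simulation. The function name and the starting address 1000 (hex) are assumptions; only RESB, RESW and ORG are handled, and operands are limited to constants, symbols, and symbol+constant.

```python
def assign_addresses(statements, start=0):
    """Mini Pass 1 over (label, opcode, operand) tuples.

    Returns the symbol table and the final value of LOCCTR.
    """
    symtab = {}
    locctr = start

    def value_of(operand):
        # Sum of '+'-separated terms; each is a decimal constant or a defined symbol.
        return sum(int(t) if t.isdigit() else symtab[t] for t in operand.split("+"))

    for label, opcode, operand in statements:
        if opcode == "ORG":
            locctr = value_of(operand)      # reset LOCCTR, define nothing
            continue
        if label:
            symtab[label] = locctr          # label takes the current LOCCTR
        if opcode == "RESB":
            locctr += int(operand)
        elif opcode == "RESW":
            locctr += 3 * int(operand)      # one word is 3 bytes
    return symtab, locctr


program = [
    ("STAB", "RESB", "1100"),
    (None, "ORG", "STAB"),
    ("SYMBOL", "RESB", "6"),
    ("VALUE", "RESW", "1"),
    ("FLAGS", "RESB", "2"),
    (None, "ORG", "STAB+1100"),
]
symtab, locctr = assign_addresses(program, start=0x1000)
```

The resulting values match the EQU version: SYMBOL equals STAB, VALUE equals STAB+6, and FLAGS equals STAB+9.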
EXPRESSIONS:
∙ Assemblers also allow the use of expressions in place of operands in an instruction.
∙ Each such expression must be evaluated by the assembler to generate a single operand value or address.
∙ Assemblers generally allow arithmetic expressions formed according to the normal rules using the arithmetic operators +, -, *, /. Division is usually defined to produce an integer result. Individual terms may be constants, user-defined symbols, or special terms.
∙ The only special term used is * (the current value of the location counter), which indicates the value of the next unassigned memory location.
∙ Thus the statement
BUFEND EQU *
assigns to BUFEND a value that is the address of the next byte following the buffer area.
∙ Expressions are classified as either absolute expressions or relative expressions depending on the type of value they produce.
NOTE:
If the result of the expression is an absolute value (constant), it is known as an absolute expression.
Eg: BUFEND – BUFFER
If the result of the expression is relative to the beginning of the program, it is known as a relative expression. Labels on instructions and data areas and references to the location counter value are relative terms.
Eg: OPTAB + (BUFEND – BUFFER)
An expression such as BUFEND + BUFFER is neither absolute nor relative and is flagged as an error.
1) Absolute Expressions: An expression that uses only absolute terms is an absolute expression. An absolute expression may also contain relative terms, provided the relative terms occur in pairs and the terms in each pair have opposite signs.
Eg: MAXLEN EQU BUFEND-BUFFER
Both BUFEND and BUFFER are relative terms. The expression represents an absolute value: the difference between the two addresses. This difference does not depend on the location of the program, so the value is absolute regardless of relocation. An absolute expression can also consist only of absolute terms.
Eg: MAXLEN EQU 1000
2) Relative Expressions: All the relative terms except one can be paired as described under "absolute". The remaining unpaired relative term must have a positive sign. No relative term may enter into a multiplication or division operation.
Eg: STAB EQU OPTAB + (BUFEND – BUFFER)
Handling the type of expressions: to find the type of an expression, we must keep track of the types of the symbols used. This can be achieved by recording the type of each symbol in the symbol table, as shown in the table below:
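The pairing rule can be checked mechanically by counting relative terms with their signs. The sketch below is an assumption-laden illustration: the function name and the type codes 'R' and 'A' are invented for the example, and only flat expressions using + and - (no parentheses, * or /) are handled.

```python
import re


def expression_type(expression: str, symbol_types: dict) -> str:
    """Classify a flat +/- expression as 'absolute', 'relative' or 'illegal'.

    symbol_types maps each symbol to 'R' (relative) or 'A' (absolute);
    numeric constants are always absolute.
    """
    text = expression.replace(" ", "")
    if any(ch in text for ch in "()*/"):
        raise ValueError("this sketch handles only flat + and - expressions")
    net_relative = 0
    for sign, term in re.findall(r"([+-]?)(\w+)", text):
        if term.isdigit() or symbol_types[term] == "A":
            continue                                  # absolute terms do not count
        net_relative += -1 if sign == "-" else 1      # relative term with its sign
    if net_relative == 0:
        return "absolute"    # every relative term is paired with an opposite sign
    if net_relative == 1:
        return "relative"    # exactly one unpaired, positive relative term
    return "illegal"


types = {"BUFFER": "R", "BUFEND": "R", "OPTAB": "R", "MAXLEN": "A"}
results = {e: expression_type(e, types) for e in
           ["BUFEND-BUFFER", "OPTAB+BUFEND-BUFFER", "BUFEND+BUFFER", "100-BUFFER", "MAXLEN+1"]}
```

Counting signed relative terms is equivalent to the pairing rule for expressions limited to addition and subtraction.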
Program Blocks:
Program blocks allow the generated machine instructions and data to appear in the object program in a different order from the source, by separating them into blocks for code, data, stack, and larger data areas.
Assembler Directive USE:
USE [blockname]
At the beginning, statements are assumed to be part of the unnamed (default) block. If no USE statements are included, the entire program belongs to this single block. Each program block may actually contain several separate segments of the source program. The assembler rearranges these segments to gather together the pieces of each block and assigns addresses so that the blocks appear in the object program in a particular order. In this way a large buffer area can be moved to the end of the object program, while the source stays readable because data areas can be placed close to the statements that reference them.
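The gathering of segments can be sketched by keeping one location counter per block during Pass 1 and assigning block starting addresses afterwards. The statement sizes, the block names CDATA and CBLKS, and the simplified statement format below are all assumptions made for illustration.

```python
def assign_block_addresses(statements):
    """statements: (label, kind, operand) tuples.

    kind 'USE' switches to the block named by operand ('' is the default block);
    kind 'BYTES' stands for any statement that occupies operand bytes.
    Returns final symbol addresses and block starting addresses.
    """
    counters = {"": 0}     # one location counter per block
    symbols = {}           # label -> (block, offset within block)
    current = ""
    for label, kind, operand in statements:
        if kind == "USE":
            current = operand
            counters.setdefault(current, 0)
            continue
        if label:
            symbols[label] = (current, counters[current])
        counters[current] += operand
    starts, address = {}, 0
    for block, length in counters.items():   # blocks in order of first appearance
        starts[block] = address
        address += length
    final = {label: starts[block] + offset for label, (block, offset) in symbols.items()}
    return final, starts


source = [
    ("FIRST", "BYTES", 6),
    (None, "USE", "CDATA"),
    ("RETADR", "BYTES", 3),
    (None, "USE", "CBLKS"),
    ("BUFFER", "BYTES", 4096),
    (None, "USE", ""),
    ("CLOOP", "BYTES", 9),
    (None, "USE", "CDATA"),
    ("LENGTH", "BYTES", 3),
]
addresses, block_starts = assign_block_addresses(source)
```

Although BUFFER appears in the middle of the source, it ends up after all code and small data items, because the CBLKS block is placed last.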
CONTROL SECTION:
A control section is a part of the program that maintains its identity after assembly; each control section can be
loaded and relocated independently of the others.
Different control sections are most often used for subroutines or other logical subdivisions. The programmer
can assemble, load, and manipulate each of these control sections separately. Because of this there should be
some means for linking control sections together.
For example, instructions in one control section may refer to the data or instructions of other control sections.
Since control sections are independently loaded and relocated, the assembler is unable to process these
references in the usual way. Such references between different control sections are called external references.
When a program is written using multiple control sections, the beginning of each control section is indicated by the assembler directive CSECT.
Syntax:
secname CSECT
– The assembler maintains a separate location counter for each control section.
Control sections differ from program blocks in that they are handled separately by the assembler. Symbols that are defined in one control section may not be used directly in another control section; they must be identified as external references for the loader to handle. External references are indicated by two assembler directives:
EXTDEF (external definition):
This statement in a control section names symbols that are defined in this section but may be used by other control sections. Control section names do not need to be named in EXTDEF, as they are automatically considered external symbols.
EXTREF (external reference):
This statement names symbols that are used in this control section but are defined elsewhere.
A define record gives information about the external symbols that are defined in this control section, i.e.,
symbols named by EXTDEF.
A refer record lists the symbols that are used as external references by the control section, i.e., symbols
named by EXTREF.
The new items in the modification record specify the modification to be performed: adding or subtracting the
value of some external symbol. The symbol used for modification can be defined either in this control section
or in another section.
OBJECT CODE:
In the case of Define, the record also indicates the relative address of each external symbol within the control
section.
For EXTREF symbols, no address information is available. These symbols are simply named in the Refer
record.
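As a rough illustration, the two records could be generated as below. The column layout is an assumption not given in the text: it follows the usual SIC/XE convention of the record type in column 1, names left-justified in 6 columns, and addresses as 6 hex digits. The symbol names and addresses are illustrative.

```python
def define_record(symbols):
    """symbols: list of (name, relative_address) pairs named by EXTDEF."""
    return "D" + "".join(f"{name:<6}{address:06X}" for name, address in symbols)


def refer_record(names):
    """names: list of symbols named by EXTREF; no addresses are known."""
    return "R" + "".join(f"{name:<6}" for name in names)


d_record = define_record([("BUFFER", 0x0036), ("BUFEND", 0x1036)])
r_record = refer_record(["RDREC", "WRREC"])
print(d_record)
print(r_record)
```

The Define record carries an address for every symbol, while the Refer record carries names only, which mirrors the difference described above.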
ASSEMBLER DESIGN:
Assembler designs are of two types: single-pass assemblers and two-pass assemblers.
Load-and-Go Assembler
∙ A load-and-go assembler generates its object code in memory for immediate execution.
∙ No object program is written out, and no loader is needed.
∙ It is useful in a system oriented toward frequent program development and testing.
∙ Programs are re-assembled nearly every time they are run, so the efficiency of the assembly process is an important consideration.
The status after scanning line 160, by which point the definitions of RDREC and ENDFIL have been encountered, is as given below:
A macro represents a commonly used group of statements in the source programming language.
∙ A macro instruction (macro) is a notational convenience for the programmer.
▪ It allows the programmer to write a shorthand version of a program (modular programming).
∙ The macro processor replaces each macro instruction with the corresponding group of source language statements (expanding).
▪ Normally, it performs no analysis of the text it handles.
▪ It is not concerned with the meaning of the involved statements during macro expansion.
body
:
MEND
▪ Body: the statements that will be generated as the expansion of the macro.
▪ The left block shows the macro definition, and the right block shows the expanded program, in which each macro call is replaced with its block of executable instructions.
▪ M1 is a macro with two parameters, D1 and D2. The macro stores the contents of register A in D1 and the contents of register B in D2. M1 is first invoked with the arguments DATA1 and DATA2, and a second time with DATA4 and DATA3. Every call of the macro is expanded into the executable statements.
The statement M1 DATA1, DATA2 is a macro invocation statement that gives the name of the macro instruction being invoked and the arguments (DATA1 and DATA2) to be used in expanding it. A macro invocation is also referred to as a macro call.
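The M1 example can be reproduced with a very small expander. This is a sketch under stated assumptions: the function name and the table layout are invented, the macro body is taken to be STA D1 followed by STB D2, parameters are substituted as whole tokens by position, and the invocation line is simply dropped from the output.

```python
def expand(source_lines, macros):
    """Expand macro calls in source_lines.

    macros maps a macro name to (parameter names, body lines).
    Lines that do not start with a macro name are copied unchanged.
    """
    output = []
    for line in source_lines:
        parts = line.split(None, 1)
        if parts and parts[0] in macros:
            params, body = macros[parts[0]]
            args = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
            binding = dict(zip(params, args))       # positional association
            for body_line in body:
                tokens = body_line.split()
                output.append(" ".join(binding.get(t, t) for t in tokens))
        else:
            output.append(line)
    return output


macros = {"M1": (["D1", "D2"], ["STA D1", "STB D2"])}
program = ["M1 DATA1, DATA2", "LDA ZERO", "M1 DATA4, DATA3"]
expanded = expand(program, macros)
```

Each call produces its own copy of the body, which is the key difference from a subroutine, whose statements appear only once.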
Macro Expansion:
∙ The program with macros is supplied to the macro processor. Each macro invocation statement will be
expanded into the statements that form the body of the macro, with the arguments from the macro
invocation substituted for the parameters in the macro prototype. During the expansion, the macro
definition statements are deleted since they are no longer needed.
∙ The arguments and the parameters are associated with one another according to their positions. The first argument in the macro invocation matches the first parameter in the macro prototype, and so on.
∙ After macro processing, the expanded file can become the input for the assembler. The macro invocation statement is treated as a comment, and the statements generated from the expansion are treated exactly as though they had been written directly by the programmer.
∙ The difference between macros and subroutines is that the statements from the body of a macro are expanded every time a macro invocation is encountered, whereas the statements of a subroutine appear only once no matter how many times the subroutine is called.
∙ Macro bodies are usually written without labels. If the body contained a label, every expansion would define it again, and the duplicate labels would be treated as errors by the assembler.
∙ Solutions:
▪ Use relative addressing instead of labels, for example JLT *-14
Eg:
System Software Tools
System software tools are used to process programs inside a computer system and make the execution of application software efficient. The following are the basic tools:
1. Translator
2. Assembler
3. Compiler
4. Interpreter
5. Pre processor
6. Linker
7. Loader
1. Translator
• A Translator is a system program that converts a program in one language to a program in another language.
T = Target Language
2. Assembler
• Assembler is a language translator which takes as input a program in assembly language of machine A and
3. Compiler
• A Compiler is a language translator that takes as input a source program in some HLL and converts it into a lower
• So, an HLL program is first compiled to generate an object file with machine-level instructions (i.e. compile time)
and then instructions in object file are executed (i.e. run time).
4. Interpreters
• An Interpreter is similar to a compiler, but one big difference is that it executes each line of source code as soon
as its equivalent machine code is generated. (This approach is different from a compiler, which compiles the
entire source code into an object file that is executed separately).
• If there are any errors during interpretation, they are notified immediately to the programmer and remaining
5. Pre-processors
of white spaces and comments) before the actual translation process can begin.
• So, input and output of pre-processor is generally the same HLL, only some additional functions have been
6. Linker
• A Linker (or a Linkage Editor) takes the object file, loads and compiles the external subroutines from the library
• A Compiler generates an object file after compiling the source code. But this object file cannot be executed
• This is because the main program may use separate subroutines in its code (locally defined in the program or
available globally as language subroutines). The external subroutines have not been compiled with the main
program and therefore their addresses are not known in the program.
7. Loaders
• A Loader does the job of coordinating with the OS to get the initial loading address for the program, prepares the
program for execution (i.e. generates an .exe file) and loads it at that address.
• Also, during the course of its execution, a program may be relocated to a different area of main memory by the
An important job of Loader is to modify these address-sensitive instructions, so that they run correctly after
relocation.