
Module Ii

This document provides an overview of assembler directives and machine-independent features related to assembly language programming. It covers instruction formats, addressing modes, program relocation, literals, and symbol-defining statements like EQU and ORG. Additionally, it explains how to manage memory allocation and the use of expressions in assembly language.

Uploaded by

arathysp18
Copyright
© © All Rights Reserved
MODULE II

ASSEMBLER:
In addition to the mnemonic machine instructions, we have used the following assembler directives:
– START: Specify the name and starting address of the program.
– END: Indicate the end of the program and specify the first executable instruction in the program.
– BYTE: Generate a character or hexadecimal constant, occupying as many bytes as needed to represent the constant.
– WORD: Generate a one-word integer constant.
– RESB: Reserve the indicated number of bytes for a data area.
– RESW: Reserve the indicated number of words for a data area.

MACHINE DEPENDENT ASSEMBLER FEATURES


∙ Instruction formats and addressing modes
∙ Program relocation

Instruction formats and addressing modes:


∙ The instruction formats depend on the memory organization and the size of the memory.
∙ Memory of a SIC/XE machine is 2^20 bytes (1 megabyte).
∙ Registers are A, X, L, PC, SW, B, S, T, F.
∙ SIC/XE supports four instruction formats:
▪ Format 1: 8 bits: OPCODE(8 bit) eg: FIX
▪ Format 2: 16 bits: OPCODE(8 bit) R1(4 bit) R2(4 bit) eg: COMPR A,S
▪ Format 3: 24 bits: OPCODE(6 bit) n i x b p e displacement(12 bit)
▪ Format 4: 32 bits: OPCODE(6 bit) n i x b p e address(20 bit)
- If e bit = 0, then format 3, otherwise format 4.
- If p bit = 1, then PC-relative addressing.
- If b bit = 1, then base-relative addressing.
- If x bit = 1, then the index register is used (indexed addressing).
- If n = 0 and i = 1, then immediate addressing; if n = 1 and i = 0, then indirect addressing; if n = 1 and i = 1, then simple addressing.
∙ Addressing Modes:
In SIC/XE, register-to-register instructions execute faster than register-to-memory instructions.
1. Translation of instructions involving Register-Register addressing mode:
During pass 1 the register names can be entered as part of the symbol table itself. The value stored for each register is its equivalent numeric code.
During pass 2, these values are assembled along with the object code of the mnemonic. If required, a separate table can be created with the register names and their equivalent numeric values.

Eg: COMPR A, S => A004


Register name (A, X, L, B, S, T, F, PC, SW) and their values (0,1, 2, 3, 4, 5, 6, 8, 9)
In OPTAB, opcode of COMPR is A0.
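
The register-to-register translation above can be sketched in a few lines of Python. This is only an illustration: the function name `assemble_format2` is hypothetical, the register table uses the codes listed above, and OPTAB is reduced to the single entry given in these notes.

```python
# Register codes as listed in the notes; OPTAB holds only COMPR here.
REGISTERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4, "T": 5, "F": 6, "PC": 8, "SW": 9}
OPTAB = {"COMPR": 0xA0}

def assemble_format2(mnemonic, r1, r2=None):
    """Return the 4-hex-digit object code of a format 2 instruction."""
    opcode = OPTAB[mnemonic]
    n1 = REGISTERS[r1]
    n2 = REGISTERS[r2] if r2 is not None else 0  # an unused second register is 0
    return f"{opcode:02X}{n1:X}{n2:X}"

print(assemble_format2("COMPR", "A", "S"))  # A004
```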

2. Translation involving Register-Memory instructions:


∙ Among the instruction formats, format 3 and format 4 instructions are of the register-memory type. One of the operands is always in a register and the other operand is in memory.
∙ In format 3 the target address is given relative to either the program counter (PC-relative) or the base register (base-relative). Format 4 is used when the full address must be given.
∙ In format 3, the instruction has the opcode followed by a 12-bit displacement value in the address field. In format 4, the instruction contains the opcode followed by a 20-bit address in the address field.
▪ Format 3: 24 bits: OPCODE(6 bit) n i x b p e displacement(12 bit)
▪ Format 4: 32 bits: OPCODE(6 bit) n i x b p e address(20 bit)

PROGRAM-COUNTER RELATIVE:
∙ The instruction contains the opcode followed by a 12-bit displacement value. For PC-relative addressing the displacement is a signed value in the range -2048 to 2047.
∙ This displacement (which must be small enough to fit in the 12-bit field) is added to the current contents of the program counter to get the target address of the operand required by the instruction. The program counter already points to the next instruction at that time, so the displacement of the operand is relative to the address of the next instruction.
∙ n = 1, i = 1: indicates neither indirect nor immediate addressing
∙ p = 1: indicates PC-relative addressing
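
As a rough illustration of the PC-relative calculation, the sketch below computes the 12-bit displacement field from a target address and the address of the next instruction. The function name and all sample addresses are assumptions chosen for illustration, and the range check assumes the usual signed 12-bit field (-2048 to 2047).

```python
def pc_relative_disp(target, next_instr_addr):
    """Return the 12-bit displacement field, or None if it does not fit."""
    disp = target - next_instr_addr      # PC already points to the next instruction
    if -2048 <= disp <= 2047:
        return disp & 0xFFF              # two's complement in 12 bits
    return None

# Hypothetical addresses, chosen only for illustration.
print(f"{pc_relative_disp(0x0030, 0x0003):03X}")  # 02D (forward reference)
print(f"{pc_relative_disp(0x0006, 0x001A):03X}")  # FEC (backward reference)
print(pc_relative_disp(0x0033, 0x1062))           # None (displacement too large)
```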

IMMEDIATE ADDRESSING
Convert the immediate operand to its internal representation and insert it into the instruction.
BASE RELATIVE ADDRESSING MODE:
∙ In this mode the displacement is taken relative to the contents of the base register. Therefore the target address is
TA = (base) + displacement value
∙ This addressing mode is used when the displacement needed for PC-relative addressing does not fit in the displacement field.
∙ The assembler is told what the base register contains by the directive BASE. After the assembler encounters this directive, it may use base-relative addressing to calculate the target address of an operand.
∙ The NOBASE directive indicates that the base register can no longer be used to calculate the target address of an operand. The assembler first tries PC-relative addressing; when the displacement does not fit, it uses base-relative addressing.
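Eg: 105F LDT LENGTH
The address of LENGTH is 0033. The program contains the directive BASE LENGTH, so (B) is also 0033.
PC-relative addressing cannot be used here: when the instruction executes, (PC) = 1062, and 0033 - 1062 = -102F (hex) does not fit in the 12-bit displacement field.
TA = (B) + DISP
0033 = 0033 + DISP
DISP = 000
The opcode of LDT is 74.
OPCODE   n i x b p e   DISPLACEMENT
011101   1 1 0 1 0 0   0000 0000 0000

The object code is 774000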
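
The choice between PC-relative and base-relative addressing can be sketched as follows. This is a minimal sketch, not a full assembler: the function name `assemble_format3` is hypothetical, and it handles only simple addressing (n = 1, i = 1) without indexing.

```python
def assemble_format3(opcode, target, pc, base=None):
    """Assemble a simple-addressing, non-indexed format 3 instruction.

    Tries PC-relative first, then base-relative if a BASE value is known.
    """
    first_byte = (opcode & 0xFC) | 0b11          # 6-bit opcode plus n=1, i=1
    disp = target - pc
    if -2048 <= disp <= 2047:
        flags = 0b0010                           # x=0 b=0 p=1 e=0
        disp &= 0xFFF
    elif base is not None and 0 <= target - base <= 4095:
        flags = 0b0100                           # x=0 b=1 p=0 e=0
        disp = target - base
    else:
        raise ValueError("displacement out of range; format 4 would be needed")
    return f"{first_byte:02X}{flags:X}{disp:03X}"

# LDT LENGTH at 105F: PC = 1062 after fetch, LENGTH = 0033, BASE LENGTH
print(assemble_format3(0x74, 0x0033, 0x1062, base=0x0033))  # 774000
```

Trying PC-relative first mirrors the order described above; the base register is consulted only when the PC-relative displacement does not fit.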

PROGRAM RELOCATION
Sometimes it is required to load and run several programs at the same time. The system must be able to load these programs wherever there is room in memory. Therefore the exact starting address is not known until load time.
Absolute program:
55 101B LDA THREE 00102D
In the above instruction the operand address is fixed at assembly time. This is called absolute assembly. The statement says that register A is loaded with the value stored at location 102D. Suppose it is decided to load and execute the program at location 2000 instead of location 1000. Then the value to be loaded into register A is no longer available at address 102D; its address shifts by the same amount as the program. Hence the address portion of the instruction must be changed before the program can be loaded and executed at location 2000.

Apart from the instructions whose operand addresses change as the load address changes, there are parts of the program that remain the same regardless of where the program is loaded. Since the assembler does not know the actual location where the program will be loaded, it cannot make the necessary changes to the addresses used in the program. However, the assembler identifies for the loader those parts of the program that need modification. An object program that carries the information necessary to perform this kind of modification is called a relocatable program.

The above diagram shows the concept of relocation. Initially the program is loaded at location 0000. The instruction JSUB is loaded at location 0006. The address field of this instruction contains 01036, which is the address of the instruction labelled RDREC. The second figure shows the program loaded at location 5000: the address field of the JSUB instruction changes to 06036, giving 4B106036. Likewise the third figure shows that if the program is relocated to location 7420, the JSUB instruction must be changed to 4B108456, which corresponds to the new address of RDREC.

The only parts of the program that require modification at load time are those that specify direct addresses. The rest of the instructions need not be modified: these are the instructions whose operand is not a memory address (immediate addressing) and the instructions that use PC-relative or base-relative addressing. From the object program alone it is not possible to distinguish an address from a constant, so the assembler must pass this information to the loader. The object program that contains the Modification records is called a relocatable program.
For an address label, the address is assigned relative to the start of the program (START 0). The assembler produces a Modification record that stores the starting location and the length of the address field to be modified. This command for the loader is part of the object program. The Modification record has the following format:
Modification record
Col. 1 M
Col. 2-7 Starting location of the address field to be modified, relative to the beginning of the program (Hex)
Col. 8-9 Length of the address field to be modified, in half-bytes (Hex)

One Modification record is created for each address to be modified. The length is stored in half-bytes (4 bits). The starting location is the location of the byte containing the leftmost bits of the address field to be modified. If the field contains an odd number of half-bytes, the field begins in the middle of the first byte at the starting location.

In the above object code the red boxes indicate the addresses that need modification. The object code lines at the end are the Modification records for those instructions which need to change if relocation occurs. M00000705 is the Modification record for the address field that starts at location 000007 and is
5 half-bytes long.
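
To make the record format concrete, here is a hedged sketch of what a loader might do with one Modification record. The function name `apply_modification` and the in-memory layout (a `bytearray` assembled for address 0) are assumptions; the JSUB bytes 4B101036 at location 0006 and the record M00000705 come from the example above.

```python
def apply_modification(memory, record, load_address):
    """Add load_address to the address field named by one Modification record.

    memory is a bytearray holding the program as assembled for address 0.
    """
    start = int(record[1:7], 16)        # Col. 2-7
    half_bytes = int(record[7:9], 16)   # Col. 8-9
    nbytes = (half_bytes + 1) // 2
    field = int.from_bytes(memory[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1  # odd length: leave the first half-byte alone
    new_field = (field & ~mask) | (((field & mask) + load_address) & mask)
    memory[start:start + nbytes] = new_field.to_bytes(nbytes, "big")

# Bytes 0-5 are placeholders; 4B101036 is the JSUB instruction at location 0006.
program = bytearray(6) + bytes.fromhex("4B101036")
apply_modification(program, "M00000705", 0x5000)
print(program[6:10].hex().upper())  # 4B106036
```

The mask keeps the half-byte that belongs to the flag bits untouched, which is why a 5 half-byte field can start in the middle of a byte.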

MACHINE INDEPENDENT ASSEMBLER FEATURES:


These are the features which do not depend on the architecture of the machine.
These are: literals, symbol-defining statements, expressions, program blocks and control sections.

Literals:
Write the value of a constant operand as a part of the instruction that uses it. Such an operand is called a literal because the value is stated literally.
A literal is identified with the prefix ‘=’.
This avoids having to define the constant elsewhere in the program and make up a label for it.

Eg: LDA =C’EOF’
is similar to LDA EOF, where the constant is defined elsewhere in the program as EOF BYTE C’EOF’.

Difference between a constant defined as a literal and a constant defined as an immediate operand.
Immediate Operands
The operand value is assembled as part of the machine instruction
e.g. 55 0020 LDA #3 010003
Literals
The assembler generates the specified value as a constant at some other memory location
e.g. 45 001A ENDFIL LDA =C’EOF’ 032010

∙ All the literal operands used in a program are gathered together into one or more literal pools.
∙ Normally the literal pool is placed at the end of the program. In some cases it is placed at some other location in the object program.
∙ Whenever the assembler directive LTORG is encountered, the assembler creates a literal pool that contains all the literal operands used since the previous LTORG (or since the beginning of the program). This keeps the literals close to the instructions that use them.
∙ A literal table (LITTAB) is created for the literals which are used in the program. For each literal it contains the literal name, the operand value and length, and the address assigned to the operand.

IMPLEMENTATION OF LITERALS:
During Pass-1:
∙ Each literal encountered is searched for in the literal table (LITTAB). If the literal already exists, no action is taken; if it is not present, the literal is added to LITTAB, and its address is left unassigned until an LTORG statement (or the end of the program) is reached.
∙ When Pass 1 encounters an LTORG statement, each literal currently in the table is assigned an address.
During Pass-2:
∙ The assembler searches LITTAB for each literal operand encountered in an instruction and uses the address assigned to it. The literal values themselves are inserted into the object program as if they had been generated by BYTE or WORD statements.
∙ If a literal represents an address in the program, the assembler must generate a Modification record for it, since it is affected by relocation. The following figure shows the difference between the
SYMTAB and LITTAB
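
The Pass 1 handling of literals can be sketched as below. This is an illustrative data structure, not the table layout of any particular assembler: the class name `LitTab`, its method names, and the LOCCTR value used at LTORG are assumptions, and only C'...' and X'...' literals are handled.

```python
def literal_value(literal):
    """Return the bytes a literal such as =C'EOF' or =X'05' stands for."""
    kind, text = literal[1], literal[3:-1]
    return text.encode("ascii") if kind == "C" else bytes.fromhex(text)

class LitTab:
    def __init__(self):
        self.entries = {}   # literal name -> {"value", "length", "address"}

    def reference(self, literal):
        """Pass 1: add the literal on first use; its address is assigned later."""
        if literal not in self.entries:
            value = literal_value(literal)
            self.entries[literal] = {"value": value, "length": len(value), "address": None}

    def place_pool(self, locctr):
        """On LTORG or END: assign addresses to unplaced literals; return new LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr

littab = LitTab()
littab.reference("=C'EOF'")
littab.reference("=C'EOF'")        # duplicate use: no new entry
littab.reference("=X'05'")
end = littab.place_pool(0x002D)    # hypothetical LOCCTR value at LTORG
print(len(littab.entries), hex(end))
```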

SYMBOL- DEFINING STATEMENTS:


EQU STATEMENT:
Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values. The directive used for this is EQU (Equate).
The general form of the statement is
Symbol EQU value
The value can be a constant or an expression involving constants and any other symbols that are already defined.
Eg: +LDT #4096
This loads register T with the immediate value 4096, but it does not make clear what this value represents. If a statement is included as:
MAXLEN EQU 4096
And then +LDT #MAXLEN
it is clear that the value is some maximum length. When the assembler encounters the EQU statement, it enters the symbol MAXLEN along with its value in the symbol table.
While assembling the LDT instruction, the assembler searches SYMTAB for MAXLEN and uses its value as the operand in the instruction.
∙ Another common usage of the EQU statement is defining mnemonic names for registers, such as A for the accumulator and X for the index register. The programmer can assign the numeric values to these names using the EQU directive:
A EQU 0
X EQU 1
L EQU 2

These statements will cause the symbols A, X, L… to be entered into the symbol table with their respective values.
∙ As another usage, consider a machine that has many general-purpose registers named R1, R2, …; some may be used as base registers and some as accumulators. Their usage may change from one program to another. In this case the usage can be defined with EQU statements:
BASE EQU R1
INDEX EQU R2
COUNT EQU R3
∙ One restriction on the use of EQU is that every symbol on the right-hand side must already be defined. For example, the following statements are not valid:
BETA EQU ALPHA
ALPHA RESW 1
∙ The symbol ALPHA is assigned to BETA before ALPHA is defined, so the value of ALPHA is not yet known when the EQU statement is processed.
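
A small sketch of how EQU might be processed, including the restriction on forward references. The function name `define_equ` is hypothetical, expressions are limited to terms joined by + and -, and the value given to STAB is an arbitrary illustrative address.

```python
import re

def define_equ(symtab, symbol, expression):
    """Enter symbol into symtab with the value of a simple +/- expression.

    Raises ValueError if the expression uses a symbol that is not yet defined.
    """
    value, sign = 0, 1
    for token in re.findall(r"[A-Za-z]\w*|\d+|[+-]", expression):
        if token in "+-":
            sign = 1 if token == "+" else -1
        elif token.isdigit():
            value += sign * int(token)
        elif token in symtab:
            value += sign * symtab[token]
        else:
            raise ValueError(f"{token} used before definition")
    symtab[symbol] = value

symtab = {"STAB": 0x1000}                 # hypothetical address of STAB
define_equ(symtab, "MAXLEN", "4096")
define_equ(symtab, "VALUE", "STAB+6")
try:
    define_equ(symtab, "BETA", "ALPHA")   # ALPHA is defined only later
except ValueError as err:
    print(err)
```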

ORG Statement:
This directive can be used to indirectly assign values to symbols. The directive is usually called ORG (for origin). Its general format is:
ORG value
where value is a constant or an expression involving constants and previously defined symbols. When this statement is encountered during assembly of a program, the assembler resets its location counter (LOCCTR) to the specified value.
Since the values of symbols used as labels are taken from LOCCTR, the ORG statement affects the values of all labels defined until the next ORG is encountered. ORG is used to control the assignment of storage in the object program. Careless alteration of LOCCTR may result in incorrect assembly.
Suppose we need to define a symbol table in which each entry has the following structure:
SYMBOL 6 Bytes
VALUE 3 Bytes
FLAGS 2 Bytes
Eg: Each entry occupies 11 bytes, so the space for a table of 100 entries can be reserved by the statement:
STAB RESB 1100
If we want to refer to the entries of the table using indexed addressing, we place the offset of the desired entry from the beginning of the table in the index register. To refer to the fields SYMBOL, VALUE, and FLAGS individually, we need to assign their values first as shown below:
SYMBOL EQU STAB
VALUE EQU STAB+6
FLAGS EQU STAB+9

The same thing can also be done using the ORG statement in the following way:
STAB RESB 1100
ORG STAB
SYMBOL RESB 6
VALUE RESW 1
FLAGS RESB 2
ORG STAB+1100

The first statement allocates 1100 bytes of memory to the label STAB. The first ORG statement resets the location counter to the value of STAB, so LOCCTR now points to the beginning of the table. The next three lines define SYMBOL, VALUE and FLAGS at the proper offsets within the first entry. The last ORG statement sets LOCCTR back to the first address after
the table STAB (i.e., STAB+1100).
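
The effect of ORG on LOCCTR can be traced with a short simulation. This is a sketch under simplifying assumptions: the function name `pass1_locctr` is hypothetical, only RESB, RESW and ORG are handled, an ORG operand is either a symbol or symbol+number, and the starting address 1000 (hex) is arbitrary.

```python
def pass1_locctr(statements, start=0):
    """Assign label values from LOCCTR for RESB, RESW and ORG statements."""
    symtab, locctr = {}, start
    for label, op, operand in statements:
        if op == "ORG":
            base, _, offset = operand.partition("+")
            locctr = symtab[base] + (int(offset) if offset else 0)
            continue
        if label:
            symtab[label] = locctr
        locctr += int(operand) * (3 if op == "RESW" else 1)  # a SIC word is 3 bytes
    return symtab, locctr

source = [
    ("STAB", "RESB", "1100"),
    (None, "ORG", "STAB"),
    ("SYMBOL", "RESB", "6"),
    ("VALUE", "RESW", "1"),
    ("FLAGS", "RESB", "2"),
    (None, "ORG", "STAB+1100"),
]
symtab, locctr = pass1_locctr(source, start=0x1000)
for name, value in symtab.items():
    print(name, f"{value:04X}")
```

The resulting values of SYMBOL, VALUE and FLAGS equal STAB, STAB+6 and STAB+9, matching the EQU version of the same definitions.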

EXPRESSIONS:
∙ Assemblers also allow the use of expressions in place of operands in an instruction.
∙ Each such expression must be evaluated to generate a single operand value or address.
∙ Assemblers generally allow arithmetic expressions formed according to the normal rules using the arithmetic operators +, -, *, /. Division is usually defined to produce an integer result. Individual terms may be constants, user-defined symbols, or special terms.
∙ The only special term used is * (the current value of the location counter), which indicates the value of the next unassigned memory location.
∙ Thus the statement
BUFEND EQU *
assigns to BUFEND a value that is the address of the next byte following the buffer area.
∙ Expressions are classified as either absolute expressions or relative expressions depending on the type of value they produce.

NOTE:
If the result of the expression is an absolute value (constant), it is known as an absolute expression.
Eg: BUFEND – BUFFER

If the result of the expression is relative to the beginning of the program, it is known as a relative expression. Labels on instructions and data areas, and references to the location counter value, are relative terms.
Eg: OPTAB + (BUFEND – BUFFER)
An expression such as BUFEND + BUFFER is neither absolute nor relative, because two relative terms are added; the assembler treats it as an error.
1) Absolute Expressions: An expression that uses only absolute terms is an absolute expression. An absolute expression may also contain relative terms, provided the relative terms occur in pairs and the terms of each pair have opposite signs.
Eg: MAXLEN EQU BUFEND-BUFFER
Both BUFEND and BUFFER are relative terms. The expression represents an absolute value: the difference between the two addresses.
The difference does not depend on the location of the program, so the expression gives an absolute value regardless of relocation. An expression can also consist of absolute terms only.
Eg: MAXLEN EQU 1000
2) Relative Expressions: All the relative terms except one can be paired as described under “absolute”. The remaining unpaired relative term must have a positive sign.
Eg: STAB EQU OPTAB + (BUFEND – BUFFER)
Handling the type of expressions: to find the type of an expression, we must keep track of the types of the symbols used. This can be achieved by recording the type against each symbol in the symbol table, as shown
in the table below:
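
The pairing rule for relative terms can be checked by counting signed relative terms. The sketch below is only an illustration: the function name `expression_type` and the 'R'/'A' type flags are assumptions, and it supports only + and - without a minus sign in front of a parenthesized group.

```python
import re

def expression_type(expression, symbol_types):
    """Classify a +/- expression as 'absolute' or 'relative'.

    symbol_types maps each symbol to 'R' (relative) or 'A' (absolute);
    numeric constants are absolute. Raises ValueError for illegal mixes.
    """
    net_relative, sign = 0, 1
    for token in re.findall(r"[A-Za-z]\w*|\d+|[+-]", expression):
        if token == "+":
            sign = 1
        elif token == "-":
            sign = -1
        elif not token.isdigit() and symbol_types[token] == "R":
            net_relative += sign     # +1 for a positive relative term, -1 for a negative one
    if net_relative == 0:
        return "absolute"
    if net_relative == 1:
        return "relative"
    raise ValueError("neither absolute nor relative")

types = {"BUFFER": "R", "BUFEND": "R", "OPTAB": "R", "MAXLEN": "A"}
print(expression_type("BUFEND - BUFFER", types))            # absolute
print(expression_type("OPTAB + (BUFEND - BUFFER)", types))  # relative
```

A net count of zero means every relative term is paired with one of opposite sign; a net count of one means exactly one positive relative term is left over.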
Program Blocks:
Program blocks allow the generated machine instructions and data to appear in the object program in a different order from the corresponding source statements, by separating the program into blocks for code, data, stack, and larger data areas.
Assembler Directive USE:
USE [blockname]
At the beginning, statements are assumed to be part of the unnamed (default) block. If no USE statements are included, the entire program belongs to this single block. Each program block may actually contain several separate segments of the source program. The assembler rearranges these segments to gather together the pieces of each block and assigns addresses, placing the blocks one after another in a particular order. In this way a large buffer area can be moved to the end of the object program, while the data areas stay in the source program close to the statements that reference them, which makes the program more readable.

In the example below three blocks are used :


Default: executable instructions
CDATA: all data areas that are a few words or less in length
CBLKS: all data areas that consist of larger blocks of memory
Arranging code into program blocks:
Pass 1
• A separate location counter is maintained for each program block.
• Save and restore LOCCTR when switching between blocks.
• At the beginning of a block, LOCCTR is set to 0.
• Assign each label an address relative to the start of its block.
• Store the block name or number in SYMTAB along with the assigned relative address of the label.
• At the end of Pass 1, the latest value of LOCCTR for each block indicates the length of that block.
• Assign to each block a starting address in the object program by concatenating the program blocks in a particular order.
Pass 2
• Calculate the address of each symbol relative to the start of the object program by adding
∙ The location of the symbol relative to the start of its block
∙ The starting address of this block
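
The two passes above can be sketched as follows. The function names, the input format (each statement reduced to a label and a size in bytes, or a USE switch) and the sample statement sizes are all assumptions made for illustration; blocks are concatenated in order of first use.

```python
def layout_blocks(statements):
    """Pass 1 sketch: items are (label, size_in_bytes) or ("USE", block_name)."""
    locctr = {"": 0}                 # one location counter per block; "" is the default
    symtab, current = {}, ""
    for label, arg in statements:
        if label == "USE":
            current = arg
            locctr.setdefault(current, 0)   # a new block starts at 0, an old one resumes
            continue
        if label:
            symtab[label] = (current, locctr[current])
        locctr[current] += arg
    start, next_free = {}, 0
    for name, length in locctr.items():     # concatenate blocks in order of first use
        start[name] = next_free
        next_free += length
    return symtab, start

def absolute_address(symtab, start, label):
    """Pass 2: block starting address plus the label's address within its block."""
    block, offset = symtab[label]
    return start[block] + offset

source = [
    ("FIRST", 3), (None, 3),
    ("USE", "CDATA"), ("RETADR", 3), ("LENGTH", 3),
    ("USE", "CBLKS"), ("BUFFER", 4096),
    ("USE", ""), ("ENDFIL", 3),
]
symtab, start = layout_blocks(source)
print(absolute_address(symtab, start, "LENGTH"))  # 12
```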

CONTROL SECTION:

A control section is a part of the program that maintains its identity after assembly; each control section can be
loaded and relocated independently of the others.

Different control sections are most often used for subroutines or other logical subdivisions. The programmer
can assemble, load, and manipulate each of these control sections separately. Because of this there should be
some means for linking control sections together.

For example, instructions in one control section may refer to the data or instructions of other control sections.
Since control sections are independently loaded and relocated, the assembler is unable to process these
references in the usual way. Such references between different control sections are called external references.

When a program is written using multiple control sections, the beginning of each of the control section is
indicated by an assembler directive “CSECT”

Syntax:
secname CSECT
The assembler maintains a separate location counter for each control section.
Control sections differ from program blocks in that they are handled separately by the assembler. Symbols that are defined in one control section may not be used directly in another control section; they must be identified as external references for the loader to handle. External references are indicated by two assembler directives:
EXTDEF (external Definition):
This statement in a control section names symbols that are defined in this section but may be used by other control sections. Control section names do not need to be named in an EXTDEF statement, as they are automatically considered to be external symbols.

EXTREF (external Reference):


It names symbols that are used in this section but are defined in some other control section. The order in which these symbols are listed is not significant. The assembler must include information about the external references in the object program that will cause the loader to insert the proper values where they are required. For this purpose the assembler uses two new record types in the object program and a changed version of the Modification record.
Define record (EXTDEF)
• Col. 1 D
• Col. 2-7 Name of external symbol defined in this control section
• Col. 8-13 Relative address within this control section (hexadecimal)
• Col. 14-73 Repeat information in Col. 2-13 for other external symbols

Refer record (EXTREF)
• Col. 1 R
• Col. 2-7 Name of external symbol referred to in this control section
• Col. 8-73 Names of other external reference symbols

Modification record
• Col. 1 M
• Col. 2-7 Starting address of the field to be modified, relative to the beginning of the control section (hexadecimal)
• Col. 8-9 Length of the field to be modified, in half-bytes (hexadecimal)
• Col. 10 Modification flag (+ or -)
• Col. 11-16 External symbol whose value is to be added to or subtracted from the indicated field

A define record gives information about the external symbols that are defined in this control section, i.e.,
symbols named by EXTDEF.
A refer record lists the symbols that are used as external references by the control section, i.e., symbols
named by EXTREF.
The new items in the modification record specify the modification to be performed: adding or subtracting the
value of some external symbol. The symbol used for modification can be defined either in this control section
or in another section.
OBJECT CODE:
In the case of Define, the record also indicates the relative address of each external symbol within the control
section.
For EXTREF symbols, no address information is available. These symbols are simply named in the Refer
record.
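
The three record layouts can be illustrated with small formatting helpers. This is a sketch only: the function names are hypothetical, the symbol names and addresses in the usage lines are illustrative values rather than output taken from these notes, the 73-column limit on a record is not enforced, and the sign in the Modification record is written directly before the symbol name.

```python
def define_record(symbols):
    """symbols: list of (name, relative_address) pairs named by EXTDEF."""
    return "D" + "".join(f"{name:<6}{addr:06X}" for name, addr in symbols)

def refer_record(names):
    """names: list of symbols named by EXTREF, each padded to 6 columns."""
    return "R" + "".join(f"{name:<6}" for name in names)

def modification_record(start, half_bytes, sign, symbol):
    """Revised Modification record with a +/- flag and an external symbol."""
    return f"M{start:06X}{half_bytes:02X}{sign}{symbol}"

# Illustrative values only.
print(define_record([("BUFFER", 0x33), ("BUFEND", 0x1033), ("LENGTH", 0x2D)]))
print(refer_record(["RDREC", "WRREC"]))
print(modification_record(0x4, 5, "+", "RDREC"))
```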

ASSEMBLER DESIGN:
Assembler designs are of two types: single-pass assemblers and two-pass assemblers.

Single pass assembler

∙ A single-pass assembler is used when:
o it is necessary or desirable to avoid a second pass over the source program
o the external storage for the intermediate file between two passes is slow or is inconvenient to use
∙ Main problem: forward references to both data and instructions
∙ One simple way to reduce this problem is to eliminate forward references to data items by defining all the storage reservation statements at the beginning of the program rather than at the end.
∙ Unfortunately, forward references to labels on instructions cannot be avoided (forward jumps).
∙ There are two types of one-pass assemblers:
▪ One that produces object code directly in memory for immediate execution (load-and-go assemblers).
▪ The other type produces the usual kind of object code for later execution.

Load-and-Go Assembler
∙ A load-and-go assembler generates its object code in memory for immediate execution.
∙ No object program is written out, and no loader is needed.
∙ It is useful in a system with frequent program development and testing.
∙ Programs are re-assembled nearly every time they are run, so the efficiency of the assembly process is an important consideration.

Forward Reference in One-Pass Assemblers:
In load-and-go assemblers, when a forward reference is encountered, the assembler:
• Omits the operand address if the symbol has not yet been defined
• Enters this undefined symbol into SYMTAB and indicates that it is undefined
• Adds the address of this operand field to a list of forward references associated with the SYMTAB entry
• When the definition for the symbol is encountered, scans the reference list and inserts the address into each listed operand field
• At the end of the program, reports an error if there are still SYMTAB entries marked as undefined symbols
• For a load-and-go assembler
o Searches SYMTAB for the symbol named in the END statement and jumps to this location to begin execution if there is no error
After scanning line 40 of the program:
40 2021 J CLOOP 3C2012
The status is that up to this point the symbol RDREC is referred to once at location 2013, ENDFIL at 201C and WRREC at location 201F. None of these symbols is defined yet. The figure shows how the pending definitions along with their addresses are included in the symbol table.

The status after scanning line 160, which has encountered the definitions of
RDREC and ENDFIL is as given below:
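
The bookkeeping for forward references in a load-and-go assembler can be sketched as below. The class and method names are hypothetical, memory is modelled as a dictionary from operand-field address to address value, and the definition address 203D and the symbol UNDEF are illustrative assumptions; only the reference to RDREC at 2013 comes from the example above.

```python
class LoadAndGo:
    def __init__(self):
        self.memory = {}   # operand-field address -> address value stored there
        self.symtab = {}   # symbol -> address, or list of pending references

    def reference(self, symbol, operand_addr):
        """Use symbol as the operand of the instruction field at operand_addr."""
        entry = self.symtab.setdefault(symbol, [])
        if isinstance(entry, list):          # still undefined: remember the reference
            entry.append(operand_addr)
            self.memory[operand_addr] = 0    # operand address omitted for now
        else:
            self.memory[operand_addr] = entry

    def define(self, symbol, address):
        """Record the definition and patch every pending forward reference."""
        pending = self.symtab.get(symbol, [])
        if not isinstance(pending, list):
            raise ValueError(f"duplicate symbol {symbol}")
        for operand_addr in pending:
            self.memory[operand_addr] = address
        self.symtab[symbol] = address

    def undefined(self):
        """Symbols still undefined at the end of the program."""
        return [s for s, e in self.symtab.items() if isinstance(e, list)]

asm = LoadAndGo()
asm.reference("RDREC", 0x2013)     # forward reference
asm.reference("UNDEF", 0x2040)     # hypothetical symbol that is never defined
asm.define("RDREC", 0x203D)        # hypothetical definition address
print(f"{asm.memory[0x2013]:04X}", asm.undefined())
```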

If a one-pass assembler needs to generate object code:

• If the operand contains an undefined symbol, use 0 as the address and write the Text record to the object program.
• Forward references are entered into lists as in the load-and-go assembler.
• When the definition of a symbol is encountered, the assembler generates another Text record with the correct operand address for each entry in the reference list.
• When the program is loaded, the incorrect address 0 is overwritten by the later Text record that carries the address from the symbol definition.
Object Code Generated by One-Pass Assembler:
Multi Pass Assembler:
• For a two-pass assembler, forward references in symbol definitions are not allowed:
ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1
o Symbol definition must be completed in pass 1.
• Prohibiting forward references in symbol definitions is not a serious inconvenience.
o Forward references tend to create difficulty for a person reading the program.
Implementation Issues for Modified Two-Pass Assembler:


Implementation Issues when forward referencing is encountered in Symbol Defining statements:
• For a forward reference in a symbol definition, we store in SYMTAB:
o The symbol name
o The defining expression
o The number of undefined symbols in the defining expression
• Each undefined symbol (marked with a flag *) is associated with a list of the symbols whose values depend on it.
• When a symbol is defined, we can recursively evaluate the symbol expressions that depend on the newly
defined symbol.
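
The dependency lists described above can be sketched for the simplest case. This is an illustration under strong assumptions: the function name is hypothetical, each defining expression is either a number or a single symbol (so the count of undefined symbols is always one and is not stored), and the address 1000 (hex) given to DELTA is arbitrary.

```python
def resolve_definitions(statements):
    """statements: (symbol, operand) pairs; operand is a number or another symbol.

    Returns the final symbol values, resolving forward references as
    definitions arrive.
    """
    values = {}    # symbol -> known value
    waiting = {}   # undefined symbol -> symbols whose definition depends on it

    def define(symbol, value):
        values[symbol] = value
        for dependent in waiting.pop(symbol, []):   # re-evaluate dependents
            define(dependent, value)

    for symbol, operand in statements:
        if isinstance(operand, int):
            define(symbol, operand)
        elif operand in values:
            define(symbol, values[operand])
        else:
            waiting.setdefault(operand, []).append(symbol)
    if waiting:
        raise ValueError(f"undefined symbols: {sorted(waiting)}")
    return values

# ALPHA EQU BETA / BETA EQU DELTA / DELTA RESW 1 (DELTA at a hypothetical address)
result = resolve_definitions([("ALPHA", "BETA"), ("BETA", "DELTA"), ("DELTA", 0x1000)])
print(result)
```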
MACROPROCESSOR:

A Macro represents a commonly used group of statements in the source programming language.
∙ A macro instruction (macro) is a notational convenience for the programmer.
▪ It allows the programmer to write a shorthand version of a program (module programming).
∙ The macro processor replaces each macro instruction with the corresponding group of source language statements (expanding).
▪ Normally, it performs no analysis of the text it handles.
▪ It is not concerned with the meaning of the statements involved during macro expansion.
∙ The design of a macro processor is generally machine independent.

∙ Two new assembler directives are used in macro definition

▪ MACRO: identify the beginning of a macro definition

▪ MEND: identify the end of a macro definition

∙ Prototype for the macro

▪ Each parameter begins with ‘&’

Syntax: macroname MACRO parameters

body
:
MEND

▪ Body: the statements that will be generated as the expansion of the macro.

BASIC MACRO PROCESSOR FUNCTIONS:


▪ Macro Definition and Expansion
▪ Macro processor Algorithms and Data Structures.

Macro Definition and Expansion:

▪ The left block shows the MACRO definition and the right block shows the expanded macro, with each MACRO call replaced by its block of executable instructions.
▪ M1 is a macro with two parameters D1 and D2. The MACRO stores the contents of register A in D1 and the contents of register B in D2. M1 is first invoked with the arguments DATA1 and DATA2, and a second time with DATA4 and DATA3. Every call of the MACRO is expanded into the executable statements.

The statement M1 DATA1, DATA2 is a macro invocation statement that gives the name of the macro instruction being invoked (M1) and the arguments (DATA1 and DATA2) to be used in expanding it. A macro invocation is also referred to as a macro call.

Macro Expansion:
∙ The program with macros is supplied to the macro processor. Each macro invocation statement will be
expanded into the statements that form the body of the macro, with the arguments from the macro
invocation substituted for the parameters in the macro prototype. During the expansion, the macro
definition statements are deleted since they are no longer needed.

∙ The arguments and the parameters are associated with one another according to their positions. The first
argument in the macro matches with the first parameter in the macro prototype and so on.

∙ After macro processing the expanded file can become the input for the Assembler. The Macro Invocation
statement is considered as comments and the statement generated from expansion is treated exactly as
though they had been written directly by the programmer.

∙ The difference between macros and subroutines is that the statements from the body of a macro are expanded every time a macro invocation is encountered, whereas the statements of a subroutine appear only once no matter how many times the subroutine is called. Macro instructions
will be written so that the body of the macro contains no labels.
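
Macro definition and positional expansion can be sketched as below. This is a simplified one-pass illustration, not the table-driven algorithm of a real macro processor: the function name is hypothetical, source lines are modelled as (label, opcode, operands) tuples, the body of M1 (STA and STB) is inferred from the description above, and parameter substitution is plain text replacement, which would misbehave if one parameter name were a prefix of another.

```python
def expand_macros(lines):
    """Expand MACRO ... MEND definitions; lines are (label, opcode, operands)."""
    macros, output, current = {}, [], None
    for label, opcode, operands in lines:
        if opcode == "MACRO":                    # prototype line starts a definition
            current = macros[label] = (operands.split(","), [])
        elif opcode == "MEND":
            current = None
        elif current is not None:
            current[1].append((label, opcode, operands))  # body line: store, do not emit
        elif opcode in macros:
            params, body = macros[opcode]
            args = operands.split(",")
            # Keep the invocation statement as a comment line.
            output.append((".", f"{label} {opcode} {operands}".strip(), ""))
            for blabel, bop, boperands in body:
                for param, arg in zip(params, args):      # positional substitution
                    boperands = boperands.replace(param, arg)
                output.append((blabel, bop, boperands))
        else:
            output.append((label, opcode, operands))
    return output

source = [
    ("M1", "MACRO", "&D1,&D2"),
    ("", "STA", "&D1"),
    ("", "STB", "&D2"),
    ("", "MEND", ""),
    ("", "M1", "DATA1,DATA2"),
    ("", "M1", "DATA4,DATA3"),
]
for line in expand_macros(source):
    print(line)
```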

∙ Problem of labels in the body of a macro:
▪ If the same macro is expanded multiple times at different places in the program, there will be duplicate labels, which will be treated as errors by the assembler.
∙ Solutions:
▪ Do not use labels in the body of the macro.
▪ Explicitly use PC-relative addressing instead.
∙ Ex, in RDBUFF and WRBUFF macros,
▪ JEQ *+11
▪ JLT *-14
∙ This is inconvenient and error-prone.

System Software Tools
System software tools are used to process programs inside a computer system and to make the execution of application software efficient. The following are the basic tools:

1. Translator

2. Assembler
3. Compiler

4. Interpreter

5. Pre processor

6. Linker

7. Loader

1. Translator
• A Translator is a system program that converts a program in one language to a program in another language.

• A Translator can be denoted by following symbol: STTBSTBT

Where S = Source Language

T = Target Language

B = Base Language in which the translator is written

2. Assembler

• Assembler is a language translator which takes as input a program in assembly language of machine A and

generates its equivalent machine code for machine A, and the

3. Compiler
• A Compiler is a language translator that takes as input a source program in some HLL and converts it into a lower level language (i.e. machine or assembly language).

• So, an HLL program is first compiled to generate an object file with machine-level instructions (i.e. compile time) and then the instructions in the object file are executed (i.e. run time).

4. Interpreters

• An Interpreter is similar to a compiler, but one big difference is that it executes each line of source code as soon as its equivalent machine code is generated. (This approach is different from a compiler, which compiles the entire source code into an object file that is executed separately.)

• If there are any errors during interpretation, they are notified immediately to the programmer and the remaining source code lines are not processed.

• The main advantage of interpreters is that, since an error is notified immediately to the programmer, debugging becomes a lot easier.

5. Pre-processors

• A Pre-processor converts one HLL into another HLL.

• Typically pre-processors are seen as system software used to perform some additional functions (such as removal of white spaces and comments) before the actual translation process can begin.

• So, the input and output of a pre-processor are generally the same HLL; only some additional functions have been performed on the source.

6. Linker

• A Linker (or a Linkage Editor) takes the object file, brings in the external subroutines it needs from the library, and resolves their external references in the main program.

• A Compiler generates an object file after compiling the source code. But this object file cannot be executed immediately after it gets generated.

• This is because the main program may use separate subroutines in its code (locally defined in the program or available globally as language subroutines). The external subroutines have not been compiled with the main program and therefore their addresses are not known in the program.

7. Loaders
• A Loader does the job of coordinating with the OS to get the initial loading address for the program, prepares the program for execution and loads it at that address.

• Also, during the course of its execution, a program may be relocated to a different area of main memory by the OS (when memory is needed for other programs).

• An important job of the loader is to modify the address-sensitive instructions of the program so that they run correctly after relocation.
