Module3 Part2

Module III covers assembler design options, focusing on machine-independent features such as literals, symbol defining statements, and expressions. It explains the implementation of literals, including their definition, usage, and how they are managed in literal tables. Additionally, it discusses program blocks and their significance in organizing code and data within the assembler's output.

Uploaded by

THEJALAKSHMI P K

Module III System Software(S5 CSE)

MODULE III
• Assembler design options:
o Machine Independent assembler features
▪ Literals
▪ Symbol Defining Statements
▪ Expressions
▪ Program blocks
▪ Control sections
o Assembler design options
▪ Algorithm for Single Pass assembler
▪ Multi pass assembler
o Implementation example of MASM Assembler

• Machine Independent assembler features


o The following features do not depend on the architecture of the machine:
▪ Literals
▪ Symbol Defining Statements
▪ Expressions
▪ Program blocks
▪ Control sections

o Literals
▪ A programmer can write the value of a constant operand as part of the
instruction that uses it. Such an operand is called a literal.
▪ A literal is identified by the prefix =
▪ Eg: LDA =X'05'

▪ Literals vs Immediate Operand


• Literals
o With a literal, the assembler generates the specified value as a constant at
some other memory location.
o The target address (TA) of the instruction is the address of this generated constant.
o The instruction is assembled using either PC-relative or base-relative
addressing.
o Eg:

1 Prepared By: Dona Jose, AP, CSE,VJCET


Reference Book: System Software: An Introduction to System Programming, Leland L Beck
▪ In the above example, the literal EOF is stored at location 002D.

▪ Consider the following statement in the above program
ENDFIL LDA =C'EOF'
▪ It has a 3-byte operand whose value is the character string EOF.
▪ This instruction is assembled using PC-relative addressing.
▪ TA = address of the operand = 002D
▪ After this instruction is fetched, PC = 001D
▪ Hence the displacement = TA - PC = 002D - 001D = 010
▪ Therefore, the object code for this instruction is 032010

▪ Consider the following statement in the above program
WLOOP TD =X'05'
▪ It has a 1-byte operand with hexadecimal value 05.
▪ This instruction is assembled using PC-relative addressing.
▪ TA = address of the operand = 1076
▪ After this instruction is fetched, PC = 1065
▪ Hence the displacement = TA - PC = 1076 - 1065 = 011
▪ Therefore, the object code for this instruction is E32011
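The two displacement calculations above can be checked with a short sketch. The function name and parameters below are hypothetical; the sketch assumes simple addressing (n = 1, i = 1), no base register, and the standard SIC/XE opcodes 00 for LDA and E0 for TD.

```python
def format3_pc_relative(opcode: int, target: int, next_addr: int,
                        indexed: bool = False) -> str:
    """Assemble a SIC/XE format 3 instruction using PC-relative addressing."""
    disp = target - next_addr                    # displacement = TA - PC
    if not -2048 <= disp <= 2047:                # must fit a signed 12-bit field
        raise ValueError("target out of range for PC-relative addressing")
    first_byte = opcode | 0b11                   # n = 1, i = 1 (simple addressing)
    xbpe = (0b1000 if indexed else 0) | 0b0010   # p = 1, b = 0, e = 0
    return f"{first_byte:02X}{xbpe:X}{disp & 0xFFF:03X}"


print(format3_pc_relative(0x00, 0x002D, 0x001D))  # LDA =C'EOF' -> 032010
print(format3_pc_relative(0xE0, 0x1076, 0x1065))  # TD  =X'05'  -> E32011
```

If the displacement does not fit in 12 bits, the sketch raises an error; a real assembler would try base-relative addressing next.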
• Immediate Operand
o In immediate mode, the operand value is assembled as part of the instruction
itself.
o Eg:

o Literals can be used in SIC, but immediate operands are valid only in SIC/XE.
▪ Literal Pools
• All the literal operands used in a program are gathered together into one or more
literal pools.
• There are two ways to place the literals in the program
o The literal pool can be placed at the end of the program (after the END statement).
o The literals can instead be placed at some other location in the object program.
▪ Reason: to keep the literal operand close to the instruction that uses it
▪ The assembler directive LTORG is used for this.
▪ When the assembler encounters LTORG, it creates a literal pool that
contains all the literal operands used since the beginning of the program
or since the previous LTORG.
▪ It is better to place the literals close to the instructions.
• If a literal operand is placed too far from the
instruction that references it, neither PC-relative nor
base-relative addressing can reach it, and we are
forced to use the extended instruction format. Placing LTORG at
suitable points in the program avoids this.

▪ Literal Table (LITTAB)


• The literal table (LITTAB) is a data structure that records the literals used in the
program.
• Each entry contains the literal name, the operand value and length, and the address
assigned to the operand.

• LITTAB is often organized as a hash table, using the literal name or value as the
key

▪ Implementation of Literals
• During Pass 1:
o Each literal operand encountered is searched for in LITTAB.
o If the literal is already present, no action is taken.
o If it is not present, the literal name, operand value and length are added to
LITTAB; the address is left unassigned.
o When an LTORG statement or the end of the program is encountered:
▪ The assembler scans LITTAB and assigns an address to
each literal that has not yet been assigned one.
▪ The location counter is advanced by the length of each literal placed.
• During Pass 2:
o LITTAB is searched for each literal operand encountered, and the address found is used to assemble the instruction.
o The literal values are inserted at the appropriate locations in the object program.
o If a literal value represents an address in the program, the assembler must
also generate the appropriate Modification Record.

▪ Literals that refer to the current value of the location counter
• '*' denotes a literal whose value is the current value of the location counter
• Eg: LDB =*
▪ Duplicate literals
• The same literal may be used more than once in a program
• e.g. WLOOP TD =X'05'
---- ----- -----
---- ----- -----
WD =X'05'
• The assembler should recognize duplicate literals and store only one copy of
the specified data value
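The LITTAB handling described above (add on first use, ignore duplicates, assign addresses at LTORG or END) can be sketched as follows. The class and method names are hypothetical, only C'...' and X'...' literals are handled, and placing both example literals in a single pool starting at 002D is an assumption made purely for illustration.

```python
class LiteralTable:
    """Hypothetical LITTAB: literal name -> [value bytes, length, address]."""

    def __init__(self):
        self.entries = {}                      # dicts preserve insertion order

    @staticmethod
    def _value(name: str) -> bytes:
        kind, text = name[1], name[3:-1]       # "=C'EOF'" -> "C", "EOF"
        if kind == "C":
            return text.encode("ascii")        # one byte per character
        if kind == "X":
            return bytes.fromhex(text)         # one byte per two hex digits
        raise ValueError(f"unsupported literal {name}")

    def add(self, name: str) -> None:
        """Pass 1: record a literal operand; duplicates are ignored."""
        if name not in self.entries:
            value = self._value(name)
            self.entries[name] = [value, len(value), None]

    def place_pool(self, locctr: int) -> int:
        """LTORG or END: assign addresses to unplaced literals, return new LOCCTR."""
        for entry in self.entries.values():
            if entry[2] is None:
                entry[2] = locctr
                locctr += entry[1]
        return locctr


littab = LiteralTable()
for operand in ["=C'EOF'", "=X'05'", "=X'05'"]:   # the duplicate is stored once
    littab.add(operand)
new_locctr = littab.place_pool(0x002D)
for name, (value, length, address) in littab.entries.items():
    print(f"{name:8} {value.hex().upper():8} {length} {address:04X}")
```

A hash table keyed on the literal name, as the notes suggest, is what the Python dict provides here.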

o Symbol Defining Statements


▪ EQU Statement
• EQU is an assembler directive
• It allows the programmer to define symbols and specify their values
• Syntax: Symbol EQU value
• The value can be a constant or an expression involving constants and
previously defined symbols.
o Eg: A EQU 10
B EQU X-Y

• Usage:
o To improve readability by using symbolic names in place of numeric values
▪ Eg: Replace +LDT #4096
with
MAXLEN EQU 4096
+LDT #MAXLEN
o To define mnemonic names for registers
▪ Eg: Replace RMO 0,1
with
A EQU 0
X EQU 1
RMO A,X
• No forward reference
o A restriction on EQU is that every symbol appearing on the
right-hand side must already have been defined earlier in the program.
o Eg:

▪ ORG Statement
• ORG is an Assembler directive
• It allows the programmer to reset the location counter (LOCCTR) to a specified value
• Syntax: ORG value
• When ORG is encountered, the assembler resets its LOCCTR to the specified
value
• ORG affects the values of all labels defined until the next ORG
• We can return to the normal use of LOCCTR by simply writing ORG with no operand
• ORG is used to control the assignment of storage in the object program.
• No forward reference is allowed
o All symbols used to specify the new LOCCTR value must have been
previously defined.

o If the operand of ORG were a forward reference, the assembler would not know during Pass 1 what value to assign to the location
counter in response to the first ORG statement. As a result, the symbols
BYTE1, BYTE2 and BYTE3 could not be assigned addresses during Pass 1.

o Expressions
▪ Assemblers allow expressions to be used as operands
▪ The assembler evaluates each expression and produces a single operand address or
value
▪ Expressions consist of
• Operators: +, -, *, /
• Constants
• User-defined symbols
• Special terms: *, the current value of LOCCTR
• Examples
MAXLEN EQU BUFEND-BUFFER
STAB RESB (6+3+2)*MAXENTRIES
BUFEND EQU *
In the last example, the current value of the location counter is assigned to BUFEND.
▪ Values of terms can be classified as absolute or relative.
• Absolute terms
o Independent of program location
o Eg: Constants
MAXLEN EQU 1000
• Relative terms
o Defined relative to the beginning of the program
o Eg:
▪ Labels on instructions and data areas
▪ References to the location counter: *
▪ Expressions can be either absolute or relative
• Absolute Expression
o An expression that contains only absolute terms
MAXLEN EQU 1000+5
o An expression that contains relative terms, provided they occur in pairs and
the two terms in each pair have opposite signs
MAXLEN EQU BUFEND-BUFFER
▪ BUFEND and BUFFER are both relative terms, representing addresses
within the program. The expression BUFEND-BUFFER represents an
absolute value.
▪ When relative terms are paired with opposite signs, the dependency on
the program starting address cancels out. The result is an absolute
value.
▪ No relative term may enter into a multiplication or division operation.
• Relative Expression
o All relative terms except one can be paired as above, and the remaining
unpaired relative term has a positive sign (one more positive relative term than
negative).
STAB EQU OPTAB + (BUFEND - BUFFER)
o No relative term may enter into a multiplication or division operation.
▪ Eg: 3*BUFFER is incorrect.
• Expressions that are neither absolute nor relative are flagged as errors by the assembler.
o Eg:
▪ BUFEND+BUFFER
▪ 100-BUFFER
▪ 3*BUFFER
▪ Defining Symbol Types in the Symbol Table
• To determine the type of an expression, the assembler must keep track of the type of
every symbol defined in the program.

• For this purpose, each SYMTAB entry needs a flag that indicates the type of the value
(absolute or relative) in addition to the value itself.

• With this information the assembler can easily determine the type of each
expression used as an operand and generate Modification Records in the object
program for relative values.
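The classification rules above can be sketched with a SYMTAB that carries a type flag. This is a minimal illustration under stated assumptions: the function name, the flag letters "A" and "R", and the symbol values are hypothetical, and parenthesised sub-expressions are not supported.

```python
import re


def expression_type(expr: str, symtab: dict) -> str:
    """Return 'absolute' or 'relative'; raise ValueError for anything else.

    symtab maps symbol -> (value, flag), where flag is 'A' or 'R'.
    """
    net_relative = 0
    # Split into additive terms, each with an optional leading sign.
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expr.replace(" ", "")):
        factors = re.split(r"[*/]", term)
        relative = [f for f in factors
                    if not f.isdigit() and symtab[f][1] == "R"]
        if relative and len(factors) > 1:
            raise ValueError("relative term in multiplication or division")
        if relative:
            net_relative += -1 if sign == "-" else 1
    if net_relative == 0:
        return "absolute"        # relative terms cancel in opposite-sign pairs
    if net_relative == 1:
        return "relative"        # exactly one unpaired positive relative term
    raise ValueError("expression is neither absolute nor relative")


symtab = {"BUFFER": (0x0036, "R"), "BUFEND": (0x1036, "R"), "MAXLEN": (0x1000, "A")}
print(expression_type("BUFEND-BUFFER", symtab))   # absolute
print(expression_type("BUFFER+MAXLEN", symtab))   # relative
```

Counting signed relative terms is enough here because every relative term carries the same dependency on the program starting address.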

o Program blocks
▪ A source program logically consists of subroutines, data areas, etc.
▪ So far, the generated machine instructions and data have appeared in the object program in
the same order as they were written in the source program.
▪ Program blocks allow the generated machine instructions and data to appear in a
different order in the object program, and hence in memory when loaded.
• Separate blocks can be used for code, data, stack, and larger data areas
▪ Assembler directive: USE
• Syntax: USE [blockname]
• USE indicates which portions of the source program belong to which
block.
• At the beginning, statements are assumed to be part of the default (unnamed) block
• If no USE statements are included, the entire program belongs to this single
block
• Each program block may actually contain several separate segments of the
source program
▪ The assembler rearranges these segments to gather together the pieces of each block and
assigns addresses
• The blocks are placed in the object program in a particular order
• A large buffer area can be moved to the end of the object program
• Program readability is better if data areas are placed in the source program close
to the statements that reference them.
• Consider the following program. Here 3 blocks are used.
o Unnamed (default) block (block no: 0): contains the executable instructions
of the program.
o CDATA block (block no: 1): contains all data areas that are short in length.
o CBLKS block (block no: 2): contains all data areas that consist of larger
blocks of memory
• At the beginning, statements are assumed to be part of the default block
• The USE statement on line 92 signals the beginning of the block named
CDATA.
• The USE statement on line 103 signals the beginning of the block named
CBLKS.

• The USE statements on lines 123 and 208 resume the default block, and the
statements on lines 183 and 252 resume the CDATA block.
• Line 107 is shown without a block number because MAXLEN is an
absolute symbol.

▪ Pass 1
• A separate location counter is maintained for each program block
o The location counter for a block is initialized to 0 when the block is first begun.
o The current value of LOCCTR is saved when switching to another block and restored when
the block is resumed.
• Each label is assigned an address relative to the start of the block that contains it.
• The block name (or number) is stored in SYMTAB along with the assigned
relative address of the label
• At the end of Pass 1, the latest value of LOCCTR for each block indicates the
length of that block.
• At the end of Pass 1, the assembler constructs a block table that contains the
block name, block number, starting address and length of every block.

▪ Pass 2
• The address of each symbol relative to the start of the object program is calculated
by adding the location of the symbol relative to the start of its block to the
starting address assigned to that block.

• Eg:
o Consider the instruction LDA LENGTH
o The relative location of LENGTH in CDATA block = 0003
o Starting address for CDATA = 0066
o Therefore, TA = 0003 + 0066 = 0069
o This instruction is assembled using PC-relative addressing.
o After this instruction is fetched, PC = 0009 relative to the default block. Since the default block starts at
location 0000, the PC address = 0000 + 0009 = 0009
o Displacement = TA - PC = 0069 - 0009 = 0060
o Therefore, the object code is 032060
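The address arithmetic in this example can be reproduced with a short sketch. The function name is hypothetical; the block lengths (0066, 000B and 1000 hex) are inferred from the address ranges the notes give for the loaded blocks.

```python
def block_start_addresses(lengths, start=0):
    """Each block starts where the previous one ends (end-of-Pass-1 step)."""
    starts, address = [], start
    for length in lengths:
        starts.append(address)
        address += length
    return starts


lengths = [0x0066, 0x000B, 0x1000]        # default, CDATA, CBLKS
starts = block_start_addresses(lengths)   # [0x0000, 0x0066, 0x0071]

target = starts[1] + 0x0003               # LENGTH: offset 0003 in CDATA
pc = starts[0] + 0x0009                   # next instruction, default block
disp = target - pc
print(f"TA={target:04X} PC={pc:04X} disp={disp:03X}")   # TA=0069 PC=0009 disp=060
```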

• The first 2 Text records are generated from lines 5 through 70 (the first segment of the default block).
• No new Text record is created for lines 95 through 105, because these statements
generate no code. The next 2 Text records come from lines 125 through
180 (the second segment of the default block). The fifth Text record is for the second segment of the CDATA block, and so on.

• The loader places the default block in memory at relative locations 0000 through 0065.
CDATA occupies locations 0066 through 0070. CBLKS occupies
locations 0071 through 1070.
• CDATA(1) and CBLKS(1) are not present in the object program. Storage is
automatically reserved for these areas when the program is loaded.
▪ Benefits of Program Blocks
• The large buffer area is moved to the end of the object program, so
the use of extended instruction format can be avoided.
• Program readability improves when the definitions of data areas are placed in the
source program close to the statements that reference them.

Pass 1 Algorithm
Begin
    i = 0 (current block number; block 0 is the default block)
    Insert (Default, 0) into block table
    next block number = 1
    LOCCTR[i] = 0 for all i
    Read the first input line
    If OPCODE = 'START' then
    {
        Starting address = #OPERAND
        Write line into intermediate file
        Read next input line
    }
    Else
        Starting address = 0
    While OPCODE != 'END' do
    {
        If this is not a comment line then
        {
            If OPCODE = 'USE' then
            {
                If there is no operand then block name = Default
                Else block name = OPERAND
                If there is no entry for block name in block table then
                    Insert (block name, next block number++) into block table
                i = block number for block name
            }
            Else
            {
                If there is a symbol in the LABEL field then
                {
                    Search SYMTAB for LABEL
                    If found then Set error flag (duplicate symbol)
                    Else Insert (LABEL, LOCCTR[i], i) into SYMTAB
                }
                Search OPTAB for OPCODE
                If found then LOCCTR[i] = LOCCTR[i] + 3
                Else if OPCODE = 'WORD' then LOCCTR[i] = LOCCTR[i] + 3
                Else if OPCODE = 'RESW' then LOCCTR[i] = LOCCTR[i] + 3 * #OPERAND
                Else if OPCODE = 'RESB' then LOCCTR[i] = LOCCTR[i] + #OPERAND
                Else if OPCODE = 'BYTE' then LOCCTR[i] = LOCCTR[i] + length of the constant
                Else Set error flag (invalid operation code)
            }
        }
        Write line into intermediate file
        Read next input line
    }
    Length[i] = LOCCTR[i] for all i
    Address[0] = Starting address
    Address[i] = Address[i-1] + Length[i-1] for all i = 1 to max(block number)
    Insert (Address[i], Length[i]) into block table for all i
End
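A runnable sketch of this Pass 1 is given below. It is an illustration under stated assumptions, not the textbook implementation: the function name and tuple layout are hypothetical, every OPTAB instruction is taken to be 3 bytes long (as in the pseudocode), and START, END, comments and the intermediate file are omitted.

```python
def pass1(lines, optab, start=0):
    """lines: (label, opcode, operand) tuples. Returns (symtab, block_table)."""
    blocks = {"": 0}        # block name -> block number ("" is the default block)
    locctr = [0]            # one location counter per block
    symtab = {}             # label -> (block number, address relative to block)
    current = 0
    for label, opcode, operand in lines:
        if opcode == "USE":
            name = operand or ""
            if name not in blocks:             # first use: create the block
                blocks[name] = len(locctr)
                locctr.append(0)
            current = blocks[name]             # switch counters
            continue
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol {label}")
            symtab[label] = (current, locctr[current])
        if opcode in optab or opcode == "WORD":
            locctr[current] += 3
        elif opcode == "RESW":
            locctr[current] += 3 * int(operand)
        elif opcode == "RESB":
            locctr[current] += int(operand)
        elif opcode == "BYTE":
            body = operand[2:-1]               # text between the quotes
            locctr[current] += len(body) if operand[0] == "C" else len(body) // 2
        else:
            raise ValueError(f"invalid operation code {opcode}")
    # End of Pass 1: assign each block its starting address and length.
    block_table, address = {}, start
    for name, number in blocks.items():
        block_table[name] = (number, address, locctr[number])
        address += locctr[number]
    return symtab, block_table


program = [
    ("FIRST", "STL", "RETADR"),
    (None, "USE", "CDATA"),
    ("RETADR", "RESW", "1"),
    ("LENGTH", "RESW", "1"),
    (None, "USE", "CBLKS"),
    ("BUFFER", "RESB", "4096"),
    (None, "USE", None),
    ("CLOOP", "LDA", "LENGTH"),
]
symtab, block_table = pass1(program, optab={"STL", "LDA"})
print(symtab)
print(block_table)
```

The tiny program is made up for the demonstration; it only mimics the block structure of the example discussed in the notes.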

Pass 2 Algorithm

o Control sections
▪ Program blocks vs. control sections
• Program blocks: segments of code that are rearranged within a single object
program unit
• Control sections: segments of code that are translated into independent object
program units
▪ Control sections are most often used for subroutines or other logical subdivisions of a program
▪ The programmer can assemble, load, and manipulate each of these control sections
separately
▪ Assembler directive: CSECT
• Syntax: secname CSECT
▪ The assembler keeps a separate location counter for each control section. Its initial value
is 0.
▪ Instructions in one control section may need to refer to instructions or data located in
another section. The assembler has no idea where any other control section will be
located at execution time.
▪ Some means of linking control sections together is necessary. The following 2 assembler
directives are used for this purpose
• External definition: EXTDEF symbol1,symbol2, . . . ,symboln
o Names symbols that are defined in this control section and may be used by
other sections
o Ex: EXTDEF BUFFER, BUFEND, LENGTH

• External reference: EXTREF symbol1,symbol2, . . . ,symboln
o Names symbols that are used in this control section and are defined
elsewhere
o Ex: EXTREF RDREC, WRREC
o An instruction that references an external symbol must use extended format.

▪ The following program consists of 3 control sections

• COPY: Main program. This section continues until the CSECT statement on
line 109.
• RDREC: Subroutine. This control section extends from line 109 to line 190.
• WRREC: Subroutine. This control section extends from line 193 to line 255.
▪ Ex: Consider the instruction
15 0003 CLOOP +JSUB RDREC
• RDREC is an external reference.
• The assembler has no idea where RDREC will be loaded
• The assembler inserts an address of zero.
• The proper address is inserted at load time
• Only extended format can be used, since it provides enough room for the actual address (that is, relative
addressing is not valid for an external reference)
• The object code is: 4B100000
• The assembler generates information for each external reference that will allow
the loader to perform the required linking.
▪ Ex: Consider the instruction
160 0017 +STCH BUFFER,X
• BUFFER is an external reference. The assembler has no idea where BUFFER will be loaded
• The assembler inserts an address of zero
• The object code is: 57900000
▪ Ex: Consider the instruction
190 0028 MAXLEN WORD BUFEND-BUFFER
• BUFEND and BUFFER are two external reference symbols.
• The assembler inserts a value of 0
• The object code is: 000000
• When the program is loaded, the loader will add to this data area the address of
BUFEND and subtract from it the address of BUFFER.
▪ Ex: Consider the instruction
107 1000 MAXLEN EQU BUFEND-BUFFER
• Here BUFEND and BUFFER are defined in the same control section, so the
expression can be calculated immediately
▪ Restriction

• Both terms in each pair of relative terms in an expression must be within the same control
section
o Legal: BUFEND-BUFFER
o Illegal: RDREC-COPY (the two symbols belong to different control sections)

▪ The assembler must include information in the object program that will cause the
loader to insert the proper values where they are required. Define records, Refer records
and Modification records are used for this purpose.
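As an illustration, the Modification records for the three external-reference examples above can be generated with a small sketch. It assumes the revised Modification record layout from the reference book (M, 6-digit hex start address, 2-digit hex length in half-bytes, a + or - flag, then the external symbol); the function names are hypothetical, and symbol names are not padded to fixed columns.

```python
def modification_record(address: int, half_bytes: int, sign: str, symbol: str) -> str:
    """Build one revised Modification record as a string."""
    if sign not in ("+", "-"):
        raise ValueError("sign must be '+' or '-'")
    return f"M{address:06X}{half_bytes:02X}{sign}{symbol}"


def for_format4(instruction_addr: int, symbol: str) -> str:
    """The 20-bit address field starts one byte into a format 4 instruction."""
    return modification_record(instruction_addr + 1, 5, "+", symbol)


print(for_format4(0x0003, "RDREC"))                     # +JSUB RDREC
print(for_format4(0x0017, "BUFFER"))                    # +STCH BUFFER,X
print(modification_record(0x0028, 6, "+", "BUFEND"))    # WORD BUFEND-BUFFER
print(modification_record(0x0028, 6, "-", "BUFFER"))
```

The WORD example needs two records because the loader must add the address of BUFEND and subtract the address of BUFFER in the same 3-byte field.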

▪ The object program corresponding to the above is

• Assembler Design Options


o Single Pass Assembler
o Multipass Assembler

o Single Pass Assembler


▪ The main problem in designing a single pass assembler is resolving
forward references. There are two types of forward references.
• Forward reference to data items
o Solution
▪ Define all storage reservation statements at the beginning of the
program rather than at the end.
• Forward jumping: forward reference to labels on instructions
o Solution
▪ Insert (label, address_to_be_modified) into SYMTAB
▪ Usually, the addresses to be modified are stored in a linked list
▪ There are two types of one-pass assemblers:
• Load-and-go assemblers:
o Generate object code directly in memory for immediate execution.
o No object program is written out, and no loader is needed.
o The actual address must be known at assembly time.
o Useful in a system oriented toward program development and testing,
o where programs are re-assembled nearly every time they are run.
• Object Program Output Assembler:
o This assembler produces the usual kind of object program for later execution.
o This assembler is used on systems where external working storage devices
(for the intermediate file between two passes) are not available.
▪ Load-and-go assemblers Algorithm
• When a forward reference is encountered, the assembler
o Omits the operand address if the symbol has not yet been defined
o Enters this undefined symbol into SYMTAB and flags it as
undefined
o Adds the address of this operand field to a list of forward references
associated with the SYMTAB entry
• When the definition of the symbol is encountered, the assembler scans the reference list and
inserts the address at each listed location.
• At the end of the program, the assembler reports an error if any SYMTAB entries
are still marked undefined. Otherwise it jumps to the location specified in the END
statement.

▪ The following program avoids the problem of forward references to data items

▪ In the main routine, RDREC (line 15), ENDFIL (line 30) and WRREC (lines 35 and 65)
are forward references.
▪ After scanning line 40 of the above program

• The following symbols are not yet defined.

o RDREC is referenced at location 2013
o ENDFIL is referenced at location 201C
o WRREC is referenced at location 201F
▪ After scanning line 160 of the above program

• Some of the forward references have been resolved by this time, while others
have been added.
• When the symbol ENDFIL is defined (line 45), the assembler places its address, 2024, in
the SYMTAB entry.
• It then inserts 2024 at location 201C, the address on the reference list, and deletes the linked list.
• Similar operations are performed for all other forward references.

▪ Load-and-go Single Pass Assembler Algorithm

Begin
    Read 1st input line
    If OPCODE = 'START' then
    {
        Starting address = #OPERAND
        LOCCTR = Starting address
        Read the next input line
    }
    Else
        LOCCTR = 0
    While OPCODE != 'END' do
    {
        If this is not a comment line then
        {
            If there is a symbol in the LABEL field then
            {
                Search SYMTAB for LABEL
                If found then
                {
                    If symbol value is null then
                    {
                        Symbol value = LOCCTR
                        Scan the linked list of forward references for this symbol
                        Insert the symbol value at each address in the list
                        Delete the linked list
                    }
                    Else
                        Set error flag (duplicate symbol)
                }
                Else
                    Insert (LABEL, LOCCTR) into SYMTAB
            }
            Search OPTAB for OPCODE
            If found then
            {
                Search SYMTAB for OPERAND
                If found then
                {
                    If symbol value != null then
                        OPERAND address = symbol value
                    Else
                        Insert a node at the end of the linked list with address LOCCTR+1
                }
                Else
                {
                    Insert (symbol name, null) into SYMTAB
                    Create a linked list with address LOCCTR+1
                }
                Generate object code and load it at memory location LOCCTR
                LOCCTR = LOCCTR + 3
            }
            Else if OPCODE = 'WORD' then
            {
                Object code = #OPERAND
                Load this object code at memory location LOCCTR
                LOCCTR = LOCCTR + 3
            }
            Else if OPCODE = 'RESW' then
                LOCCTR = LOCCTR + 3 * #OPERAND
            Else if OPCODE = 'RESB' then
                LOCCTR = LOCCTR + #OPERAND
            Else if OPCODE = 'BYTE' then
            {
                Convert constant to object code and load it at memory location LOCCTR
                LOCCTR = LOCCTR + length of the constant
            }
            Else
                Set error flag (invalid operation code)
        }
        Read the next input line
    }
    If any SYMTAB entries are still marked undefined then
        Report the error
    Else
        Jump to the location specified in the END statement
End
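The forward-reference bookkeeping in this algorithm can be sketched as follows. The class and method names are hypothetical, a Python list stands in for the linked list, and memory is simplified to a dict that maps an operand-field address to the address stored there.

```python
class OnePassSymtab:
    """Sketch of SYMTAB with forward-reference lists for a load-and-go assembler."""

    def __init__(self):
        self.value = {}      # symbol -> address, or None while undefined
        self.pending = {}    # symbol -> operand-field addresses awaiting a patch
        self.memory = {}     # operand-field address -> stored operand address

    def reference(self, symbol: str, operand_addr: int) -> None:
        """An instruction whose operand field is at operand_addr uses symbol."""
        address = self.value.get(symbol)
        if address is None:                        # forward reference
            self.value.setdefault(symbol, None)
            self.pending.setdefault(symbol, []).append(operand_addr)
            self.memory[operand_addr] = 0          # operand address omitted for now
        else:
            self.memory[operand_addr] = address

    def define(self, symbol: str, locctr: int) -> None:
        """A label definition is encountered at LOCCTR."""
        if self.value.get(symbol) is not None:
            raise ValueError(f"duplicate symbol {symbol}")
        self.value[symbol] = locctr
        for operand_addr in self.pending.pop(symbol, []):
            self.memory[operand_addr] = locctr     # patch each earlier reference

    def undefined(self):
        return sorted(s for s, v in self.value.items() if v is None)


table = OnePassSymtab()
table.reference("RDREC", 0x2013)
table.reference("ENDFIL", 0x201C)
table.reference("WRREC", 0x201F)
table.define("ENDFIL", 0x2024)            # line 45 of the example program
print(f"{table.memory[0x201C]:04X}")      # 2024
print(table.undefined())                  # ['RDREC', 'WRREC']
```

The addresses are the ones quoted in the walk-through of the example program; anything left in `undefined()` at END would be reported as an error.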

▪ Object Program Output Assembler


• Forward references are entered into the SYMTAB lists as before.
• When the definition of a symbol is encountered, the assembler generates
another Text record containing the correct operand address for each entry in the linked
list.
• When the program is loaded, the operand address 0000 loaded earlier is overwritten by the
address carried in this later Text record.
• The object program records must therefore be kept in their original order when they are
presented to the loader.
• The object code for the above program is

o When the definition of ENDFIL on line 45 is encountered, the assembler
generates the 3rd Text Record. This record specifies that the value 2024 is to
be loaded at location 201C. When the program is loaded, the value 2024 will
replace the 0000 previously loaded.
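The patching Text record described above can be formed with a small sketch. The function name is hypothetical; the layout (T, 6-digit hex start address, 2-digit hex length in bytes, then the object code, at most 30 bytes per record) follows the Text record format in the reference book.

```python
def text_record(start: int, data: bytes) -> str:
    """Build one Text record: T + start address + length in bytes + object code."""
    if not 1 <= len(data) <= 30:
        raise ValueError("a Text record holds 1 to 30 bytes of object code")
    return f"T{start:06X}{len(data):02X}{data.hex().upper()}"


# ENDFIL is defined at 2024; its earlier reference sits at location 201C.
patch = text_record(0x201C, (0x2024).to_bytes(2, "big"))
print(patch)   # T00201C022024
```

Because the loader processes records in order, this record overwrites the 0000 that an earlier Text record placed at 201C.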
▪ Object Program Output Single Pass Assembler Algorithm
Begin
    Read 1st input line
    If OPCODE = 'START' then
    {
        Starting address = #OPERAND
        LOCCTR = Starting address
        Read the next input line
    }
    Else
        LOCCTR = 0
    Create Header Record and write it to object program
    Initialize 1st Text Record
    While OPCODE != 'END' do
    {
        If this is not a comment line then
        {
            If there is a symbol in the LABEL field then
            {
                Search SYMTAB for LABEL
                If found then
                {
                    If symbol value is null then
                    {
                        Symbol value = LOCCTR
                        Generate a separate Text Record containing the symbol value
                            for the address in each entry of the linked list
                        Delete the linked list
                    }
                    Else
                        Set error flag (duplicate symbol)
                }
                Else
                    Insert (LABEL, LOCCTR) into SYMTAB
            }
            Search OPTAB for OPCODE
            If found then
            {
                Search SYMTAB for OPERAND
                If found then
                {
                    If symbol value != null then
                        OPERAND address = symbol value
                    Else
                        Insert a node at the end of the linked list with address LOCCTR+1
                }
                Else
                {
                    Insert (symbol name, null) into SYMTAB
                    Create a linked list with address LOCCTR+1
                }
                Generate object code
                LOCCTR = LOCCTR + 3
            }
            Else if OPCODE = 'WORD' then
            {
                Object code = #OPERAND
                LOCCTR = LOCCTR + 3
            }
            Else if OPCODE = 'RESW' then
                LOCCTR = LOCCTR + 3 * #OPERAND
            Else if OPCODE = 'RESB' then
                LOCCTR = LOCCTR + #OPERAND
            Else if OPCODE = 'BYTE' then
            {
                Convert constant to object code
                LOCCTR = LOCCTR + length of the constant
            }
            Else
                Set error flag (invalid operation code)
            If object code was generated for this line then
            {
                If object code will not fit into the current Text Record then
                {
                    Write Text Record into object program
                    Initialize new Text Record
                }
                Add object code to Text Record
            }
        }
        Read the next input line
    }
    Write last Text Record to object program
    Write End Record to object program
End

o Multipass Assembler
▪ In a two-pass assembler, the symbols used on the right-hand side of EQU must be defined previously in the program.
▪ Eg:

• The symbol BETA cannot be assigned a value when it is encountered during
Pass 1 because DELTA has not yet been defined.
• Hence ALPHA cannot be evaluated during Pass 2.
• Symbol definition must be completed in Pass 1, so a two-pass assembler cannot resolve such a sequence of definitions.

▪ Forward references tend to create difficulty for a person reading the program.
▪ The general solution for forward references is a multi-pass assembler that can make
as many passes as are needed to process the definitions of symbols.
▪ It is not necessary for such an assembler to make more than 2 passes over the entire
program.
▪ The portions of the program that involve forward references in symbol definition are
saved during Pass 1.
▪ Additional passes through these stored definitions are made as the assembly
progresses.
▪ This process is followed by a normal Pass 2.
▪ Implementation
• For a forward reference in symbol definition, we store in the SYMTAB:
o The symbol name
o The defining expression
o The number of undefined symbols in the defining expression
• Each undefined symbol (marked with a flag *) is associated with a list of the symbols
whose values depend on it.
• When a symbol is defined, the assembler can recursively evaluate the symbol expressions
that depend on the newly defined symbol.
▪ Example
1 A EQU B/2
2 B EQU C-D
3 E EQU D-1
4 D RESB 4096
5 C EQU *
• After processing statement 1, the SYMTAB will become

o &1 represents the number of undefined symbols in the defining expression
o B/2 is the defining expression
o * indicates an undefined symbol
o The node A is the list of symbols that depend on B.
• After processing statement 2, the SYMTAB will become

• After processing statement 3, the SYMTAB will become

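• Suppose the address of D is 1034. After processing statement 4, D is defined, so
E = D-1 can be evaluated and the count of undefined symbols for B drops to 1. The SYMTAB
will become

• After processing statement 5, C is set to the current value of LOCCTR. B = C-D can now be
evaluated, and then A = B/2. The SYMTAB will become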
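The five-statement example can be traced with a short sketch of the dependency-list scheme. The class and method names are hypothetical, only expressions of the form `term op term` are handled, and the addresses are assumed to be hexadecimal (so RESB 4096 advances LOCCTR from 1034 to 2034).

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}


class MultiPassSymtab:
    def __init__(self):
        self.value = {}        # symbol -> value once it is defined
        self.pending = {}      # symbol -> (expression, undefined operands)
        self.dependents = {}   # undefined symbol -> symbols waiting on it

    def _evaluate(self, expr: str) -> int:
        left, op, right = re.fullmatch(r"(\w+)([-+*/])(\w+)", expr).groups()
        a, b = (int(t) if t.isdigit() else self.value[t] for t in (left, right))
        return OPS[op](a, b)

    def define_expr(self, symbol: str, expr: str) -> None:
        """EQU with an expression that may contain forward references."""
        missing = {t for t in re.findall(r"[A-Za-z]\w*", expr)
                   if t not in self.value}
        if missing:
            self.pending[symbol] = (expr, missing)
            for name in missing:
                self.dependents.setdefault(name, []).append(symbol)
        else:
            self.define_value(symbol, self._evaluate(expr))

    def define_value(self, symbol: str, value: int) -> None:
        """A symbol gets a value; re-evaluate whatever was waiting on it."""
        self.value[symbol] = value
        for dep in self.dependents.pop(symbol, []):
            expr, missing = self.pending[dep]
            missing.discard(symbol)
            if not missing:                       # all operands now known
                del self.pending[dep]
                self.define_value(dep, self._evaluate(expr))


st = MultiPassSymtab()
st.define_expr("A", "B/2")        # statement 1
st.define_expr("B", "C-D")        # statement 2
st.define_expr("E", "D-1")        # statement 3
st.define_value("D", 0x1034)      # statement 4: D RESB 4096
st.define_value("C", 0x2034)      # statement 5: C EQU * (LOCCTR after RESB)
print({name: f"{val:04X}" for name, val in sorted(st.value.items())})
```

Defining D resolves E immediately; defining C resolves B, which in turn resolves A through the recursive call.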

o Implementation example of MASM Assembler


▪ A MASM assembly language program is written as a collection of segments.
▪ Commonly used segments are CODE, DATA, CONST and STACK.
▪ Segments are addressed via segment registers
▪ The CS and SS registers are automatically set by the system loader when a program is loaded
for execution.
• CODE segment → CS register
o CS is set to indicate the segment that contains the starting label specified in the END
statement.
• STACK → SS register
o SS is set to indicate the last stack segment processed by the loader.
• DATA and CONST → DS, ES, FS or GS registers
o If the programmer does not specify a segment register, one is selected by the
assembler.
o The default register is DS.
o This can be changed by using the ASSUME assembler directive
ASSUME ES:DATASEG2
This tells the assembler to assume that ES indicates the segment DATASEG2. Any references to labels that are
defined in DATASEG2 will be assembled using register ES
▪ Jump instructions are assembled in 2 different ways
• Near Jump
o The target is in the same code segment
o It is assembled using the current code segment register CS
o Instruction size may be 2 or 3 bytes
• Far Jump
o The target is in a different code segment
o It is assembled using a different segment register, which is specified in an
instruction prefix.
o Instruction size is 5 bytes
▪ Forward references to a label in a source program can cause problems:

• Eg: JMP TARGET

o If the definition of TARGET occurs in the program before the JMP instruction,
the assembler can tell whether this is a near jump or a far jump. This is not
possible in the case of a forward jump.
o By default, MASM assumes that a forward jump is a near jump.
o If the target of the jump is in another code segment, the programmer must
warn the assembler by writing JMP FAR PTR TARGET
o If the jump address is within 128 bytes, the programmer can specify a
shorter (2-byte) near jump by writing JMP SHORT TARGET
▪ The length of an assembled instruction depends on its operands
• Eg: operands of the ADD instruction can be
o Registers
o Memory locations: may take a varying amount of space, depending upon the
location of the operand.
o Immediate operands: may occupy from 1 to 4 bytes in the instruction
▪ Pass 1 of an x86 assembler is more complex than Pass 1 of a SIC assembler
• During Pass 1, the x86 assembler must
o Analyze the operands of each instruction
o Look up the operation code table
▪ It contains information on which addressing modes are valid for each
operand.
▪ Segments in a MASM source program can be written in more than one part.
• All the parts are gathered together by the assembly process.
▪ References between segments that are assembled together are handled by the assembler.
• Use the directive PUBLIC. It has approximately the same function as EXTDEF in SIC/XE.
▪ External references between separately assembled modules must be handled by the
linker.
• Use the directive EXTRN. It has approximately the same function as EXTREF in SIC/XE.
▪ The object program from MASM may be in several different formats
• This allows easy and efficient execution of the program in a variety of operating
environments.
▪ MASM can also produce an instruction timing listing that shows the number of clock cycles
required to execute each instruction.
Previous Year University Questions
1. What is a Literal? How is a literal handled by an assembler?
2. With example, write notes on Program Blocks.
3. How does the assembler handle multiple program blocks?
4. What are control sections? What is the advantage of using them?
5. What are control sections? Illustrate with an example, how control sections are used and
linked in an assembly language program.
6. Explain the format and purpose of Define and Refer records in the object program.
7. What are the uses of assembler directives EXTDEF and EXTREF?
8. How are control sections different from program blocks? Explain, with proper examples,
the purpose of EXTREF and EXTDEF assembler directives.
9. Give the format and purpose of the different record types present in an object program
that uses multiple control sections

10. Develop the records (excluding header, text and end records) for the following control
section named COPY

11. Explain how external references are handled by an assembler.


12. Distinguish between Program Blocks and Control Section
13. Differentiate between control sections and program blocks with the help of an example.
14. Differentiate Program Blocks and Control Sections. Explain how address calculation is
performed in the case of Program Blocks
15. What is a load and go assembler?
16. Explain the concept of single pass assembler with a suitable example.
17. Explain the working of any one type of One pass Assembler
18. What is a forward reference? How are forward references handled by a single pass
assembler?
19. Explain the working of Multi pass assemblers with an example.
20. Employ multipass assembler to evaluate the following expressions

21. Write short notes on MASM assembler.
