Introduction
Intermediate code is the interface between front end and
back end in a compiler
Ideally the details of source language are confined to the
front end and the details of target machines to the back
end (a m*n model)
In this chapter we study intermediate representations,
static type checking and intermediate code generation
Static Intermediate Code
Parser
Checker Code Generator Generator
Front end Back end
Variants of syntax trees
It is sometimes beneficial to crate a DAG instead of
tree for Expressions.
This way we can easily show the common sub-
expressions and then use that knowledge during code
generation
Example: a+a*(b-c)+(b-c)*d
+ *
*
d
a -
b c
SDD for creating DAG’s
Production Semantic Rules
1) E -> E1+T E.node= new Node(‘+’, E1.node,T.node)
2) E -> E1-T E.node= new Node(‘-’, E1.node,T.node)
3) E -> T E.node = T.node
4) T -> (E) T.node = E.node
5) T -> id T.node = new Leaf(id, id.entry)
6) T -> num T.node = new Leaf(num, num.val)
Example:
1)p1=Leaf(id, entry-a) 8) p8=Leaf(id,entry-b)=p3
2)P2=Leaf(id, entry-a)=p1 9) p9=Leaf(id,entry-c)=p4
3)p3=Leaf(id, entry-b) 10) p10=Node(‘-’,p3,p4)=p5
4)p4=Leaf(id, entry-c) 11) p11=Leaf(id,entry-d)
5)p5=Node(‘-’,p3,p4) 12) p12=Node(‘*’,p5,p11)
6)p6=Node(‘*’,p1,p5) 13) p13=Node(‘+’,p7,p12)
7)p7=Node(‘+’,p1,p6)
Value-number method for
constructing DAG’s
= id To entry for i
num 10
+ + 1 2
3 1 3
i 10
Algorithm
Search the array for a node M with label op, left child l and
right child r
If there is such a node, return the value number M
If not create in the array a new node N with label op, left
child l, and right child r and return its value
We may use a hash table
Three address code
In a three address code there is at most one operator
at the right side of an instruction
Example:
+
t1 = b – c
+ * t2 = a * t1
t3 = a + t2
* t4 = t1 * d
d
t5 = t3 + t4
a -
b c
Forms of three address
instructions
x = y op z
x = op y
x = y
goto L
if x goto L and ifFalse x goto L
if x relop y goto L
Procedure calls using:
param x
call p,n
y = call p,n
x = y[i] and x[i] = y
x = &y and x = *y and *x =y
Example
do i = i+1; while (a[i] < v);
L: t1 = i + 1 100: t1 = i + 1
i = t1 101: i = t1
t2 = i * 8 102: t2 = i * 8
t3 = a[t2] 103: t3 = a[t2]
if t3 < v goto L 104: if t3 < v goto 100
Symbolic labels Position numbers
Data structures for three
address codes
Quadruples
Has four fields: op, arg1, arg2 and result
Triples
Temporaries are not used and instead references to
instructions are made
Indirect triples
In addition to triples we use a list of pointers to triples
Three address code
Example t1 = minus c
t2 = b * t1
b * minus c + b * minus c t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5
Quadruples Triples Indirect Triples
op arg1 arg2 result op arg1 arg2 op op arg1 arg2
minus c t1 0 minus c 35 (0) 0 minus c
* b t1 t2 1 * b (0) 36 (1) 1 * b (0)
minus c t3 2 minus c 37 (2) 2 minus c
* b t3 t4 3 * b (2) b (2)
38 (3) 3 *
+ t2 t4 t5 4 + (1) (3) 39 (4) 4 + (1) (3)
= t5 a 5 = a (4) 40 (5) 5 = a (4)
Type Expressions
Example: int[2][3]
array(2,array(3,integer))
A basic type is a type expression
A type name is a type expression
A type expression can be formed by applying the array type constructor to
a number and a type expression.
A record is a data structure with named field
A type expression can be formed by using the type constructor for
function types
If s and t are type expressions, then their Cartesian product s*t is a type
expression
Type expressions may contain variables whose values are type expressions
Type Equivalence
They are the same basic type.
They are formed by applying the same constructor to
structurally equivalent types.
One is a type name that denotes the other.
Assignment Statements
Expressions can be of type integer, real,
array, and record.
As part of assignments into three address
code, show how names can be looked up in
the symbol table and how elements of
arrays and records can be accessed.
Names in the Symbol table
The lexeme for the name represented by id
is given by the attribute id.name.
Operation lookup(id.name) checks if there
is an entry for this occurrence of the name
in the symbol table.
If so, a pointer to the entry is returned:
otherwise lookup returns nil to indicate that
no entry was found.
Reusing temporary names
Assuming that newtemp generates a new
temporary name each time a temporary is
needed.
It is useful especially in optimizing
compilers, to actually create a distinct name
each time newtemp is called.
Temporaries can be reused by changing
newtemp .
Addressing array elements
base+(i-low)*w
Where low is the lower bound on the subscript and
base is the relative address of the storage allocated for
the array.
w is the width of array element
Boolean Expressions
compute logical values
change the flow of control
boolean operators are: and or not
E → E or E | E and E | not E | (E) | id
relop id | true | false
Methods of translation
• Evaluatesimilar to arithmetic expressions –
Normally use 1 for true and 0 for false •
implement by flow of control –
given expression E1 or E2 if E1 evaluates to
true
then E1 or E2 evaluates to true
without evaluating E2
Syntax directed translation of
boolean expressions
E → E1 or E2
E.place := newtmp
emit(E.place ':=' E1.place 'or' E2.place)
E → E1 and E2
E.place:= newtmp
emit(E.place ':=' E1.place 'and' E2.place)
E → not E1
E.place := newtmp
emit(E.place ':=' 'not' E1.place)
E → (E1) E.place = E1.place
E → id1 relop id2
E.place := newtmp
emit(if id1.place relop id2.place goto
nextstat+3)
emit(E.place = 0)
emit(goto nextstat+2)
emit(E.place = 1)
E → true
E.place := newtmp
emit(E.place = '1')
E → false
E.place := newtmp
emit(E.place = '0')
Flow of control statements
S → if E then S1
| if E then S1 else S2
| while E do S1
Syntax directed translation
S → if E then S1
E.true = newlabel
E.false = S.next
S1.next = S.next
S.code = E.code ||
gen(E.true ':') ||
S1.code
S → if E then S1 else S2
E.true = newlabel
E.false = newlabel
S1.next = S.next
S2.next = S.next
S.code = E.code ||
gen(E.true ':') ||
S1.code ||
gen(goto S.next) ||
gen(E.false ':') ||
S2.code
S → while E do S1
S.begin = newlabel
E.true = newlabel
E.false = S.next
S1.next = S.begin
S.ocde = gen(S.begin ':') ||
E.code ||
gen(E.true ':') ||
S1.code ||
gen(goto S.begin)
BackPatching
It is a way to implement boolean
expressions and flow of control statements
in one pass
• code is generated as quadruples into an
array
• labels are indices into this array
Backpatching
Previous codes for Boolean expressions
insert symbolic labels for jumps.
It therefore needs a separate pass to set
them to appropriate addresses, we can
use a technique named backpatching to
avoid this
We assume we save instructions into an
array and labels will be indices in the
array
Backpatching
For nonterminal B we use two attributes
B.truelist and B.falselist together with
following functions:
makelist(i): create a new list containing only I,
an index into the array of instructions
Merge(p1,p2): concatenates the lists pointed by
p1 and p2 and returns a pointer to the
concatenated list
Backpatch(p,i): inserts i as the target label for
each of the instruction on the list pointed to by
p
Backpatching for Boolean Expressions
Backpatching for Boolean Expressions
Annotated parse tree for x < 100 || x > 200 && x ! = y
Flow-of-Control Statements
Translation of a switch-statement