Compiler Design
Department of Computer Science
Code Generation
• The code generator is the final phase of the compiler model.
• It takes as input the intermediate representation (IR)
produced by the front-end of the compiler, along with relevant
symbol table information.
• Produces as output a semantically equivalent target program.
• The code generator's main tasks:
– Instruction selection
– Register allocation and assignment
– Instruction ordering
Issues in the Design of a Code Generator
• The most important criterion is that the code generator produces correct code.
• Input to the code generator
– IR + Symbol table
– We assume the front end produces a low-level IR, i.e., values of
names in it can be directly manipulated by the machine
instructions.
– Syntactic and semantic errors have already been detected
• The target program
– Common target architectures are RISC, CISC, and stack-based
machines
Instruction Selection
• The code generator must map the IR program into a code
sequence that can be executed by the target machine.
• The complexity of performing this mapping is determined by
factors such as
– the level of the IR
– the nature of the instruction-set architecture
– the desired quality of the generated code.
x = y + z:
LD R0, y
ADD R0, R0, z
ST x, R0

a = b + c:
LD R0, b
ADD R0, R0, c
ST a, R0

d = a + e:
LD R0, a
ADD R0, R0, e
ST d, R0
Registers
• The use of registers is often subdivided into two sub-problems:
1. Register allocation, during which we select the set of
variables that will reside in registers at each point in the
program.
2. Register assignment, during which we pick the specific
register that a variable will reside in.
Register Allocation
• A key problem in code generation is deciding what values to
hold in what registers.
• Registers are the fastest storage units on the target
machine, but we usually do not have enough of them to hold
all values.
• Values not held in registers need to reside in memory.
• Instructions involving register operands are invariably shorter
and faster than those involving operands in memory. So
efficient utilization of registers is particularly important.
Code Generation Optimization
Basic Blocks (BB)
• A basic block is defined as a sequence of consecutive
statements with only one entry (at the beginning) and one exit
(at the end).
• Equivalently, it is a group of atoms or intermediate code that
contains no label or branch atoms (e.g., LBL, TST, JMP).
• In order to determine all the basic blocks in a program, we
need to identify the leaders, the first statements of the basic
blocks.
• Any statement that satisfies any of the following conditions is a
leader:
– The first statement of the program is a leader.
– Any statement that is the target of a goto (jump) is a leader.
– Any statement that immediately follows a goto (jump) is a leader.
• A basic block is the portion of code from one leader up to, but
not including, the next leader, or to the end of the program.
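The leader rules and block boundaries above can be sketched in code. This is a minimal illustration, not taken from the slides: it assumes a hypothetical IR where each instruction is a dict, jump atoms (JMP, TST) carry a "target" index, and everything else is a plain operation.

```python
# Hypothetical three-address IR: each instruction is a dict with an
# "op" field; jump atoms ("JMP", "TST") also carry a "target" index.

def find_leaders(instrs):
    """Apply the three leader rules to a list of instructions."""
    leaders = set()
    if instrs:
        leaders.add(0)                      # rule 1: first statement
    for i, ins in enumerate(instrs):
        if ins["op"] in ("JMP", "TST"):
            leaders.add(ins["target"])      # rule 2: jump target
            if i + 1 < len(instrs):
                leaders.add(i + 1)          # rule 3: statement after a jump
    return sorted(leaders)

def basic_blocks(instrs):
    """Split the instruction list into blocks, one per leader."""
    leaders = find_leaders(instrs)
    blocks = []
    for j, start in enumerate(leaders):
        end = leaders[j + 1] if j + 1 < len(leaders) else len(instrs)
        blocks.append(instrs[start:end])
    return blocks

prog = [
    {"op": "ADD"},                 # 0: leader (first statement)
    {"op": "TST", "target": 3},    # 1: conditional jump to 3
    {"op": "MUL"},                 # 2: leader (follows a jump)
    {"op": "MOV"},                 # 3: leader (jump target)
]
print([len(b) for b in basic_blocks(prog)])  # → [2, 1, 1]
```

The small program splits into three blocks: the two statements before the jump target, the fall-through statement, and the target statement itself.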
• Example of an atom sequence which cannot be optimized:
(ADD, b, c, T1)
(LBL, L1)
(ADD, b, c, T2)
(MUL, T1, T2, T3)
(TST, b, c, 1, L3)
(MOV, T3, a)
DAG Representation of a Basic Block
• Many optimizing transformations can be implemented using
DAG representation of a Basic Block.
• DAG stands for Directed Acyclic Graph i.e. a graph with
directed edges and no cycles.
• A DAG is very much like a tree but differs in that it may contain
shared nodes; shared nodes indicate common sub-expressions.
• A DAG has the following components:
– Leaves are labeled by unique identifiers, either variable names or
constants.
– Interior nodes are labeled by an operator symbol.
– Nodes are optionally given an extra set of identifiers known as attached
identifiers.
DAG Construction (1)
• We assume there are initially no nodes, and NODE () is
undefined for all arguments.
• Each three-address statement has one of three forms:
(i) A = B op C
(ii) A = op B
(iii) A = B
• We shall do the following steps (1) through (3) for each 3-
address statement of the basic block:
1) If NODE (B) is undefined, create a leaf labeled B, and let
NODE (B) be this node. In case (i), if NODE (C) is undefined,
create a leaf labeled C and let that leaf be NODE (C).
DAG Construction (2)
2) In case (i), determine whether there is a node labeled op whose
left child is NODE (B) and whose right child is NODE (C). (This
is to catch common sub-expressions.) If not, create such a
node. In case (ii), determine whether there is a node labeled
op whose lone child is NODE (B). If not, create such a node.
Let n be the node found or created in both cases. In case (iii),
let n be NODE (B).
3) Append A to the list of attached identifiers for the node n in
(2). Delete A from the list of attached identifiers for NODE
(A). Finally, set NODE (A) to n.
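Steps (1) through (3) can be sketched directly in code. This is an illustrative implementation under assumed conventions: statements are tuples `(A, op, B, C)` for case (i), `(A, op, B)` for case (ii), and `(A, None, B)` for case (iii); the `DagNode` class and naming are not from the slides.

```python
# Sketch of the three-step DAG construction for a basic block.
class DagNode:
    def __init__(self, label, children=()):
        self.label = label            # operator symbol or leaf identifier
        self.children = tuple(children)
        self.ids = []                 # attached identifiers

def build_dag(stmts):
    node = {}                         # NODE(): identifier -> DagNode
    op_nodes = {}                     # (op, child ids) -> DagNode, for CSE

    def leaf(x):
        # step 1: create a leaf the first time an operand is seen
        if x not in node:
            node[x] = DagNode(x)
        return node[x]

    for a, op, *operands in stmts:
        if op is None:                # case (iii): A = B
            n = leaf(operands[0])
        else:                         # cases (i) and (ii)
            kids = tuple(leaf(x) for x in operands)
            key = (op,) + tuple(id(k) for k in kids)
            if key not in op_nodes:   # step 2: reuse common sub-expressions
                op_nodes[key] = DagNode(op, kids)
            n = op_nodes[key]
        # step 3: move A onto n's attached-identifier list
        if a in node and a in node[a].ids:
            node[a].ids.remove(a)
        n.ids.append(a)
        node[a] = n
    return node

# a := b + c ; b := b - d ; c := c + d ; e := b + c
nodes = build_dag([("a", "+", "b", "c"),
                   ("b", "-", "b", "d"),
                   ("c", "+", "c", "d"),
                   ("e", "+", "b", "c")])
# e gets a fresh "+" node, not a's node: b and c were redefined in between
print(nodes["e"] is nodes["a"])  # → False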
Example: DAG construction from BB (1)
• Find a DAG for the given block:
a := b + c
b := a – d
d := a – d
c := d + c
[DAG figure: leaves b0, c0, d0; a is attached to the node b0 + c0; b and d share the node a – d0; c is attached to the node d + c0]
Example: DAG construction from BB (2)
• Find a DAG for the block:
a := b + c
b := b – d
c := c + d
e := b + c
[DAG figure: leaves b0, c0, d0; a is attached to b0 + c0; b to b0 – d0; c to c0 + d0; e to a new node b + c over the – and + nodes]
Example: DAG construction from BB (3)
t1 := 4 * i
t2 := a [ t1 ] + t5,i
t3 := 4 * i
t4 := b [ t3 ]
t5 := t2 + t4 t4 [] [] t2
i := t5
* t1, t3
b a 4 i
17
Optimization of Basic Blocks (1)
• Common sub-expression elimination: by construction of
DAG
– Note: for common sub-expression elimination, we are
actually targeting expressions that compute the same
value.
a := b + c
b := b – d
c := c + d
e := b + c
Here b + c appears twice, but the two occurrences do not
generate the same result, since b and c are redefined in between.
Optimization of Basic Blocks (2)
• Dead code elimination: code generation from the DAG
eliminates dead code.
Original block:
a := b + c
b := a – d
d := a – d
c := d + c
b is not live on exit, so b is dropped from the attached
identifiers of the a – d node and no code is generated for it:
a := b + c
d := a – d
c := d + c
DAG to BB
• The algorithm given below will generate atoms (in reverse
order) in which all common sub-expressions are evaluated
only once:
1. Choose any node having no incoming arcs (initially there
should be only one such node, representing the value of the
entire expression).
2. Put out an atom for its operation and its operands.
3. Delete this node and its outgoing arcs from the DAG.
4. Repeat from Step 1 as long as there are still operation nodes
remaining in the DAG.
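The four steps above amount to emitting atoms in reverse topological order. Here is a small sketch under assumed conventions: the DAG is a dict mapping each result name to `(op, operand-list)`, and "incoming arcs" are tracked as reference counts; none of these names come from the slides.

```python
# Sketch: emit atoms from a DAG so each common sub-expression is
# computed once. Operands may name other DAG nodes or plain variables.
def dag_to_atoms(nodes):
    refs = {r: 0 for r in nodes}
    for op, operands in nodes.values():
        for x in operands:
            if x in refs:
                refs[x] += 1               # x has an incoming arc
    atoms = []
    remaining = set(nodes)
    while remaining:
        # step 1: choose a node with no incoming arcs
        r = next(n for n in remaining if refs[n] == 0)
        op, operands = nodes[r]
        atoms.append((op, *operands, r))   # step 2: put out its atom
        remaining.remove(r)                # step 3: delete the node...
        for x in operands:                 # ...and its outgoing arcs
            if x in refs:
                refs[x] -= 1
    return list(reversed(atoms))           # atoms came out in reverse

# DAG for a*b + a*b + a*b after merging the three identical products
dag = {"T1": ("MUL", ["a", "b"]),
       "T3": ("ADD", ["T1", "T1"]),
       "T5": ("ADD", ["T3", "T1"])}
for atom in dag_to_atoms(dag):
    print(atom)
# → ('MUL', 'a', 'b', 'T1')
#   ('ADD', 'T1', 'T1', 'T3')
#   ('ADD', 'T3', 'T1', 'T5')
```

Only the single MUL atom survives for the three occurrences of a*b, matching the optimized sequence on the following slides.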
DAG to BB (1)
• Construct the DAG and show the optimized sequence of atoms
for the C++ expression a ∗ b + a ∗ b + a ∗ b
• The atom sequence as put out by the parser would be:
(MUL, a, b, T1)
(MUL, a, b, T2)
(ADD, T1, T2, T3)
(MUL, a, b, T4)
(ADD, T3, T4, T5)
DAG to BB (2)
(MUL, a, b, T1)
(MUL, a, b, T2)
(ADD, T1, T2, T3)
(MUL, a, b, T4)
(ADD, T3, T4, T5)
[DAG figure: a single * node over a and b with attached identifiers T1, T2, T4; an + node (T3) with both arcs to the * node; a top + node (T5) over T3 and the * node]
DAG to BB (3)
• Generating the optimized code (BB) based on the algorithm:
MUL a, b, T1.2.4
ADD T1.2.4, T1.2.4, T3
ADD T3, T1.2.4, T5
Peephole optimization
• Any optimization applied to the generated code is
considered local.
• Local optimization techniques are often called peephole
optimization.
• The following are examples of program transformations that are
characteristic of peephole optimization:
load/store optimization,
jump over jump optimization, and
simple algebraic optimization.
load/store optimization
• Removes redundant loads and stores.
• Example: a + b - c
• The parser would translate the expression a + b - c into the
following stream of atoms:
(ADD, a, b, T1)
(SUB, T1, c, T2)
• The code generator generates three instructions
corresponding to each atom:
– load the first operand into a register (LOD), perform
the operation, and store the result back to memory (STO).
• The code generator would then produce the following
instructions from the atoms.
Unoptimized:
LOD R1,a
ADD R1,b
STO R1,T1
LOD R1,T1
SUB R1,c
STO R1,T2
Optimized:
LOD R1,a
ADD R1,b
SUB R1,c
STO R1,T2
• The third and fourth instructions in this sequence are entirely
unnecessary since the value being stored and loaded is already
at its destination.
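This transformation can be sketched as a two-instruction peephole. The sketch below uses an assumed instruction format of `(opcode, register, operand)` tuples and assumes temporaries are named `T...`; both conventions are illustrative, not from the slides.

```python
# Peephole: drop a STO immediately followed by a LOD of the same
# temporary into the same register -- the value is already in place.
def peephole_load_store(code, is_temp=lambda x: x.startswith("T")):
    out = []
    i = 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if (nxt is not None
                and cur[0] == "STO" and nxt[0] == "LOD"
                and cur[1] == nxt[1]       # same register
                and cur[2] == nxt[2]       # same memory operand
                and is_temp(cur[2])):      # a dead temporary
            i += 2                         # drop both instructions
        else:
            out.append(cur)
            i += 1
    return out

before = [("LOD", "R1", "a"), ("ADD", "R1", "b"),
          ("STO", "R1", "T1"), ("LOD", "R1", "T1"),
          ("SUB", "R1", "c"), ("STO", "R1", "T2")]
for ins in peephole_load_store(before):
    print(ins)
# → ('LOD', 'R1', 'a')  ('ADD', 'R1', 'b')
#   ('SUB', 'R1', 'c')  ('STO', 'R1', 'T2')
```

Dropping the store as well as the load is only safe when the temporary is not used later; a real implementation would consult liveness information rather than a naming convention.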
Simple algebraic optimizations
• It is worth recognizing single instructions with a constant operand:
A * 1 = A
A * 0 = 0
A / 1 = A
A * 2 = A + A
A^2 = A * A
• The following instructions can be eliminated because
multiplying a value by 1, or adding 0 to a value, does not
change that value.
MUL R1, 1
ADD R1, 0
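A subset of these identities can be sketched as a one-instruction peephole pass. The `(opcode, register, constant)` tuple format is an assumption for illustration; the A * 0 = 0 and A^2 = A * A cases are omitted since they require rewriting rather than deleting instructions.

```python
# Peephole pass over (opcode, register, constant) instructions,
# applying A*1=A, A/1=A, A+0=A, and A*2=A+A.
def algebraic_peephole(code):
    out = []
    for op, reg, k in code:
        if (op, k) in (("MUL", 1), ("DIV", 1), ("ADD", 0)):
            continue                       # identity: drop the instruction
        if op == "MUL" and k == 2:
            out.append(("ADD", reg, reg))  # strength reduction: A*2 = A+A
        else:
            out.append((op, reg, k))
    return out

code = [("MUL", "R1", 1), ("ADD", "R1", 0), ("MUL", "R1", 2)]
print(algebraic_peephole(code))  # → [('ADD', 'R1', 'R1')]
```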