Chapter eight
Code generation and Optimization
Introduction code optimization
Optimization is a program transformation technique, which tries to improve the code by
making it consume fewer resources (i.e. CPU, Memory) and deliver high speed.
In optimization, high-level general programming constructs are replaced by very efficient
low-level programming codes. A code optimizing process must follow the three rules given
below:
The output code must not, in any way, change the meaning of the program.
Optimization should increase the speed of the program and if possible, the program
should demand less number of resources.
Optimization should itself be fast and should not delay the overall compiling process.
Introduction code optimization
Efforts for an optimized code can be made at various levels of compiling the process.
At the beginning, users can change/rearrange the code or use better algorithms
to write the code.
After generating intermediate code, the compiler can modify the intermediate
code by address calculations and improving loops.
While producing the target machine code, the compiler can make use of
memory hierarchy and CPU registers.
Optimization can be categorized broadly into two types: machine independent and
machine dependent.
Introduction code optimization
o Machine independent optimization:- In this optimization, the compiler takes in the intermediate code
and transforms a part of the code that does not involve any CPU registers and/or absolute memory
locations.
For example:
do{
item = 10;
value = value + item;
} while(value<100);
This code involves repeated assignment of the identifier item, which if we put this way:
item = 10;
do{
value = value + item;
} while(value<100);
should not only save the CPU cycles, but can be used on any
Introduction code optimization
Machine-dependent Optimization:-is done after the target code has been
generated and when the code is transformed according to the target machine
architecture.
It involves CPU registers and may have absolute memory references
rather than relative references.
Machine-dependent optimizers put efforts to take maximum advantage of
memory hierarchy.
Peephole optimization
• Peephole is the machine dependent optimization.
• It is a type of Code Optimization performed on a small part of the code or small
set of instruction.
• This optimization technique works locally on the source code to transform it into
an optimized code.
• These methods can be applied on intermediate codes as well as on target codes.
• It basically works on the theory of replacement in which a part of code is
replaced by shorter and faster code without change in output.
Peephole optimization (cont…)
• Objectives of Peephole Optimization:
• To improve performance
• To reduce memory footprint
• To reduce code size
• Peephole optimization techniques:
• Redundant instruction elimination
• Unreachable code
• Flow of control optimization
• Algebraic expression simplification
• Reduction in Strength
Peephole optimization technique
Redundant instruction elimination
• At compilation level, the compiler searches for instructions redundant in nature.
• Multiple loading and storing of instructions may carry the same meaning even if
some of them are removed.
• For example: MOV x, R0
MOV R0, R1
• We can delete the first instruction and re-write the sentence as:
MOV x, R1
Peephole optimization technique(cont…)
Unreachable code
• Unreachable code is a part of the program code that is never accessed because of
programming constructs.
• Programmers may have accidently written a piece of code that can never be
reached.
• Example: void add_ten(int x)
{ return x + 10;
printf(“value of x is %d”, x); }
• In this code segment, the printf statement will never be executed as the program
control returns back before it can execute, hence printf can be removed.
Peephole optimization technique(cont…)
Flow of control optimization
• There are instances in a code where the program control jumps back and forth without
performing any significant task.
• These jumps can be removed. Consider the following chunk of code:
...
MOV R1, R2
GOTO L1
...
L1 : GOTO L2
L2 : INC R1
• In this code, label L1 can be removed as it passes the control to L2.
• So instead of jumping to L1 and then to L2, the control can directly reach L2, as shown below:
...
MOV R1, R2
GOTO L2
...
L2 : INC R1
Peephole optimization technique(cont…)
• Algebraic expression simplification
• Algebraic expressions can be made simple by applying simplification rules.
• For example The expression a = a + 0 can be replaced by a itself and the
expression a = a + 1 can simply be replaced by INC a.
• Strength reduction
• There are operations that consume more time and space.
• Their ‘strength’ can be reduced by replacing them with other operations that
consume less time and space, but produce the same result.
• For example,
• x² is invariably cheaper to implement as x*x .
• 2*x is invariably cheaper to implement as x+x
Code Generation
• Code generation can be considered as the final phase of compilation.
• Through post code generation, optimization process can be applied on the code, but
that can be seen as a part of code generation phase itself.
• The code generated by the compiler is an object code of some lower level
programming language, for example, assembly language.
• A lower-level object code should have the following minimum properties:
• It should carry the exact meaning of the source code.
• It should be efficient in terms of CPU usage and memory management.
Code Generation(cont…)
Register allocation
Efficient and careful management of registers results in a fast program.
Registers are managed by the compiler, not by the programmer.
Two problems:
Register allocation: select the variables that will reside in registers.
Register assignment: pick the register that a variable will reside in. Finding
an optimal assignment of registers to variables is mathematically difficult.
In addition, the hardware / OS may require some register usage rules to be
followed.
Code Generation(cont…)
Descriptors
Register and Address Descriptors
Descriptors are used by the code generating algorithm to keep track of register
contents and addresses for names.
The code generator has to track both the registers (for availability) and
addresses (location of values) while generating the code.
1) A register descriptor: keeps track of what is currently in each register.
It is consulted whenever a new register is needed.
Initiallyall registers are empty.
2) An address descriptor: keeps track of the location(s) where the current value
of the name can be found at runtime.
The location can be a register, a memory address, a stack location, or a
combination of these.
This information can be stored in the symbol table.
Code Generation(cont…)
A Simple Code Generation Algorithm
• getReg : Code generator uses getReg function to determine the status of available
registers and the location of name values.
• getReg works as follows:
• If variable Y is already in register R, it uses that register.
• Else if some register R is available, it uses that register.
• Else if both the above options are not possible, it chooses a register that
requires minimal number of load and store instructions.
Code Generation(cont…)
A Simple Code Generation Algorithm
• The algorithm takes as input a sequence of three-address statements constituting a basic
block.
• For each three-address statement of the form x = y op z , perform the following actions:
1. Invokes a function getreg to determine the location L where the result of the
computation y op z should be stored.
2. Consult the address descriptor for y to determine y’, the current location of y. Prefer
the register for y’ if the value of y is currently both in memory and a register. If the
value of y is not already in L, generate the instruction MOV y’ , L to place a copy of
y in L.
3. Generate the instruction OP z’ , L where z’ is a current location of z. Prefer a register
to a memory location if z is in both. Update the address descriptor of x to indicate that
x is in location L. If x is in L, update its descriptor and remove x from all other
descriptors.
4. If the current values of y or z have no next uses, are not live on exit from the block,
and are in registers, alter the register descriptor to indicate that, after execution of x :
= y op z , those registers will no longer contain y or z.
Directed Acyclic Graph for Register Allocation
• Code generation from DAG is much simpler than the linear sequence of three
address code.
• DAG can be used to rearrange sequence of instructions and generate and efficient
code.
• The steps involved in the algorithm to generate code from DAG include :
• Rearranging the order – To optimize the code generation, the instructions
are rearranged and this is referred to as heuristic reordering .
• Labelling the tree for register information – To know the number of
registers required to generate code, the labels of the nodes are numbered
which indicate the number of registers required to evaluate that node.
• Tree traversal to generate code – This reordered labelled tree is traversed to
generate code based on the target language’s instruction.
Directed Acyclic Graph for Register Allocation(cont…)
• DAG consists of :
• Leaf nodes represent identifiers, names or constants.
• Interior nodes represent operators.
• Interior nodes also represent the results of expressions or the identifiers/name
where the values are to be stored or assigned