Unit-4 Compiler Design TEC
Intermediate code generation – Variants of syntax trees, Three address code, Types and
declarations, Translation of expressions, Type checking, Control flow, Back patching.
Variants of Syntax trees
1 Q: Explain briefly about variants of syntax trees.
A syntax tree consists of nodes and their children.
i) Nodes – They represent constructs in the source program.
ii) Children – They represent the meaningful components of a construct.
Directed Acyclic Graphs (DAG) for expressions (OR) Procedure for constructing DAG
In a given expression, the common sub-expressions (sub-expressions occurring more
than once) are identified by a Directed Acyclic Graph (DAG).
DAG construction is similar to syntax tree construction.
The difference between them is: a node N in a DAG can have more than one parent if
N represents a common sub-expression, whereas in a syntax tree the common sub-
expression is represented repeatedly, as many times as it appears in the original
expression.
For example, let us construct the DAG for the expression:
e * ( b - c ) + ( b - c ) * a
Here the sub-expression ( b - c ) appears twice; it is the common sub-expression.
Step 1
First, construct the DAG for the common sub-expression ( b - c ).
Step 2
Construct the DAG for e * ( b - c ).
Step 3
There is no need to construct ( b - c ) again, so consider ( b - c ) * a.
Step 4
Finally, draw the DAG for the full expression e * ( b - c ) + ( b - c ) * a.
The steps for constructing the DAG are as follows.
Symbol   Operation
b        P1 = mkleaf( id, pointer to entry for b )
c        P2 = mkleaf( id, pointer to entry for c )
-        P3 = mknode( -, P1, P2 )
e        P4 = mkleaf( id, pointer to entry for e )
*        P5 = mknode( *, P4, P3 )
a        P6 = mkleaf( id, pointer to entry for a )
*        P7 = mknode( *, P3, P6 )
+        P8 = mknode( +, P5, P7 )
Consider the following SDD to produce a syntax tree or DAG.
SNO  PRODUCTION     SEMANTIC RULE
1.   E → E1 + T     E.Node = new Node( '+', E1.Node, T.Node )
2.   E → E1 - T     E.Node = new Node( '-', E1.Node, T.Node )
3.   E → T          E.Node = T.Node
4.   T → ( E )      T.Node = E.Node
5.   T → id         T.Node = new Leaf( id, id.entry )
6.   T → digit      T.Node = new Leaf( digit, digit.val )
Value Number method for constructing DAGs
An array of records can be used to store the nodes of a syntax tree or DAG.
Each row of the array represents one record, and therefore one node.
The first field, present in every record, is an operation code that indicates the label of
the node.
Leaves have one additional field that holds the lexical value (either a pointer into the
symbol table or a constant).
Interior nodes have two additional fields that represent the left and right children.
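The value-number construction can be sketched in C as follows. This is only a minimal illustration of the idea, assuming a fixed-size global array of node records; the names Node, mkleaf and mknode mirror the operations used above.

#define MAXNODES 1000

struct Node {
    char op;          /* operator, or 'i' for an identifier leaf        */
    int  isleaf;      /* 1 = leaf, 0 = interior node                    */
    int  left, right; /* value numbers of the children (interior nodes) */
    int  lexval;      /* symbol-table index or constant value (leaves)  */
};

static struct Node nodes[MAXNODES];   /* the array of records */
static int nnodes = 0;

/* Leaf: reuse an existing leaf with the same label and lexical value. */
int mkleaf(char op, int lexval)
{
    for (int i = 0; i < nnodes; i++)
        if (nodes[i].isleaf && nodes[i].op == op && nodes[i].lexval == lexval)
            return i;
    nodes[nnodes] = (struct Node){ op, 1, -1, -1, lexval };
    return nnodes++;
}

/* Interior node: reuse an existing node with the same operator and children.
   This reuse is exactly how a common sub-expression becomes one DAG node.   */
int mknode(char op, int left, int right)
{
    for (int i = 0; i < nnodes; i++)
        if (!nodes[i].isleaf && nodes[i].op == op &&
            nodes[i].left == left && nodes[i].right == right)
            return i;
    nodes[nnodes] = (struct Node){ op, 0, left, right, 0 };
    return nnodes++;
}

For the expression e * ( b - c ) + ( b - c ) * a, calling these functions in the order shown in the table above returns the same value number for both occurrences of b - c, so the sub-expression is stored only once.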
2 Q: Explain about Three address code (TAC or 3AC)
TAC is a sequence of statements of the general form
x = y op z
where x, y, z are names, constants or compiler-generated temporaries, and op stands for
any operator, such as a fixed- or floating-point arithmetic operator or a logical operator on
boolean-valued data. The name three-address refers to the three addresses involved, i.e. the
addresses of x, y and z.
For example, if x = a + b * c then the TAC is,
t1 = b * c
t2 = a + t1
x = t2
Where t1 and t2 are compiler generated temporary variables.
Addresses and Instructions
The different forms of TAC instructions are:
Assignment statements of the form x = y op z (op is a binary operator).
Assignment instructions of the form x = op y (op is a unary operation such as unary minus).
Copy statements of the form x = y (the value of y is assigned to x).
Unconditional jumps of the form goto L.
Conditional jumps such as if x relop y goto L
(relop is one of the relational operators <>, <, >, <=, >=, = ).
param x and call p, n for procedure calls, and return y where y represents an optional
returned value. A call of procedure p with n parameters is written as
param x1
param x2
...
param xn
call p, n
The following are the different types of TAC
1. x = y op z Assignment ( op = binary arithmetic or logical
operation)
2. x = op y Unary assignment ( op = unary operation )
3. x=y Copy ( x is assigned the value of y )
4. goto L Unconditional jump
5. if x relop y goto L Conditional jump ( relop is <>, <, >, <=, >=, = )
6. Param x Procedure call
7. call p, n Procedure call
8. return z Procedure call
9. a=b[i] Index assignment
10. x[i]=y Index assignment
11. x = &y Pointer assignment
x = *y
12. *x = y Pointer assignment
3 Q: Briefly discuss about the implementations of TAC or Structure of 3AC.
Three address code is an abstract form of intermediate code that can be
implemented as a record with the address fields.
There are three representations used for three address code.
They are
1. Quadruple representation.
2. Triple representation.
3. Indirect triple representation.
Quadruple representation
A quadruple is a structure with at most four fields:
op arg1 arg2 result
The op field is used to represent the internal code for the operator.
arg1 and arg2 represent the two operands, and the result field is used to store the
result of the expression.
For example, consider the input string x = - a * b + - a * b
The following is the three address code.
t1 = uminus a
t2 = t1 * b
t3 = uminus a
t4 = t3 * b
t5 = t2 + t4
x = t5
The following is the quadruple representation.
op arg1 arg2 result
uminus a t1
* t1 b t2
uminus a t3
* t3 b t4
+ t2 t4 t5
= t5 x
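One possible way to lay out the quadruple array in C is sketched below; the field sizes and names are illustrative only.

#define MAXQUADS 100

struct Quad {
    char op[8];      /* operator, e.g. "uminus", "*", "+", "="              */
    char arg1[8];    /* first operand (name, constant or temporary)         */
    char arg2[8];    /* second operand; empty for unary and copy operations */
    char result[8];  /* temporary or name that receives the value           */
};

/* The quadruples for x = - a * b + - a * b, in the order shown above. */
struct Quad quads[MAXQUADS] = {
    { "uminus", "a",  "",   "t1" },
    { "*",      "t1", "b",  "t2" },
    { "uminus", "a",  "",   "t3" },
    { "*",      "t3", "b",  "t4" },
    { "+",      "t2", "t4", "t5" },
    { "=",      "t5", "",   "x"  },
};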
Triple representation
In triples, the use of temporary names is avoided by referring, instead, to the position
(index) of the triple that computes the value.
A triple is a structure with at most three fields:
op arg1 arg2
For example, consider the input string x = - a * b + - a * b
The following is the three address code.
t1 = uminus a
t2 = t1 * b
t3 = uminus a
t4 = t3 * b
t5 = t2 + t4
x = t5
The following is the triple representation.
      op      arg1   arg2
(0)   uminus  a
(1)   *       (0)    b
(2)   uminus  a
(3)   *       (2)    b
(4)   +       (1)    (3)
(5)   =       x      (4)
Indirect triple representation
Indirect triples use an additional array that lists pointers to the triples in the desired execution order.
For example, consider the input string x = - a * b + - a * b
The following is the three address code.
t1 = uminus a
t2 = t1 * b
t3 = uminus a
t4 = t3 * b
t5 = t2 + t4
x = t5
The following is the indirect triple representation. The statement list gives the execution
order by pointing to the triples stored at ( 11 ) – ( 16 ):
Statement list              Triples:  op      arg1    arg2
(0)  ( 11 )                 ( 11 )    uminus  a
(1)  ( 12 )                 ( 12 )    *       ( 11 )  b
(2)  ( 13 )                 ( 13 )    uminus  a
(3)  ( 14 )                 ( 14 )    *       ( 13 )  b
(4)  ( 15 )                 ( 15 )    +       ( 12 )  ( 14 )
(5)  ( 16 )                 ( 16 )    =       x       ( 15 )
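Triples and indirect triples can be laid out in a similar, sketch-only way in C: each operand is either a name or the index of an earlier triple, and the indirect form adds a separate statement list of indices (all field names here are illustrative).

struct Triple {
    char op[8];
    int  arg1;       /* index of an earlier triple, or -1 when name1 is used */
    char name1[8];
    int  arg2;       /* index of an earlier triple, or -1 when name2 is used */
    char name2[8];
};

/* Triples (0)-(5) for x = - a * b + - a * b. */
struct Triple triples[] = {
    { "uminus", -1, "a", -1, ""  },   /* (0) */
    { "*",       0, "",  -1, "b" },   /* (1) */
    { "uminus", -1, "a", -1, ""  },   /* (2) */
    { "*",       2, "",  -1, "b" },   /* (3) */
    { "+",       1, "",   3, ""  },   /* (4) */
    { "=",      -1, "x",  4, ""  },   /* (5) */
};

/* Indirect triples: the statements are reordered by editing only this list. */
int stmt_list[] = { 0, 1, 2, 3, 4, 5 };   /* plays the role of pointers (11)-(16) above */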
4 Q: Briefly explain about Types and Declarations
A compiler has to perform semantic checking along with syntactic checking.
Semantic checks can be static (performed during compilation) or dynamic (performed at run time).
Applications of types can be grouped under checking and translation.
Type checking
The process of verifying and enforcing the constraints of types – type checking –
may occur either at compile time (static check) or run time (dynamic check).
A “type checker” implements the type system.
A “type system” is a collection of rules for assigning type expressions to the parts of
a program.
Translation applications
From the type of a name, the compiler can determine the storage that the name will require at run time.
Type expressions
A type expression denotes the type of a language construct. It can be a basic type (a
primitive type such as int, real, char or boolean), the special basic type error, or
void (no type).
Type equivalence
Type equivalence is judged in two ways.
Name equivalence: two type expressions are equivalent only if they have the same type
name. For example, given the declarations
type link = ^cell;
var a, b : link;
    c : ^cell;
a and b have the same type under name equivalence, while c has a different type.
Structural equivalence: two type expressions are equivalent if they have the same
structure. Under structural equivalence, a, b and c all have the same type.
Declaration
Whenever a declaration is encountered, a symbol-table entry is created for every local
name in the procedure.
The symbol-table entry should contain the type of the name, how much storage the name
requires, and its relative offset.
Storage layout for local names
If we know the type of a name, we can determine the amount of storage required for
the name at run time.
Type and relative address are saved in the symbol table entry for the name.
The ‘width’ of type is defined as the number of storage units required for objects of
that type.
A syntax-directed translation (SDT) is used to compute types and their widths for basic and array types.
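A minimal sketch of the width computation for basic and array types is shown below; the sizes of int and real are assumed to be 4 and 8 storage units (they vary between machines), and an array of size n with element type t has width n * width(t).

enum Basic { T_INT, T_REAL };

struct Type {
    int is_array;        /* 0 = basic type, 1 = array type              */
    enum Basic basic;    /* used when is_array == 0                     */
    struct Type *elem;   /* element type, used when is_array == 1       */
    int size;            /* number of elements, used when is_array == 1 */
};

/* width(t): the number of storage units needed for an object of type t. */
int width(struct Type *t)
{
    if (t->is_array)
        return t->size * width(t->elem);   /* array(n, t) -> n * width(t)   */
    return t->basic == T_INT ? 4 : 8;      /* assumed widths for int / real */
}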
5 Q: Explain about Type checking.
Rules for Type Checking (TC)
Type checking rules take two forms: type synthesis (TS) and type inference (TI).
Type synthesis builds the type of an expression from the types of its subexpressions.
It requires names to be declared before they are used.
Type inference determines the type of a language construct from the way the construct
is used.
Type conversions
Type conversion rules differ from one language to another. Java uses widening and
narrowing conversions between primitive types.
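For example, assuming x and y are declared float and i is an int, a statement such as x = y + i may have its widening conversion made explicit in the intermediate code (the operator name inttofloat follows common textbook usage):
t1 = inttofloat i
t2 = y + t1
x = t2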
Unification
Unification is the problem of testing whether two type expressions can be made equal, for example by substituting types for any type variables they contain.
6 Q: Explain about Control Flow in Intermediate code generation.
The main idea in converting any flow-of-control statement to 3AC is to simulate
the "branching" of the control flow.
Boolean expression in programming languages are often used to:
1. Alter the flow of control.
2. Compute logical value.
Flow of control statements may be converted to 3AC by using following functions:
1. new label – returns a new symbolic label each time it is called.
2. gen ( ) – generates the code (string) passed as parameter to it.
The following attributes are associated with non-terminals for the code generation:
1. code– contains the generated 3AC.
2. true – contains the label to which a jump takes place if the Boolean expression
associated evaluates to true.
3. false – contains the label to which a jump takes place if the Boolean expression
associated evaluates to false.
4. begin – contains the label / address pointing to the beginning of the code block
for the statement ‘generated’ by the non-terminal.
Boolean expressions
Boolean expressions consist of the boolean operators AND ( && ), OR ( || ) and
NOT ( ! ), as in C, applied to elements that are boolean variables or relational
expressions of the form E1 rel E2, where E1 and E2 are arithmetic expressions and
rel is a relational operator ( <, <=, >, >=, !=, == ).
For example, E → E && E | E || E | !E | (E) | E1 rel E2 | true | false
Short circuit code
A given boolean expression can be translated into 3AC without generating code for
any of the boolean operators and without having the code necessarily evaluate the
entire expression. This is called jumping code or short-circuit code.
For example, if ( a < 10 || a > 15 && a != b )
    x = 1;
translates to the following jumping (short-circuit) code:
        if a < 10 goto L1
        ifFalse a > 15 goto L2
        ifFalse a != b goto L2
L1:     x = 1
L2:
Flow of control statements
The main idea in converting any flow-of-control statement to 3AC is to simulate
the "branching" of the flow of control using the goto statement.
Consider the following grammar,
S → if E then S1 | if E then S1 else S2 | while E do S1
The simulation of the flow-of-control branching for each statement is depicted
pictorially as follows:
if – then
if – then else
while – do
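Since only the branching diagrams are shown, the shape of the generated three-address code can also be written out directly. Here E.true, E.false, S.begin and S.next denote the labels attached to the corresponding points of the code (notation assumed from the attributes listed above):

S → if E then S1
        code for E (jumps to E.true or E.false)
E.true:  code for S1
E.false: ...

S → if E then S1 else S2
        code for E (jumps to E.true or E.false)
E.true:  code for S1
         goto S.next
E.false: code for S2
S.next:  ...

S → while E do S1
S.begin: code for E (jumps to E.true or E.false)
E.true:  code for S1
         goto S.begin
E.false: ...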
7 Q: Explain about Backpatching.
The main problem in generating 3AC in a single pass is that we may not yet know the
labels that control must go to at the time the jump statements are generated.
To get rid of this problem, a series of branching statements is generated with the targets
of the jumps temporarily left unspecified.
Backpatching is filling in the proper target address of each such jump once that label
is determined.
One pass code generation using back patching:
Back patching algorithms perform three types of operations:
1. Makelist ( i ) – Creates a new list containing only ‘ i ‘, an index into the array of
quadruples and returns pointer to the list it has made.
2. Merge ( i, j ) – Concatenates the lists pointed by ‘i’ and ‘j’ and returns a pointer
to the concatenated list.
3. Back patch( P, i ) – Inserts ‘i’ as the target label for each of the statements on the
list pointed by P.
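A minimal C sketch of these three operations is given below, assuming the generated quadruples are kept in a global array and an incomplete jump simply leaves its target field to be filled in later; the type and field names are illustrative.

#include <stdlib.h>

struct Quad { char op[12]; char arg1[12]; char arg2[12]; int target; };
extern struct Quad quads[];        /* the array of generated quadruples */

/* A list of indices of quadruples whose jump targets are still unknown. */
struct List { int quad_index; struct List *next; };

/* makelist(i): a new list containing only the quadruple index i. */
struct List *makelist(int i)
{
    struct List *p = malloc(sizeof *p);
    p->quad_index = i;
    p->next = NULL;
    return p;
}

/* merge(p1, p2): concatenate the two lists and return the result. */
struct List *merge(struct List *p1, struct List *p2)
{
    struct List *p = p1;
    if (p1 == NULL) return p2;
    while (p->next != NULL) p = p->next;
    p->next = p2;
    return p1;
}

/* backpatch(p, i): make i the target of every jump on the list p. */
void backpatch(struct List *p, int i)
{
    for (; p != NULL; p = p->next)
        quads[p->quad_index].target = i;
}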
Unit-5 Compiler Design TEC
Runtime environments – Stack allocation of space, Access to non-local data on the stack,
Heap management.
Code generation – Issues in the design of code generation, The target language, Addresses
in the target code, Basic blocks and flow graphs, A simple code generator.
1 Q: Stack allocation of space
Stack area is used to allocate space to the local variables whenever the procedure is
called. This space is popped off the stack when the procedure terminates.
Activation Trees
Activation trees are used to describe the nesting of procedure calls; this nesting is what
makes efficient stack allocation of activation-record space feasible.
The nesting of procedure calls can be illustrated using the following quicksort
example.
program sort( input, output );
var a : array[ 0 .. 10 ] of integer;
procedure readarray;
var i : integer;
begin
    for i := 1 to 9 do read( a[ i ] )
end;
function partition( y, z : integer ) : integer;
var i, j, x, v : integer;
begin ...
end;
procedure quicksort( m, n : integer );
var i : integer;
begin
    if ( n > m ) then begin
        i := partition( m, n );
        quicksort( m, i - 1 );
        quicksort( i + 1, n )
    end
end;
begin
    a[ 0 ] := -9999; a[ 10 ] := 9999;
    readarray;
    quicksort( 1, 9 )
end.
The following is the activation tree corresponding to the output of quicksort.
Activation Record ( AR )
An activation record is a block of memory used to manage the information needed for a
single execution of a procedure.
The following is the activation record ( Read from bottom to top )
Actual parameters
Returned values
Control link
Access link
Saved machine status
Local data
Temporaries
Temporaries – hold temporary values, such as the results of intermediate computations.
Local data – data that belongs to the procedure in which control is currently located.
Saved machine status – information about the state of the machine just before the
procedure was called.
Access link – used to locate data held in other activation records (non-local data). This
field is optional.
Control link – points to the activation record of the caller, i.e. the procedure that called
it. This field is optional and is also known as the dynamic link.
Returned value – holds any value returned by the called procedure. The value may
instead be placed in a register, depending on the requirements.
Actual parameters – the parameters supplied by the calling procedure.
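The layout can be pictured, purely as a sketch, as a C structure; the field sizes below are placeholders, since the real record is laid out by the compiler rather than declared in the source program.

struct ActivationRecord {
    char  actual_params[32];                 /* parameters supplied by the caller   */
    char  returned_value[8];                 /* space for the value returned        */
    struct ActivationRecord *control_link;   /* caller's activation record          */
    struct ActivationRecord *access_link;    /* record used to reach non-local data */
    char  saved_machine_status[64];          /* saved registers, return address     */
    char  local_data[64];                    /* locals of this activation           */
    char  temporaries[64];                   /* compiler-generated temporaries      */
};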
2 Q: Access to non-local data on the stack
Non-local data is data that is used within a procedure but declared outside it. There are
various situations of access to non-local data. These include:
Data access without nested procedures
For non-nested procedures, variables are accessed as follows:
Global variables are allocated statically, since their locations are known and
fixed at compile time.
Local variables of the currently executing procedure reside at the top of the stack and are accessed through the stack-top pointer.
A language with nested procedure declarations – ML
Properties of ML include:
The variables declared in ML are unchangeable once they are initialized. This means
that ML is a functional language.
Variable declaration is done using the statement:
val <name> = <exp>
The syntax for defining functions is,
fun <name> ( <arg> ) = <body>
Function bodies are defined as,
let <definitions> in <statements> end
Nesting depth
The nesting depth of a procedure is the level at which it is declared: procedures declared
at the outermost level have nesting depth 1, and a procedure declared directly inside a
procedure of depth i has nesting depth i + 1.
Access link
It is used to locate any variable or procedure in the case of nested procedures.
Displays
Displays give efficient and easy access to non-local data by avoiding long chains of
access links when the nesting depth is large.
A display is an auxiliary array d with one entry per nesting depth: d[ i ] points to the
highest activation record on the stack for a procedure at nesting depth i.
3 Q: Heap Management
Heap is the unused memory space available for allocation dynamically.
It is used for data that lives indefinitely, and is freed only explicitly.
The existence of such data is independent of the procedure that created it.
The memory manager
Memory manager is used to keep account of the free space available in the heap area.
Its functions include,
allocation
deallocation
The properties of an efficient memory manager include:
space efficiency
program efficiency
low overhead
The memory hierarchy of a computer
Typical sizes        Level                      Typical access times
2 GB                 Virtual memory (disk)      3 - 15 ms
256 MB - 2 GB        Physical memory (RAM)      100 - 150 ns
128 KB - 4 MB        2nd-level cache            40 - 60 ns
16 - 64 KB           1st-level cache            5 - 10 ns
32 words             Registers                  1 ns
Locality in programs
Locality describes the tendency of a program to concentrate its accesses on a small part
of its data and code at any one time. Locality is of two types, namely,
Temporal locality – a memory location that has just been accessed is likely to be
accessed again soon.
Spatial locality – memory locations close to a location that has just been accessed are
likely to be accessed within a short period of time.
4 Q: Issues in the design of code generation
Instruction selection
The job of the instruction selector is to choose, for each operator in the low-level
intermediate representation, target instructions that implement it well overall.
Issues here are:
level of the IR
nature of the instruction set architecture
desired quality of the generated code
Target Machine:
n general purpose registers
instructions: load, store, compute, jump, conditional jump
various addressing modes:
indexed addressing, an integer indexed by a register, indirect addressing and
immediate constants.
For example, X = Y + Z we can generate,
LD R0, Y
ADD R0, R0, Z
ST X, R0
Register allocation
Instructions involving register operands are usually shorter and faster than those
involving operands in memory. Therefore, efficient utilization of registers is
particularly important in generating good code.
The most frequently used variables should be kept in processor registers for faster access.
The use of registers is often subdivided into two sub-problems:
During "register allocation", we select the set of variables that will reside in
registers at a point in the program.
During "register assignment", we pick the specific register that a variable will
reside in.
For example, consider t1 = a * b
t2 = t1 + c
t3 = t2 / d
An optimal machine-code sequence, keeping the value in a single register, is:
L R1, a
M R1, b
A R1, c
D R1, d
ST R1, t3
5 Q: The Target Language
The output of Code Generation (CG) is the target language.
The output may take different forms like absolute machine language, relocatable
machine language or assembly language.
Simple target machine model
The possible kinds of operations are listed below:
1. Load operation ( LD )
General format is LD dst, addr.
LD loads the value in location ‘addr’ into location ‘dst’. Here dst is destination.
For example, LD R1, x loads the value in location x into register R1.
2. Store operation ( ST )
General format is ST dst, src.
For example, ST x, R1 stores the value in register R1 into the location x.
3. Computation operation
The general form is OP dst, src1, src2 where OP is an operator like ADD or SUB.
For example, ADD R1, R2, R3 adds R2 and R3 values and stores into R1.
4. Unconditional jump ( BR )
The general form is BR L where BR is branch. This causes control to branch to the
machine instruction with label L.
5. Conditional jump
The general form is Bcond R, L where R is a register, L is label and cond stands for
any of the common tests on values in register R.
For example, BGTZ R2, L This instruction causes a jump to label L if the value in
register R2 is greater than zero and allows control to pass to the next machine
instruction if not.
6 Q: Addresses in the target code
This section explains two storage-allocation strategies, namely static allocation and
stack allocation.
Static allocation strategy
In static allocation, names are bound to storage as the program is compiled. Since the
bindings do not change at run time, a name is bound to the same storage location every
time its procedure is activated.
The compiler must decide where the activation record is to go, relative to the target
code. Once this decision is made, the position of each activation record and the storage
for each name in the record is fixed.
The compiler can therefore place in the target code the addresses at which the data of each activation record will be found.
Limitations:
The size of data objects must be known at compile time.
Recursive procedures are not allowed because all the activations of a procedure use
the same bindings for local names.
Data structures cannot be created dynamically since there is no mechanism for
storage allocation at runtime.
Stack allocation strategy
In stack allocation, storage is organized as stack and activation records are pushed
and popped as the activation begins and ends respectively.
Storage for locals in each call of a procedure is contained in the activation record.
The values of locals are deleted when activation ends.
Consider a stack in which the sizes of all activation records are known at compile time.
A register "top" points to the top of the stack and is updated at run time: an activation
record is allocated by incrementing top by the size of the record, and deallocated by
decrementing top by the same amount.
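A minimal sketch of this scheme, with the run-time stack simulated as a byte array and every record size assumed known at compile time (all names are illustrative):

#define STACK_SIZE 4096

static char stack_area[STACK_SIZE];
static int  top = 0;                 /* the register "top": next free position */

/* Procedure entry: allocate the activation record by incrementing top. */
char *push_record(int record_size)
{
    char *record = &stack_area[top];
    top += record_size;
    return record;
}

/* Procedure exit: deallocate the record by decrementing top. */
void pop_record(int record_size)
{
    top -= record_size;
}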
7. Basic blocks and Flow graph
Basic blocks
A basic block is a sequence of consecutive statements in which the flow of control enters
at the beginning and leaves at the end, without a halt or the possibility of branching
except at the end.
Basic blocks are constructed by partitioning a sequence of three-address instructions.
Algorithm for partitioning into basic blocks:
INPUT:- sequence of three address instructions
OUTPUT:- list of basic blocks
METHOD:
Step1
The first step is to determine the set of leaders. The rules to obtain the leaders are
1. The first statement is a leader.
2. Any statement which is the target of conditional or unconditional GOTO is a
leader.
3. Any statement which immediately follows the conditional GOTO or unconditional
GOTO is a leader.
Step2
For each leader, construct the basic blocks which consists of the leader and all the
instructions up to but not including the next leader or the end of the intermediate
program.
For example, let us construct the basic blocks for the following:
1. i = 0
2. if ( i > 10 ) goto 6
3. a[ i ] = 0
4. i = i + 1
5. goto 2
6. end
Let us apply step 1 and step 2 of the algorithm to identify the basic blocks.
Step1
1. i = 0
2. if ( i > 10 ) goto 6
3. a[ i ] = 0
4. i = i + 1
5. goto 2
6. end
Based on rule 1 of step 1, statement 1 is a leader.
Based on rule 2 of step 1, statements 2 and 6 are leaders (they are the targets of the gotos
in statements 5 and 2).
Based on rule 3 of step 1, statements 3 and 6 are leaders (they immediately follow gotos).
Hence the leaders are statements 1, 2, 3 and 6.
Step2
For each leader, construct the basic blocks.
B1:  1. i = 0
B2:  2. if ( i > 10 ) goto 6
B3:  3. a[ i ] = 0
     4. i = i + 1
     5. goto 2
B4:  6. end
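A minimal C sketch of the leader rules over an array of three-address instructions is given below; it assumes each instruction records whether it is a (conditional or unconditional) jump and, if so, the index it jumps to. The representation is illustrative only.

struct Instr {
    int is_jump;      /* 1 if the instruction is a conditional or unconditional goto */
    int jump_target;  /* index of the instruction jumped to, when is_jump is 1       */
};

/* Mark the leaders according to the three rules of step 1. */
void find_leaders(const struct Instr *code, int n, int *is_leader)
{
    for (int i = 0; i < n; i++) is_leader[i] = 0;
    if (n > 0) is_leader[0] = 1;                  /* rule 1: the first statement    */
    for (int i = 0; i < n; i++) {
        if (code[i].is_jump) {
            is_leader[code[i].jump_target] = 1;   /* rule 2: the target of a goto   */
            if (i + 1 < n) is_leader[i + 1] = 1;  /* rule 3: statement after a goto */
        }
    }
    /* Each basic block then runs from a leader up to, but not including,
       the next leader or the end of the program (step 2).               */
}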
Flow graph
Flow graphs are used to represent the basic blocks and their relationship by a directed
graph.
There exists an edge from block B1 to block B2 if and only if it is possible for the first
instruction of B2 to be executed immediately after the last instruction of B1.
For example, let us draw the flow graph for the basic blocks constructed above.
B1:  1. i = 0
B2:  2. if ( i > 10 ) goto B4
B3:  3. a[ i ] = 0
     4. i = i + 1
     5. goto B2
B4:  6. end
The flow graph contains the edges B1 → B2, B2 → B3 (taken when the condition is false),
B2 → B4 (taken when the condition is true) and B3 → B2 (the jump back to the test).
8 Q Code Generation algorithm
Machine instruction for operations
For X = Y + Z do the following:
Use getReg ( X = Y + Z ) to select registers for X, Y and Z. Let the registers be Rx,
Ry and Rz.
If Y is not already in Ry, then issue an instruction LD Ry, Y. Here Y is one of the
memory locations holding the value of Y.
Similarly if Z is not in Rz, issue an instruction LD Rz, Z. Here Z is a memory
location of Z.
Issue the instruction ADD Rx, Ry, Rz
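These steps can be sketched in C as follows, assuming getReg, a register-descriptor query reg_holds and an emit routine exist elsewhere; their names and signatures here are illustrative only.

/* Helpers assumed to exist elsewhere (illustrative signatures): */
extern void getReg(const char *x, const char *y, const char *z,
                   int *Rx, int *Ry, int *Rz);
extern int  reg_holds(int R, const char *name);  /* does register R hold name?    */
extern void emit(const char *fmt, ...);          /* append one target instruction */

/* Generate target code for the three-address instruction x = y + z. */
void gen_add(const char *x, const char *y, const char *z)
{
    int Rx, Ry, Rz;
    getReg(x, y, z, &Rx, &Ry, &Rz);          /* choose registers for x, y and z */

    if (!reg_holds(Ry, y))
        emit("LD R%d, %s", Ry, y);           /* load y only if it is not in Ry  */
    if (!reg_holds(Rz, z))
        emit("LD R%d, %s", Rz, z);           /* load z only if it is not in Rz  */

    emit("ADD R%d, R%d, R%d", Rx, Ry, Rz);
    /* The register and address descriptors would now be updated to record
       that Rx holds the value of x.                                        */
}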
Machine instructions for copy statements
A three address copy statement can be of the form x = y. This is a special case where
we assume that ‘getReg’ will always choose the same register for both x and y.
Ending the basic block
When the block ends, we do not need to remember the value of a variable that is a
compiler-generated temporary used only within the block; its register can simply be
assumed empty.
For every other variable x whose current value exists only in some register R at the end
of the block, generate the instruction ST x, R so that the value is saved in memory.
Managing the Register and Address descriptors
The code generator keeps track of registers and variables with two descriptors:
1. Register descriptor – for each available register, records which variable names
currently have their value in that register. Initially every register descriptor is empty.
2. Address descriptor – for each program variable, records the location or locations
(a register, a memory address, or both) where the current value of that variable can be
found.
getReg consults these descriptors, and they are updated after every load, store,
operation and copy instruction that is generated.
Unit-6 Compiler Design TEC
Machine Independent code optimization – Principal sources of optimization, Peephole
optimization, Introduction to data flow analysis.
Principal sources of optimization
1 Q: Explain the principal sources of optimization techniques (OR) function-preserving
transformations.
The following are the principal sources of optimization techniques, also called
function-preserving transformations:
a) Elimination of common sub expression
b) Copy propagation
c) Dead code elimination
d) Constant folding
e) Loop optimizations
Elimination of common sub expression
Common sub expressions can be either eliminated locally or globally.
Local common sub expressions can be identified in a basic block.
Hence first step to eliminate local common sub expressions is to construct a basic
block.
Then the common sub expressions in a basic block can be deleted by constructing a
directed acyclic graph (DAG).
For example, x = ( a + b ) * ( a + b ) + c + d
The following is the basic block.
t1 = a + b
t2 = a + b
t3 = t1 * t2
t4 = t3 + c
t5 = t4 + d
x = t5
The local common sub-expression in the above basic block is a + b, which is computed twice:
t1 = a + b
t2 = a + b
Hence the repeated computation can be eliminated. The block obtained after eliminating
the local common sub-expression is shown below.
t1 = a + b
t2 = t1 * t1
t3 = t2 + c
t4 = t3 + d
x = t4
Copy or Variable propagation
Consider an assignment statement of the form x = y. The statement x = y is called a
copy statement.
To explain how copies help optimization, take the following example.
a) Before:
B1:  b = a + d        B2:  b = a + d
     c = b                 h = b
            B3:  j = a + d
b) After:
B1:  b = a + d        B2:  b = a + d
     c = b                 h = b
            B3:  j = b
On every path reaching B3, the value of a + d is already held in b, so the recomputation
j = a + d in B3 can be replaced by the copy j = b.
Dead code elimination
A piece of code that can never be reached, or whose result is never used anywhere in the
program, is said to be dead code; it can be removed from the program safely.
Copy statements often lead to dead code. For example,
x=b+c
z=x
…
d=x+y
If the variable z is not used anywhere else in the program, then z = x becomes dead code.
Hence, it can be optimized as,
x=b+c
…
d=x+y
Constant folding
In constant folding, a computation whose operands are constants is performed at compile
time instead of at execution time, and the computed constant value is then used.
For example, k = ( 22 / 7 ) * r * r
Here folding is done by performing the computation of ( 22 / 7 ) at compile time. So it
can be optimized as,
k = 3.14 * r * r
Loop optimizations
The major source of code optimization is loops, especially the inner loops.
Most of a program's running time is spent inside loops, so reducing the number of
instructions executed in an inner loop gives the largest improvement.
The following are the loop optimization techniques.
1. Code motion.
2. Elimination of induction variables.
3. Strength reduction.
1. Code motion:-
Code motion is a technique used to move code outside the loop.
If an expression inside the loop produces the same result on every iteration of the loop
(a loop-invariant computation), it should be evaluated once, just before the loop.
For example, while ( x != max - 3 )
{
x = x + 2;
}
Here the expression max - 3 is a loop-invariant computation, so the loop can be
optimized as follows:
k = max - 3;
while ( x != k )
{
x = x + 2;
}
2. Elimination of induction variables:-
An induction variable is a loop-control variable, or any other variable whose value
changes in a fixed relationship with it.
Equivalently, it is a variable that is incremented or decremented by a fixed amount each
time the loop is executed.
For example, for( i = 0, j = 0, k =0; i < n; i++ )
{
printf(“%d”, i );
}
Three induction variables i, j and k are initialized in the for loop, but only i is actually
used; j and k are never used. The code obtained after eliminating the unused induction
variables is given below.
for( i = 0; i < n; i++ )
{
printf(“%d”, i );
}
3. Strength reduction:-
It is the process of replacing expensive operations by the equivalent cheaper
operations on the target machine.
For example, for( k = 1; k < 5; k++)
{
x = 4 * k;
}
On many machines, a multiplication takes more time than an addition or subtraction.
Hence the speed of the object code can be increased by replacing the multiplication with
an addition, initializing x before the loop so that it still takes the values 4, 8, 12, 16:
x = 0;
for( k = 1; k < 5; k++)
{
x = x + 4;
}
2 Q: Explain briefly about peephole optimization techniques.
Peephole optimization is a simple but effective method for locally improving the target
code: a short sequence of target instructions, known as the peephole, is examined and
replaced by a shorter and/or faster sequence of instructions whenever possible.
The following are peephole optimization techniques:
1. Elimination of redundant instructions.
2. Optimization of flow of control or elimination of unreachable code.
3. Algebraic simplifications.
4. Strength reduction.
1. Elimination of redundant instructions:-
This includes elimination of redundant load and store instructions.
For example, MOV R1, A
MOV A, R1
Here the first instruction loads the value of A into register R1 and the second stores the
value of R1 back into A.
These two instructions form a redundant pair: instruction (2) can be eliminated because,
whenever (2) is executed immediately after (1), register R1 already contains the value of
A, so storing it back into A has no effect (provided instruction (2) does not carry a label).
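A minimal sketch of this check over a list of target instructions, written with the LD/ST notation of Unit-5 and an illustrative instruction record:

#include <string.h>

struct TargetInstr {
    char op[4];      /* "LD", "ST", ...            */
    char dst[8];     /* destination operand        */
    char src[8];     /* source operand             */
    int  has_label;  /* 1 if a jump can enter here */
};

/* Drop a store that immediately follows a load of the same value into the
   same register, unless the store is a jump target. Returns the new count. */
int remove_redundant_pairs(struct TargetInstr *code, int n)
{
    int out = 0;
    for (int i = 0; i < n; i++) {
        if (i + 1 < n &&
            strcmp(code[i].op, "LD") == 0 && strcmp(code[i + 1].op, "ST") == 0 &&
            strcmp(code[i].dst, code[i + 1].src) == 0 &&   /* same register    */
            strcmp(code[i].src, code[i + 1].dst) == 0 &&   /* same memory name */
            !code[i + 1].has_label) {
            code[out++] = code[i];   /* keep the load, drop the redundant store */
            i++;                     /* skip the store                          */
            continue;
        }
        code[out++] = code[i];
    }
    return out;
}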
2. Optimization of flow of control or elimination of unreachable code:-
An unlabeled instruction that immediately follows an unconditional jump can never be
reached and can be removed; jumps over jumps can likewise be simplified.
For example,
    i = j
    if k = 2 goto L1
    goto L2
L1: k is good
L2:
Here the conditional jump to L1 merely skips over the statement goto L2. By negating the
condition, the goto L2 becomes unreachable and can be eliminated, giving
    i = j
    if k ≠ 2 goto L2
    k is good
L2:
3. Algebraic simplifications:-
Statements that match frequently occurring algebraic identities can be simplified.
For example, X = X * 1 or X = 0 + X is often produced by straightforward
intermediate-code generation algorithms. Such statements can easily be eliminated
through peephole optimization.
4. Strength reduction:-
Replace expensive statements by equivalent cheaper ones.
For example, X^2 (X squared) is an expensive operation; it can be replaced by X * X,
which is cheaper.
3 Q: Explain about Data flow analysis
Data flow analysis:
– Flow-sensitive: sensitive to the control flow in a function
– Intra procedural analysis
Examples of optimizations:
– Constant propagation
– Common sub expression elimination
– Dead code elimination
Data flow analysis abstraction:
– For each point in the program, the analysis combines the information from all run-time
instances of that program point.
Reaching Definitions
Every assignment is a definition
A definition d reaches a point p if there exists a path from the point immediately
following d to p such that d is not killed (overwritten) along that path.
Data Flow Analysis Schema
Build a flow graph (nodes = basic blocks, edges = control flow)
Set up a set of equations between in[b] and out[b] for all basic blocks b
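For reaching definitions, with gen[b] the set of definitions generated in block b and kill[b] the set of definitions it overwrites, the equations take the usual form:
in[b]  = union of out[p] over all predecessors p of b
out[b] = gen[b] ∪ ( in[b] - kill[b] )
The equations are solved iteratively, starting from out[b] = gen[b] for every block and recomputing until no set changes.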
Live Variable Analysis
Definition
– A variable v is live at point p if the value of v is used along some path in the flow graph
starting at p.
– Otherwise, the variable is dead.
For each basic block, determine which variables are live on entry to and on exit from the block.
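Liveness is computed backwards. With use[b] the set of variables used in b before any redefinition and def[b] the set of variables definitely assigned in b, the equations are:
out[b] = union of in[s] over all successors s of b
in[b]  = use[b] ∪ ( out[b] - def[b] )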