Compiler Data-flow Optimization Analysis

Aditya Thakur

July 5, 2008
Contents

1 Introduction
2 Preliminaries
  2.1 Control-flow graphs
  2.2 Data-flow Analysis
    2.2.1 Constant Propagation
    2.2.2 Liveness Analysis
    2.2.3 Anticipatability Analysis
  2.3 Destructive Merge
    2.3.1 Forward Analysis
    2.3.2 Backward Analysis
  2.4 Automata Theory
  2.5 Overall Algorithm
3 The Transformation
  3.1 For Forward Analysis
    3.1.1 The Naive Solution
    3.1.2 Computing The Region Of Influence
    3.1.3 The CFG Restructuring
  3.2 Back to Backward Analysis
    3.2.1 Computing The Region Of Influence
    3.2.2 The CFG Restructuring
4 Tradeoff
  4.1 Theoretical Analysis
  4.2 Heuristic Solution
5 Related Work
  5.1 Hot Path Graph Approach
  5.2 Other related approaches
6 Experimental Results
  6.1 Forward Analysis
    6.1.1 Benefits of Split Approach
    6.1.2 Cost of Split Approach
  6.2 Backward Analysis
    6.2.1 Benefits of Split Approach
    6.2.2 Cost of Split Approach
7 Conclusions and Future Directions
  7.1 Future Directions
    7.1.1 Interprocedural Analysis
    7.1.2 Points-to Analysis
    7.1.3 Demand-driven Analysis
    7.1.4 Combined Restructuring for Multiple Analyses
  7.2 Conclusions
Abstract

Data-flow analysis is an integral part of any aggressive optimizing
compiler. We propose a framework for improving the precision of data-flow
analysis in the presence of complex control-flow. We initially perform
data-flow analysis to determine those control-flow merges which cause the
loss in data-flow analysis precision. The control-flow graph of the program
is then restructured such that performing data-flow analysis on the
resulting restructured graph gives more precise results. The proposed
framework is both simple, involving the familiar notion of product automata,
and general, since it is applicable to any forward or backward data-flow
analysis. Apart from proving that our restructuring process is correct, we
also show that the restructuring is effective in that it necessarily leads
to more optimization opportunities. Furthermore, the framework handles the
trade-off between the increase in data-flow precision and the code size
increase inherent in the restructuring. We show that determining an optimal
restructuring is NP-hard, and propose and evaluate a greedy heuristic.

The framework has been implemented in the Scale research compiler, and
instantiated for the specific problems of Constant Propagation and Liveness
analysis. On the SPECINT 2000 benchmark suite we observe an average speedup
of 4% in running times over the Wegman-Zadeck conditional constant
propagation algorithm and 2% over a purely path-profile-guided approach for
Constant Propagation. For the problem of Liveness analysis, we see an
average speedup of 0.8% in running times over the baseline implementation.
Acknowledgements

First of all, I would like to thank Prof. Govindarajan for his guidance and
infinite patience. I would also like to thank Prof. Uday Khedker for his
timely discussions.

I would also like to thank, in order of appearance, Mangesh, Kaushik,
Govind, GVSK, Rajini, Mani, Subhajit, Rajesh, Rajan, Kapil, Nandu, Sujit,
Ganesh, Girish, Santosh, Mrugesh, Sripathi, Abhishek, ...

Finally, I would like to thank my family for their support and
encouragement.
Chapter 1
Introduction
Compiler advances double computing power every 18 years.
    Proebsting's Law
    Todd Proebsting

- Hey! You! Why do you keep scratching yourself, eh?
- Because no one else knows where it itches.
    The Benny Hill Show
    Benny Hill
It is becoming increasingly difficult to get performance improvements using
compiler optimizations, as epitomized by Proebsting's Law [18]. But
developers still want that extra 5-15% improvement in the running times of
their applications, and compiler optimizations are a safer alternative to
manual optimizations carried out by developers, which might introduce
errors [19].

Data-flow analysis [1] is an integral part of any aggressive optimizing
compiler. Information gathered using data-flow analysis is used in code
optimizations such as constant propagation, dead-code elimination, and
common sub-expression elimination, to name a few. Data-flow analysis uses a
finite abstraction of the program states and involves computing a
fixed-point solution for a set of equations obtained from the control-flow
graph of the program. The results of this analysis are used to guide
compiler optimizations. Thus, the analysis should be safe in order for the
resulting optimizations to be safe.

Imprecision in data-flow analysis leads to a reduction in optimization
opportunities. The loss of data-flow precision occurs due to the
approximation or merging of differing data-flow facts along incoming edges
of a control-flow merge. In the traditional formulation of data-flow
analysis, the control-flow graph of the program does not change.
In this thesis, we present a new framework to overcome this imprecision. We
initially perform data-flow analysis to determine those control-flow merges
that cause the loss in data-flow analysis precision, which we call
Destructive Merges. The control-flow graph (CFG) of the program is then
restructured to remove the effects of such destructive merges. Performing
data-flow analysis on the resulting restructured graph gives more precise
results, and effectively eliminates the destructive merges. This leads to
more optimization opportunities in the resulting CFG. Thus, we use the
results of an initial data-flow analysis to restructure the CFG and improve
the precision of the data-flow analysis on the resulting CFG (hence the
Benny Hill quote!).
The framework presented in this thesis is simple and clean, and uses the
familiar notion of product automata [13] to carry out the restructuring
transformation. Further, the framework is general since it can be
instantiated to any forward or backward data-flow analysis. This clean
formulation allows us to show that the restructuring is correct, in the
sense that the original and restructured CFGs are equivalent. Also, we prove
that the restructuring is profitable in that it necessarily leads to more
optimization opportunities in the restructured CFG.

The framework we develop is applicable not only to forward data-flow
analyses such as constant propagation and points-to analysis, but also to
backward data-flow analyses such as liveness and anticipatability. We
believe that a compiler writer would benefit from an approach that rests on
sound theoretical guarantees and is clean and simple to implement.
The restructuring inherently entails an increase in code size due to code
duplication, i.e., the increase in precision comes at the cost of an
increase in code size. Furthermore, we show that determining an optimal
restructuring is NP-hard and, hence, propose a greedy heuristic for
restructuring. Our framework explicitly handles the trade-off between
precision and code size increase, and also makes use of low-cost basic-block
profile information.
We have implemented the proposed framework in the Scale research compiler
[21] and instantiated it for the specific problems of Constant Propagation
and Liveness analysis [1]. We compare our technique with the Wegman-Zadeck
conditional constant propagation algorithm [25] (Base) and with a purely
path profile-guided restructuring technique [2] (HPG). On the SPECINT 2000
benchmark suite [22], our technique exposes, on average, 3.5 times more
dynamic constants as compared to the HPG technique. Further, we observe an
average speedup of 4% in running times over Base and 2% over the HPG
technique. On the other hand, for Liveness analysis we find that our
approach gives an average speedup of 0.8% with an average code size increase
of 15%.
Contributions

These are the main contributions of this thesis.

- The framework can be instantiated to any forward or backward data-flow
  analysis.

- The restructuring algorithm seamlessly handles complex control flow in a
  program, such as nested loops, switch cases, and so on.

- We provide provable guarantees about the correctness of the algorithm.
  Furthermore, we guarantee an increase in optimization opportunities in
  the restructured program.

- We prove that obtaining the optimal restructured program is NP-hard and
  propose heuristics to solve the problem efficiently.

- The framework has been implemented in the Scale compiler and is found to
  be better than a purely profile-driven approach.
Thesis Outline

The rest of the thesis is organised as follows:

Chapter 2 (Preliminaries) In this chapter we introduce the notation used
    in the rest of the thesis. Specifically, we define our notion of
    destructive merges. We also give an algorithm which provides an
    overview of our approach.

Chapter 3 (The Transformation) In this chapter, we describe our
    restructuring transformation for both forward and backward analysis.

Chapter 4 (Trade-off) Since our approach inherently entails an increase in
    code size, we have to deal with the trade-off between the increase in
    precision and the increase in code size. In this chapter, we prove
    that obtaining the optimal restructured graph is NP-hard and explain a
    greedy algorithm to tackle this problem.

Chapter 5 (Related Work) This chapter provides an overview of the related
    literature pertaining to improving data-flow analysis precision.
    Specifically, it discusses the Hot Path Graph approach.

Chapter 6 (Experimental Results) Having explained the theoretical
    framework, we show our empirical results for the specific problems of
    Constant Propagation and Liveness analysis.

Chapter 7 (Conclusions and Future Directions) We suggest further
    directions in which this approach can be extended and conclude the
    thesis.
Chapter 2
Preliminaries
I can go with the flow,
But don't say it doesn't matter, matter anymore.
I can go with the flow,
Do you believe it in your head?
    Go With The Flow
    Queens Of The Stone Age
In this chapter, we introduce the notation used in the rest of the thesis.
We start with basic definitions of control-flow graphs, followed by
definitions of both forward and backward data-flow analysis. Using these
notions we define destructive merges, a concept which plays a central part
in our approach. Finally, we discuss concepts from automata theory which are
used to carry out the restructuring.
2.1 Control-flow graphs

Definition 1. (Simple Graph) A simple graph is a graph that has at most one
edge between any two vertices and no edge that starts and ends at the same
vertex.

Definition 2. (Control Flow Graph) A control-flow graph (CFG)
G = (N, E, start, end) is a simple directed graph (N, E) where

- the nodes N represent program statements of a function,

- the edges E represent possible control flow between program statements,

- the start node has no incoming edges and all nodes are reachable from the
  start node,

- the end node has no outgoing edges and the end node is reachable from all
  other nodes.

For simplicity of exposition we consider the nodes to be individual program
statements, but they could also be basic blocks.

Let pred(n) denote the set of control-flow predecessors of a node n in the
CFG, and succ(n) denote the set of control-flow successors of a node n in
the CFG. By definition, pred(start) = ∅ and succ(end) = ∅.
Definition 3. (Control Flow Paths) A path in a control-flow graph is a
sequence of nodes or edges. As is usually assumed in data-flow analysis, all
possible paths in the CFG are considered to be feasible.

Definition 4. (Reachable Nodes) Given a node n, the set of reachable nodes,
i.e. reachable_nodes(n), is the set of nodes reachable from node n along
some path in the CFG.

Definition 5. (Generalized Post-dominator Set) A set of edges W
post-dominates vertex v, i.e. postdom(v) = W, iff the following two
conditions are met:

1. all paths from the vertex v to the end node contain some edge w ∈ W; and

2. for each edge w ∈ W, there is at least one path from vertex v to the end
   node which contains w and does not contain any other edge in W.

Example. Figure 2.1 shows the control-flow graph G.

pred(start) = ∅, succ(end) = ∅,
pred(D) = {B, C}, succ(D) = {E, F},
postdom(A) = {(A, B), (A, C)}, postdom(D) = {(D, E), (D, F)}. □
Definition 6. (Reverse Graph) Given a simple directed graph G, the reverse
graph is the graph in which the directions of all edges of G have been
reversed.

Definition 7. (Backward Reachable Nodes) Given a node n in CFG G, the set of
backward reachable nodes, i.e. backward_reachable_nodes(n), is the set of
nodes reachable from node n along some path in the reverse graph of CFG G.

Figure 2.1: Control Flow Graph G.

Figure 2.2: The Reverse Graph R corresponding to the CFG G in Figure 2.1.

Example. Figure 2.2 shows the reverse graph R of the CFG in Figure 2.1.
In graph R, reachable_nodes(D) = {start, A, B, C, D}.
Hence, in graph G, backward_reachable_nodes(D) = {start, A, B, C, D}. □
The program statements in a CFG usually consist of conditional statements,
such as if statements, and assignment statements. We will also make use of
assume statements in our control-flow graphs. An assume statement takes a
predicate, say p, as an argument, and adds an assumption about that
predicate expression. For example, assume(y < 3) would restrict y to values
less than 3 in all subsequent computation. Control can never go past the
assume statement if the predicate evaluates to false (F). If the predicate p
evaluates to true (T), then the program continues. But in case the predicate
cannot be evaluated, then we effectively assume that the predicate is true
and continue. This might happen if the variables involved in the assume have
not been assigned to. Thus, assume statements control the flow of the
program but do not modify the program state. These assume statements are
commonly seen in work related to verification techniques, but we will find
them useful when discussing backward analysis.

Such assume statements can replace if statements, viz.

    if (e) then A; else B;

can be written as

    if (*) then { assume(e); A; } else { assume(!e); B; }

Though at first this might seem nondeterministic, it is easy to still
maintain deterministic execution, since the assume statements here are along
the immediate successors of the if statement.

The predicate of the assume statement seen along an edge is called the edge
predicate for the edge.
Example. Figure 2.3 shows the CFG G of Figure 2.1 in which the if statement
if (z > 1) at node A is replaced by the corresponding assume statements
assume(z > 1 = T) and assume(z > 1 = F) along the outgoing edges (A, B) and
(A, C) respectively.

edge_predicate((A, B)) = (z > 1 = T),
edge_predicate((A, C)) = (z > 1 = F). □
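To make the rewrite concrete, the following is a minimal Python sketch of
the if-to-assume conversion on a toy CFG representation. The Node class and
its fields are illustrative assumptions of this sketch, not the
representation used in this thesis or in Scale.

    # Sketch of the if -> assume rewrite on a toy CFG node. The Node
    # class and its fields are hypothetical, purely for illustration.
    class Node:
        def __init__(self, label, stmts=None):
            self.label = label
            self.stmts = stmts if stmts is not None else []
            self.succs = {}  # maps an edge predicate to a successor Node

    def replace_if_with_assume(fork, pred, then_node, else_node):
        """Turn `if (pred) then T else E` at `fork` into a nondeterministic
        fork whose successors begin with assume(pred) / assume(!pred)."""
        fork.stmts = []  # the fork no longer evaluates the predicate itself
        then_node.stmts.insert(0, "assume(%s = T)" % pred)
        else_node.stmts.insert(0, "assume(%s = F)" % pred)
        fork.succs = {"%s = T" % pred: then_node,
                      "%s = F" % pred: else_node}

    # Mirroring Figure 2.3: node A tests (z > 1), with successors B and C.
    A, B, C = Node("A"), Node("B"), Node("C")
    replace_if_with_assume(A, "z > 1", B, C)
    # edge_predicate((A, B)) is now (z > 1 = T);
    # edge_predicate((A, C)) is (z > 1 = F).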
Figure 2.3: The CFG G of Figure 2.1 in which the if statements at nodes A
and D are replaced by corresponding assume statements along their outgoing
edges.
2.2 Data-flow Analysis

Data-flow analysis [1] is a static analysis technique used to glean
information about the program as a whole and its possible behaviors. It
forms the core of any aggressive compiler optimization framework. It is also
used extensively in program understanding and verification tools.

Any data-flow analysis should be safe, i.e. it should be conservative and
over-approximate the runtime behaviour of the program. This ensures that the
code transformations and optimizations carried out using the analysis
results do not change the semantic meaning of the program. A typical
over-approximation which is carried out is that all control-flow paths in
the CFG are assumed to be possible, even if some are infeasible. An analysis
which does not consider a control-flow path which could be taken at runtime
is deemed unsafe.

The Control Flow Graph (CFG) is the principal abstraction of the program
control-flow over which data-flow analysis operates. In terms of the actual
information, data-flow analysis operates over a finite abstraction of the
possible program states, which is usually represented as a complete
semilattice. We assume that the data-flow information is associated with the
in and out of a node, where for a given node n, in is the program point
immediately preceding the node n, and out is the program point immediately
following the node n. Data-flow information is collected by setting up a
system of equations which relate information at various points in a program
and finding the fixed point of these equations. For example, the data-flow
information at the in of a node is related to that at the out of its
predecessor node.

We begin with some common concepts of data-flow analysis adopted from
[1, 2].
Definition 8. (Data-flow Analysis Framework) A data-flow analysis framework
D is a tuple (L, ⊓, F, ⊤, T) where:

- L is a complete semilattice with the meet operation ⊓,

- F is a set of monotonic transfer functions from L to L,

- ⊤ is the top element in L, satisfying the law ⊤ ⊓ x = x for all x in L,

- T is the direction of analysis, either forward or backward.
Definition 9. (Data-flow Problem) A data-flow problem Π is a tuple
(G, D, l_s, M) where:

- G = (N, E, start, end) is a control-flow graph (CFG),

- D is a data-flow analysis framework,

- l_s ∈ L is the data-flow fact associated with start,

- M : N → F maps the nodes of G to the functions in F of D.

M can be extended to paths by composition, i.e.

    M([n_1, n_2, ..., n_k]) ≝ M(n_k) ∘ M(n_{k-1}) ∘ ... ∘ M(n_2) ∘ M(n_1)
The following definitions assume that the direction of the analysis is
forward; they can be naturally extended to backward analysis.

Definition 10. (Data-flow Solution) A solution o of Π is a map o : N → L
such that, for any path p from start to node u, o(u) ⊑ (M(p))(l_s).

The solution associates data-flow facts with the in of a node n, which we
represent as in_dff(n). The corresponding data-flow facts at the out of a
node n, i.e. out_dff(n), can be determined by applying the node's transfer
function to in_dff(n). Thus,

    out_dff(n) ≝ M(n)(in_dff(n)).
Definition 11. (Fixed Point Solution) A fixed point o' of Π is a map
o' : N → L such that o'(start) ⊑ l_s and, if e is an edge from node u to
node v, o'(v) ⊑ (M(u))(o'(u)).

Definition 12. (Maximum Fixed Point Solution) A Maximum Fixed Point (MFP)
solution o of Π is a fixed point of Π such that, for any fixed point o',
o'(u) ⊑ o(u) for all u ∈ N.
The MFP solution can be computed by general iterative algorithms [1, 15].
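To make the iterative computation concrete, here is a small sketch of a
worklist algorithm for a forward problem. The meet/transfer interface and
all names below are assumptions made for this sketch; this is not the Scale
implementation.

    # Sketch of an iterative worklist computation of the MFP solution for
    # a forward data-flow problem. `transfer(n, l)` plays the role of
    # M(n)(l), `meet(values)` the lattice meet over an iterable, and
    # `top` the lattice's top element. All illustrative assumptions.
    from collections import deque

    def mfp_forward(nodes, start, preds, succs, transfer, meet, top, l_s):
        in_dff = {n: top for n in nodes}
        in_dff[start] = l_s                 # the fact associated with start
        out_dff = {n: transfer(n, in_dff[n]) for n in nodes}
        worklist = deque(nodes)
        while worklist:
            n = worklist.popleft()
            ps = preds(n)
            if ps:                          # in(n) = meet of predecessors' out
                in_dff[n] = meet(out_dff[p] for p in ps)
            new_out = transfer(n, in_dff[n])
            if new_out != out_dff[n]:       # out(n) changed: revisit successors
                out_dff[n] = new_out
                worklist.extend(succs(n))
        return in_dff, out_dff

Since the transfer functions are monotonic and the lattice has finite
height, the loop terminates with the MFP solution.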
2.2.1 Constant Propagation

In the Constant Propagation problem we are interested in knowing whether a
variable x always has a constant value at a program point. The use of the
variable at that point can then be replaced by that constant value.
Dead-code elimination, common sub-expression elimination, and redundancy
elimination are some of the other analyses which benefit from precise
constant propagation.

Specifically, in the Constant Propagation data-flow analysis framework C, a
variable x can have any one of three data-flow facts associated with it:
(1) ⊤, which means that nothing has been asserted about x; (2) c_i, which
means that x has the constant value c_i; (3) ⊥, which means that x has been
determined to not have a constant value.

L can be thought of as a vector of data-flow values, one component for each
variable in the program. Given l ∈ L and a variable x, we will use the
notation l[x] to denote the data-flow value associated with x in l. For
example, o(u)[x] represents the data-flow value of variable x at node u ∈ N
given a solution o to the Constant Propagation problem.
Example. Consider Figure 2.4. We would like to know whether the use of the
variable x at node G can be replaced by a constant value. Constant
Propagation analysis would ascertain that this cannot be done. □
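As an illustration, the meet for a single variable's constant-propagation
value can be sketched as follows (TOP and BOT are Python sentinels standing
in for ⊤ and ⊥; this is a sketch, not the thesis implementation):

    # Sketch of the constant-propagation meet for one variable.
    # TOP means "nothing asserted yet", BOT means "not a constant";
    # any other value is a known constant c_i.
    TOP, BOT = object(), object()

    def meet_cp(a, b):
        if a is TOP:
            return b                 # TOP meet x = x
        if b is TOP:
            return a
        if a is BOT or b is BOT:
            return BOT
        return a if a == b else BOT  # differing constants merge to BOT

    assert meet_cp(1, 1) == 1
    assert meet_cp(1, 2) is BOT      # exactly the loss at a destructive merge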
2.2.2 Liveness Analysis

A variable is said to be live at a particular point in a program if there is
a path to the exit along which its value may be used before it is redefined.
It is not live if no such path exists. The values live (L) and not live (N)
form a lattice where ⊤ = N and ⊥ = L. Liveness analysis is a backward
analysis.
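The per-node backward transfer function is essentially a one-liner; the
following sketch is illustrative (the use/defs sets per node are assumptions
of the sketch):

    # Sketch of the liveness transfer function: a variable is live before
    # node n if it is used at n, or live after n and not redefined at n.
    def live_in(use, defs, live_out):
        return use | (live_out - defs)

    # For a statement "x = y + 1": use = {"y"}, defs = {"x"}.
    assert live_in({"y"}, {"x"}, {"x", "z"}) == {"y", "z"}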
Figure 2.4: A simple acyclic program. Node D is a destructive merge. The
use of variable x at node G cannot be replaced by a constant.
2.2.3 Anticipatability Analysis

An expression's value is said to be anticipatable on entry to a block i if
every path from that point includes a computation of the expression, and if
placing that computation at any point on these paths produces the same
value.
2.3 Destructive Merge

2.3.1 Forward Analysis

Central to our approach is the notion of a destructive merge. Intuitively, a
destructive merge is a control-flow merge where data-flow analysis loses
precision. The notion of precision is governed by the partial order of the
lattice of the data-flow problem.

Definition 13. (Destructive Merge) For a given forward data-flow problem and
its corresponding solution, a control-flow merge m is said to be a
Destructive Merge iff ∃p ∈ pred(m) s.t. in_dff(m) ⊏ out_dff(p).

Definition 14. (Destroyed Data-flow Facts) Given a destructive merge m, a
data-flow fact d is said to be a Destroyed Data-flow Fact, i.e.
d ∈ destroyed_dff(m), iff d ∈ out_dff(p), where node p ∈ pred(m), and
in_dff(m) ⊏ d.

Figure 2.5: For the problem of Constant Propagation, (a)-(b) show two
destructive merges and (c)-(d) illustrate two control-flow merges which are
not destructive. c_1 and c_2 are two differing constants, while ⊥ represents
not-constant.

Example. For the problem of Constant Propagation, Figure 2.5 illustrates the
different scenarios possible at a control-flow merge. Nodes D1 and D2 are
destructive, while nodes N1 and N2 are not.

More specifically, in Figure 2.4 node D is a destructive merge since
in_dff(D) = {x = ⊥}, while out_dff(B) = {x = 1}. Further,
destroyed_dff(D) = {x = 1, x = 2}. □
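Operationally, once a fixed-point solution is available, the destructive
merges of Definition 13 can be found by a direct scan. Below is a sketch;
in_dff/out_dff are maps produced by the analysis and leq encodes the
lattice's partial order ⊑ (all names are assumptions of the sketch):

    # Sketch: detect forward destructive merges. A merge m is destructive
    # iff for some predecessor p, in_dff(m) is strictly below out_dff(p)
    # in the lattice, i.e. precision was lost at the merge.
    def destructive_merges(nodes, preds, in_dff, out_dff, leq):
        found = set()
        for m in nodes:
            if len(preds(m)) < 2:     # only control-flow merges qualify
                continue
            for p in preds(m):
                if leq(in_dff[m], out_dff[p]) and in_dff[m] != out_dff[p]:
                    found.add(m)      # in_dff(m) strictly below out_dff(p)
                    break
        return found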
2.3.2 Backward Analysis

The above definitions can be easily extended to backward data-flow analysis.
Instead of control-flow merges, destructive merges for backward analysis are
defined over control-flow forks.

Definition 15. (Destructive Merge) For a given backward data-flow problem
and its corresponding solution, a control-flow fork m is said to be a
Destructive Merge iff ∃p ∈ succ(m) s.t. out_dff(m) ⊏ in_dff(p).

Definition 16. (Destroyed Data-flow Facts) Given a destructive merge m, a
data-flow fact d is said to be a Destroyed Data-flow Fact, i.e.
d ∈ destroyed_dff(m), iff d ∈ in_dff(p), where node p ∈ succ(m), and
out_dff(m) ⊏ d.

Figure 2.6: For the problem of Liveness Analysis, (a) shows a destructive
merge and (b)-(c) illustrate two control-flow forks which are not
destructive.

Example. For the problem of Liveness Analysis, Figure 2.6 illustrates the
different scenarios possible at a control-flow fork. Node D1 is destructive,
while nodes N1 and N2 are not.

More specifically, in Figure 2.4 node D is a destructive merge since
out_dff(D) = {x = L}, while in_dff(B) = {x = N}. Further,
destroyed_dff(D) = {x = L}. □
2.4 Automata Theory

As we will see, our restructuring algorithm uses many concepts from automata
theory, which we introduce next.

Definition 17. (Deterministic Finite Automaton) A deterministic finite
automaton is defined by the tuple (Q, Σ, δ, s, F) where

- Q is a finite set; the elements of the set are called states,

- Σ is a finite set, the input alphabet,

- s ∈ Q is a unique start state,

- F ⊆ Q; the elements of F are called the final or accept states,

- δ : Q × Σ → Q is the transition function. Intuitively, δ is a function
  that tells which state to move to in response to an input.

Unless mentioned otherwise, all finite automata discussed in this thesis are
deterministic.

Definition 18. (Multistep Transition Function) Given a finite automaton
M(Q, Σ, δ, s, F), we define the multistep transition function
δ* : Q × Σ* → Q as

    δ*(q, ε) ≝ q,
    δ*(q, xa) ≝ δ(δ*(q, x), a).

Definition 19. (Accepted Language) The language accepted by a finite
automaton M(Q, Σ, δ, s, F) is the set of strings
L(M) = {x ∈ Σ* | δ*(s, x) ∈ F}.
There are many operations which can be defined over finite automata. The one
which we are most interested in is the product operator ×.
Definition 20. (Product Automaton) Given two finite state automata
M_1(Q_1, Σ, δ_1, s_1, F_1) and M_2(Q_2, Σ, δ_2, s_2, F_2), the product of
M_1 and M_2, denoted by M_1 × M_2, is defined as the finite state automaton
M_3(Q_3, Σ, δ_3, s_3, F_3) where

- Q_3 = Q_1 × Q_2 = {(p, q) | p ∈ Q_1 and q ∈ Q_2},

- F_3 = F_1 × F_2 = {(p, q) | p ∈ F_1 and q ∈ F_2},

- s_3 = (s_1, s_2),

- δ_3 : Q_3 × Σ → Q_3, the transition function, is defined as
  δ_3((p, q), a) = (δ_1(p, a), δ_2(q, a)).
Figure 2.7: Figure illustrating the product operation for finite automata.

Example. Figure 2.7 shows two automata A_1 and A_2 whose alphabet is
Σ = {rain, sun}. The start and accept state for A_1 is state dry, while the
start state for A_2 is sleep and its accept state is play. The transition
function is shown graphically in the figure.

The product automaton A_3 ≝ A_1 × A_2 is also shown. □
Further, since the product operation is both associative and commutative, we
can define the product operator for k automata in a natural fashion as
follows:

Definition 21. (Generalized Product Automaton) The product of k finite state
automata M_1(Q_1, Σ, δ_1, s_1, F_1), M_2(Q_2, Σ, δ_2, s_2, F_2), ...,
M_k(Q_k, Σ, δ_k, s_k, F_k), denoted by M_1 × M_2 × ... × M_k, is defined as
the finite state automaton M_p(Q_p, Σ, δ_p, s_p, F_p) where

- Q_p = Q_1 × Q_2 × ... × Q_k = {(q_1, q_2, ..., q_k) | q_i ∈ Q_i, 1 ≤ i ≤ k},

- F_p = F_1 × F_2 × ... × F_k = {(q_1, q_2, ..., q_k) | q_i ∈ F_i, 1 ≤ i ≤ k},

- s_p = (s_1, s_2, ..., s_k),

- δ_p : Q_p × Σ → Q_p, the transition function, is defined as
  δ_p((q_1, q_2, ..., q_k), a) = (δ_1(q_1, a), δ_2(q_2, a), ..., δ_k(q_k, a)).

Though simple to understand and construct, the product operator is extremely
useful and enjoys the following property:
Lemma 1. (Intersection Lemma) Consider two automata A_1 and A_2. If
P = A_1 × A_2, then

    L(P) = L(A_1) ∩ L(A_2).
Product(M_1(Q_1, Σ, δ_1, s_1, F_1), M_2(Q_2, Σ, δ_2, s_2, F_2), ...,
        M_k(Q_k, Σ, δ_k, s_k, F_k))

    // The start state of the product automaton is composed of the start
    // states of the input automata.
    s_P <- (s_1, s_2, ..., s_k)
    // The state set of the product automaton; initially contains only the
    // start state.
    Q_P <- {s_P}
    // The final states of the product automaton.
    F_P <- F_1 × F_2 × ... × F_k
    // The transition function of the product automaton; initially empty.
    δ_P <- ∅
    // Worklist of states; initially contains the start state.
    W <- {s_P}
    While W ≠ ∅ do
        // Pop the next state from the worklist.
        state <- get(W)
        (q_1, q_2, ..., q_k) <- state
        // For each letter in the input alphabet, compute the next state.
        Foreach a ∈ Σ do
            next_state <- (δ_1(q_1, a), δ_2(q_2, a), ..., δ_k(q_k, a))
            // If this is a new state, record it and push it onto the
            // worklist.
            If next_state ∉ Q_P then
                Q_P <- Q_P ∪ {next_state}
                put(W, next_state)
            // Add the new transition to the transition function.
            δ_P <- δ_P ∪ {((state, a), next_state)}
    return M_P(Q_P, Σ, δ_P, s_P, F_P)

Figure 2.8: Algorithm to compute the product automaton.
Proof. We refer the reader to [13] for the proof.
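The construction of Figure 2.8 is short enough to sketch directly in
Python. The encoding below is an assumption of the sketch: an automaton is a
(states, start, finals, delta) tuple with delta a dict keyed by
(state, letter), possibly partial (as a control flow automaton's is). Like
Figure 2.8, only product states reachable from the combined start state are
built.

    # Sketch of the worklist product construction of Figure 2.8. Each
    # automaton is (states, start, finals, delta); delta maps
    # (state, letter) -> state and may be partial.
    from collections import deque

    def product(automata, alphabet):
        start = tuple(s for (_, s, _, _) in automata)
        finals = [f for (_, _, f, _) in automata]
        deltas = [d for (_, _, _, d) in automata]
        Q, delta_p = {start}, {}
        work = deque([start])
        while work:
            state = work.popleft()
            for a in alphabet:
                if any((q, a) not in d for q, d in zip(state, deltas)):
                    continue            # partial delta: no move on letter a
                nxt = tuple(d[(q, a)] for q, d in zip(state, deltas))
                if nxt not in Q:        # a genuinely new product state
                    Q.add(nxt)
                    work.append(nxt)
                delta_p[(state, a)] = nxt
        F = {q for q in Q if all(qi in f for qi, f in zip(q, finals))}
        return Q, start, F, delta_p

Run on the two automata of Figure 2.7 with alphabet {rain, sun}, this yields
the product automaton A_3 shown there; by the Intersection Lemma its
accepted language is L(A_1) ∩ L(A_2).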
Definition 22. (Control Flow Automaton) The control flow automaton Ĝ
corresponding to a CFG G(N, E, start, end) is the finite automaton
(Q, Σ, δ, s, F) where

- the set of states Q ≝ N,

- the input alphabet Σ = E,

- the start state s ≝ start,

- the final states F = {end},

- the transition function δ : Q × Σ → Q is defined as δ(p, a) ≝ q iff node p
  is connected to q by the edge a in G.

This natural correspondence between control-flow graphs and finite automata
allows us to use the concepts of automata theory described in Section 2.4.
When the context is clear, we will use G to represent both the CFG G and the
control flow automaton Ĝ.
Definition 23. (CFG Equivalence) We say that two CFGs G_1 and G_2 are
equivalent, i.e. G_1 ≡ G_2, if the language accepted by Ĝ_1 is equal to the
language accepted by Ĝ_2, i.e., L(Ĝ_1) = L(Ĝ_2).

Such a notion of equivalence suffices since the restructuring only
duplicates nodes and does not change the semantics of the statements
corresponding to a node.
2.5 Overall Algorithm

Figure 2.9 presents an overview of our approach. Central to our approach is
the notion of a destructive merge. Having performed data-flow analysis, we
determine which control-flow merges are destructive. This was explained in
Section 2.3. Since our approach implicitly entails an increase in code size,
we need to determine the best set of destructive merges to eliminate. We
tackle this issue in Chapter 4. Having determined which destructive merges
have to be eliminated, the Split Automaton is constructed. Definition 33
tells us how to do this for forward analysis, while Definition 44 applies
for backward analysis.
Split(G, k)

    // Perform data-flow analysis on G.
    Perform data-flow analysis
    // DM is the set of destructive merges; initially empty.
    DM <- ∅
    Foreach merge m do
        If m is destructive then
            DM <- DM ∪ {m}
    Foreach merge m ∈ DM do
        Calculate fitness(m)
    // Let A_1, A_2, ..., A_k be the resulting split automata.
    Foreach of the k fittest merges m_i with non-zero fitness do
        A_i <- split_automaton(m_i)
    S <- split_graph(G, A_1, A_2, ..., A_k)
    Perform data-flow analysis on S
    Optimize S

Figure 2.9: The Split Algorithm.
The actual restructuring transformation is carried out using these Split
Automata. Definition 35 defines this for forward analysis, while
Definition 46 is to be used for backward analysis. We show that the
restructured graph is equivalent to the original.

Finally, data-flow analysis is carried out on the restructured graph, and
the final program is optimized.
Chapter 3
The Transformation
Paradoxical as it seemed, the Master always insisted that the true reformer
was one who was able to see that everything is perfect as it is - and able
to leave it alone.
    One Minute Wisdom
    Anthony de Mello
In this chapter we describe the transformation applied to a CFG to eliminate
a destructive merge. We will first consider forward data-flow analysis, and
then extend the approach to handle backward data-flow analysis. Further, in
order to explain the concepts clearly, specific data-flow analyses will be
used before stating the general framework.

We start with a program P and a data-flow analysis D. We have already seen
that destructive merges lead to loss of data-flow precision and, hence,
result in fewer optimization opportunities. We wish to transform the program
P by altering the control-flow to get an equivalent program P' in order to
eliminate the effects of such destructive merges. Performing data-flow
analysis on the program P' then gives more precise results.

Given a set of destructive merges M = {m_1, m_2, ..., m_k}, the set of
influenced nodes corresponding to M is defined as

    influenced_nodes(M) = ⋃_{m ∈ M} influenced_nodes(m)
Definition 30. (Region of Influence) Given a set of destructive merges
M = {m_1, m_2, ..., m_k}, the Region of Influence corresponding to M is
defined as

    RoI(M) = ⋃_{m ∈ M} RoI(m)

Extending the definitions to handle multiple destructive merges is
straightforward. A node n belongs to influenced_nodes(M) iff it is
influenced by at least one destructive merge m ∈ M. Similarly, a node n
belongs to RoI(M) iff it belongs to the Region of Influence of at least one
destructive merge m ∈ M.
Figure 3.13: Program W1, which violates the Equivalence constraint but not
the Efficacy constraint. The outgoing edges of node D1 are incorrect.
3.1.3 The CFG Restructuring

We continue with our example program P2 shown in Figure 3.9. Having
identified which data-flow facts should hold at node D in order to optimize
node G, viz. useful_dff(D, G), and which nodes should be duplicated, viz.
RoI(D), we now explain the actual control-flow transformation.

There are two main constraints such a transformation should satisfy.

- Equivalence. The original and transformed programs should be equivalent.

- Efficacy. The transformation should guarantee that the improvement in
  data-flow precision leads to optimization opportunities.

The program W1 of Figure 3.13 illustrates an incorrectly transformed program
which violates the equivalence constraint. This is because the path
B → D → F → G present in program P2 shown in Figure 3.9 is not present in
program W1. In program P2, node D has outgoing edges to nodes E and F, while
node D1 in program W1 has only one outgoing edge, to node E1.

Program W2 in Figure 3.14 satisfies the equivalence constraint but violates
the efficacy constraint. This transformation does not expose new
optimization opportunities, since nodes G1 and G2 are destructive merges and
the uses of variable x cannot be replaced by constants. But the equivalence
constraint is not violated.

Figure 3.14: Program W2, which violates the Efficacy constraint but not the
Equivalence constraint. Nodes G1 and G2 are destructive merges and the use
of variable x cannot be replaced by a constant.
Definition 31. (Kill Edges) Given the Region of Influence for a destructive
merge m, an edge e = (u, v) is a Kill Edge, i.e. e ∈ kill_edges(m), iff
u ∈ RoI(m) and v ∉ RoI(m).

In other words, Kill Edges are those edges whose source node is in the
Region of Influence and whose target node is not in the Region of Influence
for a destructive merge m.

Definition 32. (Revival Edges) Given the Region of Influence for a
destructive merge m, an edge e = (u, m) is said to be a Revival Edge, i.e.
e ∈ revival_edges(m), iff

    out_dff(u) ∈ revival_dff(m).

In other words, Revival Edges are those incoming edges of the destructive
merge which correspond to Revival Data-flow facts. Further, let
d_1, d_2, ..., d_k be the k distinct Revival Data-flow facts for destructive
merge m. We can partition the incoming edges of destructive merge m into
k + 1 equivalence classes R_0, R_1, ..., R_k. An edge (u, m) ∈ R_i,
1 ≤ i ≤ k, if out_dff(u) = d_i, and (u, m) ∈ R_0 otherwise.

Figure 3.15: Revival and Kill Edges for a Region of Influence corresponding
to a destructive merge m.

Figure 3.15 shows a schematic diagram illustrating the above concepts.

Example. In Figure 3.9,

out_dff(B) = {x = 1}, out_dff(C) = {x = 2},
revival_dff(D) = {x = 1, x = 2},
revival_edges(D) = {(B, D), (C, D)},
RoI(D) = {D, E, F, G},
kill_edges(D) = {(G, H), (G, I)}. □
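Both sets are cheap to compute from the analysis results. The following
sketch mirrors Definitions 31 and 32; roi, out_dff and revival_dffs are
assumed inputs of the sketch, not names from the Scale implementation:

    # Sketch: kill edges and the revival-edge partition R_0..R_k for a
    # forward destructive merge m. `roi` is RoI(m); `revival_dffs` lists
    # the k distinct revival facts d_1..d_k.
    def kill_edges(edges, roi):
        return {(u, v) for (u, v) in edges if u in roi and v not in roi}

    def revival_partition(m, preds, out_dff, revival_dffs):
        classes = {i: set() for i in range(len(revival_dffs) + 1)}
        for p in preds(m):
            for i, d in enumerate(revival_dffs, start=1):
                if out_dff[p] == d:
                    classes[i].add((p, m))  # edge carries revival fact d_i
                    break
            else:
                classes[0].add((p, m))      # all other incoming edges: R_0
        return classes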
Lemma 2. (Post-dominance Lemma) Given a destructive merge m, the set of
edges kill_edges(m) post-dominates the node m.

Proof. The notion of a generalized post-dominator set can be found in
Definition 5.

We claim that all paths p from the node m to the end node can be written as
q · (u, v) · r where

1. subpaths q and r may be empty,

2. there exists w ∈ influenced_nodes(m) such that w ∈ reachable_nodes(u),

3. there is no w ∈ influenced_nodes(m) such that w ∈ reachable_nodes(v),

4. subpath r does not contain any node w ∈ influenced_nodes(m).

If we prove this claim, then it is easy to see from points 2 and 3 above
that the edge (u, v) is a Kill Edge, and it follows that all paths from
destructive merge m to the end node pass through a Kill Edge. We now prove
the above claim.

We define u to be the last node along path p such that
∃w. w ∈ influenced_nodes(m) and w ∈ reachable_nodes(u). Subpath q is the
prefix of path p before this node u. Since all influenced nodes are
reachable from the destructive merge m, such a node u always exists. (If
node m is chosen as u, then subpath q is empty.) Having found node u, node v
is simply the successor of node u along the path p. Node v may be the end
node, in which case the subpath r is empty.

Our choice of node u also implies that the subpath r of path p does not
contain any node w ∈ influenced_nodes(m) (recall that, for any u,
u ∈ reachable_nodes(u)). This proves the claim.

Figure 3.16: Illustrating a path p = q · (u, v) · r from destructive merge m
to the end node for Lemma 2. w1, w2 ∈ influenced_nodes(m);
w1 ∈ reachable_nodes(u); w1, w2 ∉ reachable_nodes(v).

Armed with the above concepts, we are now ready to define the Split
Automaton.
Figure 3.17: The Split Automaton A_D corresponding to destructive merge D in
Figure 3.9.
Definition 33. (Split Automaton) The Split Automaton A_m corresponding to a
destructive merge m is a finite-state automaton defined as:

- the input alphabet Σ = E, the set of all edges in the CFG,

- a set of k + 1 states Q = {s_0, s_1, s_2, ..., s_k},

- s_0 ∈ Q is the initial and accepting state,

- the transition function δ : Q × Σ → Q defined as

    δ(s_i, e) = s_j,  e ∈ R_j            (Revival Transitions)
    δ(s_i, e) = s_0,  e ∈ kill_edges(m)  (Kill Transitions)
    δ(s_i, e) = s_i,  otherwise          (No Transition)
Intuitively, state s_i, 1 ≤ i ≤ k, corresponds to Revival Data-flow fact
d_i, and whenever an edge e ∈ R_i is seen, the Split Automaton makes a
transition to state s_i. We call this a Revival Transition. Further,
whenever a Kill Edge is seen, the Split Automaton transitions to state s_0.
We call these the Kill Transitions. In all other cases, the automaton
remains in the same state, and makes no transitions.
Example. Figure 3.17 shows the Split Automaton A_D corresponding to the
destructive merge D in CFG P2 in Figure 3.9. Intuitively, whenever the
automaton is in state 1, data-flow fact x = 1 holds, while in state 2
data-flow fact x = 2 holds. Further, a transition to state 0 from states 1
or 2 implies that the automaton has stopped duplicating nodes in the CFG.

To get further intuition into the structure of the Split Automaton, we
consider the set of paths from the start state up to node G along which
data-flow facts x = 1 and x = 2 flow.

Figure 3.18: (a) The two paths along which the data-flow fact x = 1 reaches
the node G. (b) The two paths along which the data-flow fact x = 2 reaches
the node G.

Figure 3.18(a) shows the paths along which the data-flow fact x = 1 holds.
Notice that these paths pass through the edge (B, D). In fact, these are
exactly the paths which are accepted by state 1 of the Split Automaton shown
in Figure 3.17. Similarly, Figure 3.18(b) shows the paths along which the
data-flow fact x = 2 holds. In this case, these paths pass through the edge
(C, D) and are exactly the paths which are accepted by state 2 of the Split
Automaton.

Furthermore, if we examine the corresponding Split Graph which we construct,
shown in Figure 3.10, we see that these two sets of paths never intersect.
Thus, the differing data-flow facts along these paths never merge and we
don't lose information.

Now, in the Split Automaton, there are transitions back to state 0 on edges
(G, H) and (G, I). This is because after node G it is no longer beneficial
to separate the paths, and we don't lose useful information when the
data-flow facts along the paths merge. □
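The transition function of Definition 33 is easy to express directly; here
is a sketch continuing the helpers above (classes is the revival-edge
partition R_0..R_k and kills is kill_edges(m); both names are assumptions of
the sketch):

    # Sketch of the split automaton's transition function (Definition 33).
    # States are the integers 0..k; edges are (u, v) pairs.
    def split_delta(classes, kills):
        def delta(state, edge):
            for j, R_j in classes.items():
                if edge in R_j:
                    return j       # revival transition to state s_j
            if edge in kills:
                return 0           # kill transition back to s_0
            return state           # no transition: stay in the same state
        return delta

    # For A_D of Figure 3.17: classes = {0: set(), 1: {("B", "D")},
    # 2: {("C", "D")}} and kills = {("G", "H"), ("G", "I")}, so
    # delta(0, ("B", "D")) == 1 and delta(1, ("G", "H")) == 0.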
As we have seen already, a CFG can be viewed as a finite-state automaton,
with nodes of the CFG corresponding to the states of the automaton, and the
edges defining the alphabet as well as the transition function. The entry
and exit nodes of the CFG correspond to the start and accepting states of
the automaton respectively. We call this a control flow automaton
(Definition 22).

Figure 3.19: The Split Graph being constructed: stages (a)-(c) of the
product computation.
Definition 34. (Split Graph) Given a CFG P and a Split Automaton A_m, we
define the Split Graph S to be S = P × A_m, where × is the product automaton
operator [13].

Each node n in the Region of Influence of node m in P will have multiple
copies n_i in the Split Graph S, each corresponding to a state s_i in the
Split Automaton. We refer the reader to the algorithm described in
Figure 2.8 for the product computation.
Example. The Split Graph S2 in Figure 3.10 is obtained by performing the
product of the CFG P2 in Figure 3.9 and the Split Automaton A_D in
Figure 3.17.

Figures 3.19-3.21 show the Split Graph being constructed during the product
computation. Figure 3.19(a) shows the start state of the product automaton,
which is also the start state of the Split Graph. On the edge (start, A) the
CFG goes to node A while the automaton stays in state 0. A new state A0 is
created, and there is a transition from state start0 to state A0 in the
product automaton, as seen in Figure 3.19(b). Similarly, for edges (A, B)
and (A, C), the automaton stays in state 0 and the CFG transitions to states
B and C respectively. The resulting product automaton is shown in
Figure 3.19(c).

Now, for edge (B, D), the automaton uses a Revival transition and
transitions to state 1. Thus, we see the new state D1 and the transition
from state B0 to D1 in the product automaton in Figure 3.20(d). After this
the automaton stays in state 1, which gives rise to states E1, F1, and G1.
We see the same situation when the automaton is in state C0 and transitions
on edge (C, D). States D2, E2, F2 and G2 are created in the product
automaton, as shown in Figure 3.20(e).
Figure 3.20: The Split Graph being constructed: stages (d)-(e) of the
product computation.

Figure 3.21: The Split Graph being constructed: stages (f)-(g) of the
product computation; (g) shows the final Split Graph.
Figure 3.22: The split automaton A_m for destructive merge m. State 0 is the
start and accepting state. r1 and r2 are Revival edges, and k represents the
Kill edges.

Consider state G1. The split automaton is in state 1. On seeing the edge
(G, H), the split automaton transitions back to state 0, which is a Kill
transition. Thus, state H0 is created and there is a transition from G1 to
H0. Similarly for edge (G, I). After this the automaton stays in state 0,
and states J0 and end0 are created, as shown in Figure 3.21(f).

Now consider state G2 in the product automaton. The CFG is in node G and the
split automaton is in state 2. On seeing the edge (G, I), the split
automaton transitions to state 0. Thus, there is a transition from state G2
to state I0 in the product automaton. But the state I0 already exists in the
product automaton, and no new node needs to be created. Similarly for edge
(G, H). The final Split Graph is shown in Figure 3.21(g). □
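Tying the sketches together, the whole construction is one product
computation. The toy CFG below is made up for illustration (it is not P2),
and the code reuses the product and split_delta sketches given earlier:

    # A toy diamond CFG (hypothetical, not P2): start -> A -> {B, C} -> D
    # -> G -> end, with destructive merge D, revival classes
    # R_1 = {(B, D)}, R_2 = {(C, D)}, and kill edge (G, end).
    E = [("start", "A"), ("A", "B"), ("A", "C"), ("B", "D"),
         ("C", "D"), ("D", "G"), ("G", "end")]
    cfg = ({u for (u, v) in E} | {v for (u, v) in E}, "start", {"end"},
           {(u, (u, v)): v for (u, v) in E})               # partial delta
    delta = split_delta({0: set(), 1: {("B", "D")}, 2: {("C", "D")}},
                        kills={("G", "end")})
    aut = ({0, 1, 2}, 0, {0},
           {(s, e): delta(s, e) for s in (0, 1, 2) for e in E})  # total delta
    Q, s0, F, dp = product([cfg, aut], alphabet=E)
    # Q now contains split copies ("D", 1) and ("D", 2) of the merge node D,
    # which re-merge at ("end", 0), mirroring the kill transition.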
Lemma 3. (Subsumption Lemma) Given a CFG P with a destructive merge m and
the corresponding split automaton A_m,

    L(P) ⊆ L(A_m).

Proof. For clarity of exposition, we will restrict the split automaton A_m
to the form shown in Figure 3.22, i.e. A_m has only three states. The
generalization of the proof to the case where A_m has n states is
straightforward.

The set of edges E of the CFG P forms the alphabet for the automata P and
A_m. Hence, it follows that L(P) ⊆ E* and L(A_m) ⊆ E*. The set E
dff(p) (3.8)
Let Q be the set of states of the product automaton A = A_1 × A_2 × ... × A_k
and let P_q ⊆ P be the set of paths which drive A to the state q ∈ Q.

Thus, we have

    ⊓_{p ∈ P} dff(p) = ⊓_{q ∈ Q} ⊓_{p ∈ P_q} dff(p)    (3.9)

Removing the outer meet from the right-hand side of Equation 3.9,

    ⊓_{p ∈ P} dff(p) ⊑ ⊓_{p ∈ P_q} dff(p), ∀q ∈ Q    (3.10)

Substituting Equation 3.10 in Equation 3.8 we get

    in_dff(n) ⊑ ⊓_{p ∈ P_q} dff(p), ∀q ∈ Q    (3.11)

Further, in the Split Graph, by definition,

    in_dff(n_q) = ⊓_{p ∈ P_q} dff(p)    (3.12)

Using Equation 3.12 in Equation 3.11, we get

    in_dff(n) ⊑ in_dff(n_q), ∀q ∈ Q    (3.13)
In other words, the Monotonicity Lemma implies that splitting and
restructuring never decreases the precision of the data-flow solution. It
might not necessarily improve the data-flow solution at all nodes, but it
never degrades the analysis result at any node. This result differs from the
Efficacy Theorem (Theorem 3) in that it applies to the data-flow solution of
all nodes, and not just those influenced by the particular destructive
merge. A very similar result is proved in [2]. In fact, the above proof does
not make use of any properties specific to the Split Automaton. Though
simple, this lemma will be central in proving the Efficacy result for
multiple destructive merges.
Theorem 5. (Efficacy Theorem for Multiple Destructive Merges) Consider a CFG
P with destructive merges m_1, m_2, ..., m_k and corresponding split
automata A_1, A_2, ..., A_k. If node n ∈ influenced_nodes(m_x), 1 ≤ x ≤ k,
then in the Split Graph S, node n_i^x, i ≠ 0, can be optimized, where
S = P × A_1 × A_2 × ... × A_k.

Proof. Without loss of generality, let us consider the destructive merge m_1
and the corresponding Split Automaton A_1.

Consider the Split Graph

    S_1 = P × A_1    (3.14)

If node n ∈ influenced_nodes(m_1), then using the Efficacy Theorem for a
single destructive merge (Theorem 3) we can say that some
d_1 ∈ useful_dff(n) holds at the in of node n_i, i ≠ 0, in Split Graph S_1.

Now consider the Split Graph

    S = P × A_1 × A_2 × ... × A_k    (3.15)

Substituting Equation 3.14 in Equation 3.15 we get

    S = S_1 × A_2 × ... × A_k    (3.16)

Using the Monotonicity Lemma (Lemma 4), we can say that for each node n_q in
S which corresponds to the node n_i, i ≠ 0, in S_1,

    in_dff(n_i) ⊑ in_dff(n_q)    (3.17)

Thus,

    d_1 ⊑ in_dff(n_q)    (3.18)

This implies that each such node n_q can be optimized even in the Split
Graph S.
3.2 Back to Backward Analysis

We now turn our attention to improving the precision of backward data-flow
analyses such as liveness analysis. One obvious difference is that for
backward analysis the flow of information is not along, but against, the
flow of control. Hence, destructive merges are control-flow forks, not
control-flow joins. Thus, to a certain extent the extension from forward to
backward analysis is straightforward: instead of targeting control-flow
joins where information is lost, we target control-flow forks where
data-flow information is merged or approximated and data-flow precision is
lost. This is reflected in the definition of a destructive merge for
backward data-flow analysis (Definition 15).

As before, we use a concrete data-flow analysis to explain the concepts,
though our technique is applicable to any backward data-flow analysis. We
consider Liveness Analysis.
3.2.1 Computing The Region Of Influence

Single Destructive Merge

In this section, we discuss which nodes can and should be duplicated in
order to eliminate a destructive merge for backward analysis. The
definitions of Useful Data-flow Facts (Definitions 24 and 25), Influenced
Nodes (Definition 26), and Revival Data-flow Facts (Definition 27) apply
both to forward and backward data-flow analysis.

Example. Consider the program in Figure 3.24 and the problem of Liveness
Analysis. Control-flow fork D is a destructive merge since variable x is
live along outgoing edge (D, E) and is not live along outgoing edge (D, F).
This, in turn, makes variable x live at nodes A and C. Thus, the assignments
at these nodes cannot be eliminated.

useful_dff(C) = {x = N},
useful_dff(D, C) = {x = N}.

This is due to the fact that if variable x were not live at node D, then x
would not be live at node C. This would enable us to optimize node C by
removing the statement from node C. Similarly,

useful_dff(A) = {x = N},
useful_dff(D, A) = {x = N}.
Figure 3.24: Simple acyclic program P6. Node D is a destructive merge. The
statement at node B cannot be removed since variable x is live.

Thus,

influenced_nodes(D) = {A, C},
destroyed_dff(D) = {x = N},
revival_dff(D) = {x = N}. □

Figure 3.25: Duplication required to eliminate a destructive merge for
backward analysis.

Figure 3.25 shows the type of duplication required to eliminate a
destructive merge D for backward analysis. We first convert the if
statements to assume statements (Section 2.1) and then duplicate the
control-flow fork which is a destructive merge. Note that during duplication
the number of incoming edges for node D remains the same while the number of
outgoing edges decreases.

Figure 3.26: The acyclic program of Figure 3.24 after we eliminate the
destructive merge at node D by directly applying the techniques developed
for forward analysis. The if statement has been replaced by the
corresponding assume statements. The statements at nodes C2 and A2 can be
removed since variable x is not live. But the program control-flow is
nondeterministic at the start node.
Though at first glance there doesn't appear to be any difficulty in
extending our technique to backward analysis, we show that there are a few
subtle issues which need to be handled to ensure correctness of the
transformation. To illustrate this difference we look at Figure 3.24. We
have already identified that the node D is a destructive merge. Suppose we
reverse this CFG as shown in Figure 3.27(a) and replace the if node with the
appropriate assume statements. Then we carry out the same transformation we
did earlier for forward analysis. Nodes D, B, C, A are duplicated and the
destructive merge at node D is eliminated. The resulting Split Graph is
shown in Figure 3.27(b). Reversing the directions of the edges once more, we
get the CFG of Figure 3.26, in which the destructive merge at node D present
in program P6 (Figure 3.24) is eliminated. As we can see, the data-flow
precision has indeed improved at nodes A2 and C2, since variable x is not
live at those nodes. We can now eliminate the assignment statements at nodes
A2 and C2.

Figure 3.27: (a) The reverse graph of the program P6 of Figure 3.24. Node D
is a destructive merge. (b) The split graph after the destructive merge in
(a) has been eliminated by using the transformation described for forward
analysis.

Figure 3.28: Graph obtained after hoisting the assume statements in
Figure 3.26 to the outgoing edges of the start node.
But looking at the control-flow graph, we realise that the flow of control
has now become nondeterministic! This is because the assume statements are
still placed on edges (D1, E1) and (D2, E2). At the start node, it is
uncertain whether the flow of control should go down the edge (start, A1) or
the edge (start, A2).

The problem arises because there is a semantic meaning associated with edge
(D, E) being taken, viz. the predicate (y > 0) evaluates to true. Similarly,
edge (D, F) being taken implies that (y > 0) evaluates to false. Thus, when
we split node D we have to preserve this semantic meaning. In order to
ensure that the program remains deterministic, the transformation should
hoist the evaluation of the predicate from node D and place it at the
appropriate place. For our example, we need to hoist the evaluation of the
predicate to before nodes A1 and A2. The way we do this is to lift the
assume statements to the edges (start, A1) and (start, A2), as shown in
Figure 3.28. We then convert the assume statements into the corresponding if
statement, as shown in Figure 3.29.

But the evaluation of the predicate cannot be hoisted to just any node. Care
should be taken so that the evaluation of the predicate at the new node is
the same as that at the original node. For instance, it is not possible to
lift the predicate over a statement which modifies the predicate expression.
Figure 3.29: Graph obtained after converting the assume statements present
in Figure 3.28 into the appropriate if statement. The flow of control is now
deterministic.

Figure 3.30: Program P7.
Figure 3.31: Split Graph S7 with assume statements.

Consider the program P7 in Figure 3.30. The predicate (y > 0) cannot be
lifted over block B, because the statement y = 1 modifies the predicate
expression. Similarly, predicate (y > 0) cannot be lifted above block A.
Thus, even though it would be beneficial to hoist the predicate evaluation
above node A, the predicate evaluation can only be hoisted up to edges
(B, D) and (A, C).

Keeping this restriction in mind, Figure 3.31 shows the transformed
control-flow graph with the assume statements placed on edges (B, D) and
(A, C). Figure 3.32 shows the same program where the assume statements have
been replaced with the corresponding if statements. The precision is
improved at node C, and the assignment statement can be removed since
variable x is not live. Notice that the assignment statement at node A
cannot be removed in this program, as was done in program P6. Thus, even
though node A is influenced by the destructive merge at node D, node A
cannot be split, because we cannot hoist the evaluation of the predicate
above node A. In general, due to the added constraint of hoisting the check
present at the destructive merge, not all influenced nodes for a destructive
merge can be optimized. We call the Influenced Nodes which can be optimized,
taking into consideration this added constraint, Realisable Influenced
Nodes. We formalise this concept next.

Given a destructive merge, we define the Hoist Region as the region within
which the evaluation of the predicate can be hoisted. We make use of
anticipability analysis (Section 2.2.3) to define this region.
Figure 3.32: Split Graph S7 with assume statements replaced with if
statements.

Definition 36. (Hoist Region) Given a destructive merge m with predicate p,
a node n is in the Hoist Region of node m, i.e. n ∈ hoist_region(m), iff
expression p is anticipatable at the in of node n, i.e. ant_in(n, p) = 1.

Example. In Figure 3.24, node D is the destructive merge, with (y > 0) being
the predicate.

ant_in(D, (y > 0)) = 1,
ant_in(C, (y > 0)) = 1,
ant_in(B, (y > 0)) = 1,
ant_in(A, (y > 0)) = 1.

Thus, hoist_region(D) = {A, B, C, D}.

In Figure 3.30, node D is the destructive merge, with (y > 0) being the
predicate.

ant_in(D, (y > 0)) = 1,
ant_in(C, (y > 0)) = 1,
ant_in(B, (y > 0)) = 0,
ant_in(A, (y > 0)) = 0.

Thus, hoist_region(D) = {C, D}. □

Figure 3.33: Schematic diagram showing the Split Region for a destructive
merge m.
Taking the Hoist Region into consideration, we define the Realisable
Influenced Nodes as those Influenced Nodes which are present in the Hoist
Region. Accordingly, we define the Split Region for backward analysis.

Definition 37. (Realisable Influenced Nodes) Given a destructive merge m, a
node n is a Realisable Influenced Node, i.e.
n ∈ realisable_influenced_nodes(m), iff
n ∈ influenced_nodes(m) ∩ hoist_region(m).

Similar to the definition of the Split Region for forward analysis, we
define the Split Region for backward analysis.

Definition 38. (Split Region) Given a destructive merge m, a node n is said
to be in the Split Region, i.e. n ∈ split_region(m), iff
n ∈ backward_reachable_nodes(m) and there exists a node
u ∈ realisable_influenced_nodes(m) such that
u ∈ backward_reachable_nodes(n).

Figure 3.33 shows a schematic diagram of split_region(m) for a destructive
merge m.

Note that for forward data-flow analysis the Split Region equals the Region
of Influence, since all influenced nodes are realisable. Due to the inherent
nature of the transformation required for backward analysis, we have to add
these extra constraints to ensure correctness.

Example. Consider Figure 3.24:

influenced_nodes(D) = {A, C},
hoist_region(D) = {A, B, C, D},
realisable_influenced_nodes(D) = {A, C},
split_region(D) = {A, B, C, D}.

Consider Figure 3.30:

influenced_nodes(D) = {A, C},
hoist_region(D) = {C, D},
realisable_influenced_nodes(D) = {C},
split_region(D) = {C, D}. □
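Under these definitions, the computation is a pair of set comprehensions
over the anticipability bits; the following sketch is illustrative (ant_in
and back_reach are assumed to come from anticipability analysis and the
reverse graph respectively):

    # Sketch: hoist region, realisable influenced nodes and split region
    # for a backward destructive merge m (Definitions 36-38).
    def hoist_region(nodes, ant_in, p):
        # ant_in[n][p] is the anticipability bit of predicate p at in(n).
        return {n for n in nodes if ant_in[n][p] == 1}

    def split_region(m, nodes, influenced, ant_in, p, back_reach):
        realisable = influenced & hoist_region(nodes, ant_in, p)
        # n is in the split region iff n is backward reachable from m and
        # some realisable influenced node is backward reachable from n.
        return {n for n in back_reach(m)
                if any(u in back_reach(n) for u in realisable)}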
Multiple Destructive Merges

In this section, we extend the concepts of the previous section to handle
multiple destructive merges.

Definition 39. (Realisable Influenced Nodes) Given a set of destructive
merges M = {m_1, m_2, ..., m_k}, the set of Realisable Influenced Nodes is
defined as

    realisable_influenced_nodes(M) = ⋃_{m ∈ M} realisable_influenced_nodes(m)

Definition 40. (Split Region) Given a set of destructive merges
M = {m_1, m_2, ..., m_k}, the Split Region corresponding to M is defined as

    split_region(M) = ⋃_{m ∈ M} split_region(m)

Extending the definitions to handle multiple destructive merges is
straightforward. A node n belongs to realisable_influenced_nodes(M) iff it
is a realisable influenced node of at least one destructive merge m ∈ M.
Similarly, a node n belongs to split_region(M) iff it belongs to the Split
Region of at least one destructive merge m ∈ M.
Figure 3.34: Revival and Kill Edges for a Split Region corresponding to a
destructive merge m.
3.2.2 The CFG Restructuring

Single Destructive Merge

As before, we use automata theory to carry out the transformation and show
that the restructured CFG satisfies the correctness and efficacy criteria.
Again, we define a Split Automaton which we use to perform the
restructuring.

Definition 41. (Kill Edges) Given the Split Region for a destructive merge
m, an edge e = (u, v) is a Kill Edge, i.e. e ∈ kill_edges(m), iff
u ∉ split_region(m) and v ∈ split_region(m).

Thus, Kill Edges are those edges whose target node is in the Split Region
and whose source node is not in the Split Region for a destructive merge m.

Definition 42. (Revival Edges) For a destructive merge m, all outgoing edges
of m are said to be Revival Edges.

Notice that the definition of Revival Edges for backward analysis differs
significantly from that for forward analysis. This new definition is
motivated by the need to add the appropriate assume statements. Further, let
d_1, d_2, ..., d_k be the k distinct data-flow facts for destructive merge
m. We can partition the outgoing edges of destructive merge m into k
equivalence classes R_1, ..., R_k. An edge (m, u) ∈ R_i, 1 ≤ i ≤ k, if
in_dff(u) = d_i.

Figure 3.34 shows a schematic diagram illustrating the above concepts.
Example. In Figure 3.24,
in dff(E) = x = L, in dff(F) = x = N,
65
Figure 3.35: The split automaton Am for destructive merge m. State 0 is the
start and accepting state. r1 and r2 are Revival edges, and k represents the
Kill edges. P1 and P2 are the state predicates for states 1 and 2 respectively.
They are determined by the edge predicates of the Revival edges r1 and r2
respectively.
revival dff(m) = {x = L, x = N},
revival edges(m) = {(D, E), (D, F)},
split region(m) = {A, B, C, D},
kill edges(m) = {(start, A)}.
In Figure 3.30,
in dff(E) = {x = L}, in dff(F) = {x = N},
revival dff(m) = {x = L, x = N},
revival edges(m) = {(D, E), (D, F)},
split region(m) = {C, D},
kill edges(m) = {(B, D), (A, C)}.
□
Figure 3.35 shows the schematic diagram of the Split Automaton for
backward analysis. The structure of the Split Automata for forward and
backward analysis is the same. The main difference is that each transition in
the Split Automaton for backward analysis is additionally annotated with a
predicate. In particular, all transitions to state i, i ≠ 0, on edge e are annotated
with the predicate edge predicate(e). Furthermore, let e1, e2, . . . , en
be all the edges which make the automaton transition to state i, i ≠ 0, and
let p1, p2, . . . , pn be the corresponding edge predicates respectively. The state
predicate for state i is then p1 ∨ p2 ∨ . . . ∨ pn.
Definition 43. (State Predicate) Let e1, e2, . . . , en be the edges which
cause the split automaton to transition from state 0 to state i and p1, p2, . . . , pn
be the corresponding edge predicates respectively. Then the state predicate
for state i, i.e. state predicate(i), is the predicate p1 ∨ p2 ∨ . . . ∨ pn.
All the self-loops in the split automaton are annotated with the true predicate.
This is done mainly for uniformity of exposition; for all practical
purposes they can be treated as no-operations. These predicates on the Split
Automaton transitions are used in assume statements which are inserted
on the edges of the resulting product automaton. The assume statements
are then converted into corresponding conditional checks. This two-step
approach is needed when we deal with multiple destructive merges being
eliminated at the same time.
Definition 44. (Split Automaton for Backward Analysis) The
Split Automaton Am corresponding to a destructive merge m is a finite-state
automaton defined as:
the input alphabet Σ = E, the set of all edges in the CFG,
a set of k + 1 states Q = {s0, s1, s2, . . . , sk},
s0 ∈ Q is the initial and accepting state,
the transition function δ : Q × Σ → Q defined as
δ(si, e) → sj, if e ∈ Rj (Revival Transitions),
δ(si, e) →p s0, if e ∈ kill edges(m), with p = state predicate(si) (Kill Transitions),
δ(si, e) → si, otherwise (No Transition),
the predicate annotation predicate : Q × Σ → Predicates defined as
if δ(si, e) → sj with e ∈ Rj, then predicate(si, e) = edge predicate(e),
if δ(si, e) → s0 with e ∈ kill edges(m), then predicate(si, e) = state predicate(si),
predicate(si, e) = T, otherwise.
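The following Python sketch realises the transition function and predicate annotation of Definition 44. The encoding is an assumption made for illustration: revival maps each outgoing edge of m to its equivalence-class index j, kill is the set of Kill Edges, and edge_pred maps an edge to its predicate (kept as a string here).

class BackwardSplitAutomaton:
    def __init__(self, revival, kill, edge_pred):
        self.revival, self.kill, self.edge_pred = revival, kill, edge_pred
        # state_predicate(j): disjunction of the predicates on the edges
        # that drive the automaton into state j (Definition 43).
        self.state_pred = {}
        for e, j in revival.items():
            self.state_pred.setdefault(j, []).append(edge_pred[e])

    def step(self, state, e):
        # Return (next_state, predicate) for input edge e.
        if e in self.revival:                          # Revival Transition
            return self.revival[e], self.edge_pred[e]
        if e in self.kill:                             # Kill Transition
            return 0, " | ".join(self.state_pred.get(state, ["T"]))
        return state, "T"                              # self-loop, no-op

For the automaton of Figure 3.36 one would instantiate it roughly as
BackwardSplitAutomaton({("D","E"): 1, ("D","F"): 2}, {("B","D"), ("A","C")}, {("D","E"): "y > 0", ("D","F"): "!(y > 0)"}).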
Example. Figure 3.36 shows the Split Automaton corresponding to the
destructive merge D in Figure 3.30. Here
state predicate(1) = y > 0, state predicate(2) = !(y > 0).
The transitions to state 1 are labelled with edge predicate((D, E)) = y > 0.
□
Figure 3.36: The Split Automaton AD corresponding to destructive merge
D in Figure 3.30.
Similar to the case for forward analysis, we define the Split Graph for
backward analysis. The major difference is that for backward analysis we
work on the reverse graph (Definition 6) of the CFG.
Definition 45. (Split Graph for Backward Analysis) Given a
CFG P and a Split Automaton Am, we define the Split Graph S to be
the reverse graph of Pr × Am, where Pr is the reverse graph of P and × is
the product automaton operator [13].
Additionally, whenever the Split Automaton transitions from state i to
state j on edge a, add an assume(p) on the corresponding edge in the Split
Graph, where p = predicate(i, a).
Example. Figure 3.31 shows the Split Graph S7 constructed when the
program P7 in Figure 3.30 is split using the Split Automaton of Figure 3.36.
Notice the assume statement placed on edge (D1, B0). The predicate for the
assume is the state predicate for state 1 in the Split Automaton.
□
Figure 3.37 shows the algorithm to construct the Split Graph for backward
analysis for a single destructive merge. The main differences from
forward analysis are that we first have to reverse the input CFG (Line 1)
before performing the product. Further, during the product computation
we have to add the appropriate assume statement on the edge; this is shown
in Line 18. Finally, after the product is computed, the algorithm constructs
the reverse graph of the output product graph, as seen in Line 22.
BkwdSplit( G, A1 = (Q1, Σ, δ1, s1, F1) )
// Construct the reverse graph.
1 Gr ← reverse(G)
// The start node of the product automaton is composed of the
// start nodes of the split automaton and the reverse graph.
2 sP ← (end, s1)
// The states of the product automaton; initially only contains the start state.
3 QP ← {sP}
// The final nodes of the product automaton.
4 FP ← {start} × F1
// Compute the transition function for the product automaton.
// The transition function of the product automaton; initially empty.
5 δP ← ∅
// Worklist of states; initially contains the start state.
6 WorkList W ← {sP}
// Do while the worklist is not empty.
7 While W ≠ ∅
8 do
9     state ← get(W) // Pop the next state from the worklist.
10    (n1, q1) ← state
      // Foreach letter in the input alphabet, compute the next state.
11    Foreach a ∈ Σ
12    do // Get the next state.
13        q′1 ← δ1(q1, a)
14        next state ← (δGr(n1, a), q′1)
15        If (next state ∉ QP) // Check if this is a new state.
16        then
17            // Add the assume statement to the edge.
18            edge(state, next state) ← assume(predicate(q1, a))
              // Add the new state to the set of states.
19            QP ← QP ∪ {next state}
              // Push the new state onto the worklist.
20            put(W, next state)
      // Add the new transition to the transition function.
21    δP ← δP ∪ {(state, next state)}
22 Pr ← reverse(P) // Reverse the product graph.
23 return Pr
Figure 3.37: Algorithm to compute the Split Graph for Backward Analysis
for a single destructive merge.
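For concreteness, here is a compact Python sketch of the same worklist construction, building on the BackwardSplitAutomaton sketch above; succ (the CFG successor map) and the pair-encoded product states are assumptions of the sketch, not of the formal algorithm.

from collections import deque

def bkwd_split(succ, end, automaton):
    # Step 1: reverse the CFG (Line 1 of Figure 3.37).
    pred = {}
    for u, vs in succ.items():
        for v in vs:
            pred.setdefault(v, []).append(u)
    start = (end, 0)                    # product start state (end, s0)
    states, work = {start}, deque([start])
    edges, assumes = set(), {}
    while work:
        n, q = work.popleft()
        for m in pred.get(n, []):
            # The automaton reads the original CFG edge (m, n).
            q2, p = automaton.step(q, (m, n))
            nxt = (m, q2)
            # Record the edge already re-reversed into CFG direction,
            # together with its assume predicate (Line 18).
            edges.add((nxt, (n, q)))
            assumes[(nxt, (n, q))] = p
            if nxt not in states:
                states.add(nxt)
                work.append(nxt)
    return edges, assumes

On the running example, the edge ((D,1), (E,0)) would carry the predicate y > 0, matching the assume(y > 0) placed on edge (D1, E0) in Figure 3.40.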
Figure 3.38: Reverse graph of the Program P7 in Figure 3.30.
Figure 3.39: An example of Product computation for backward analysis.
Example. We now illustrate the workings of the above algorithm using
the program P7 of Figure 3.30. Figure 3.38 shows the reverse graph of the
program P7 as computed at Line 1. We now compute the product of this
reverse graph and the Split Automaton of Figure 3.36.
Figure 3.39 shows the start state of the product automaton. It is com-
posed of the start state of the reverse graph and the start state of the Split
Automaton. Notice that the start state of the reverse graph is the end node
of the original CFG.
The Split Automaton remains in state 0 on following the edge (G, end).
Thus, as seen in Figure 3.39(b), a new state G0 is created and the transition
(end0, G0) is added to the product automaton. In a similar manner, states
E0 and F0 are created, as seen in Figure 3.39(c). For clarity, we are not
Figure 3.40: An example of Product computation for backward analysis (contd.).
showing the trivial assume(T) statements on the edges.
Consider the state E0 in the product automaton. The Split Automaton
is in state 0 and the CFG is in node E. On seeing the edge (D, E), the split
automaton transitions to state 1 and the CFG moves to node D. The
predicate y > 0 is used to create the assume(y > 0) statement on
edge (D1, E0) in the product automaton. Further, state D1 transitions to
state C1. A similar situation occurs for the state F0, as illustrated in
Figure 3.40(d).
In state 1, when the automaton sees the edge (B, D), it transitions to state
0 and the corresponding predicate is y > 0. Thus, we see the transition from
state D1 to state B0 in the product automaton with the assume statement
assume(y > 0) added on the edge. The same logic applies to state D2 of the
automaton when it sees the edge (B, D). This is seen in Figure 3.41(e).
Finally, Figure 3.42(f) shows the completed product automaton. Notice
that in this CFG each edge is in the reverse direction.
Figure 3.43(g) shows the reverse graph of the graph in Figure 3.42(f).
This is the final Split Graph returned by the algorithm in Figure 3.37.
□
The remaining step in the approach for backward analysis is that of converting
the assume statements back to the corresponding if statements. The
algorithm is described in Figure 3.44. The algorithm takes as input a node n
whose outgoing edges have assume statements. The goal is to replace these
assume statements with the corresponding if statements. The trivial case is when
Figure 3.41: An example of Product computation for backward analysis (contd.).
Figure 3.42: An example of Product computation for backward analysis (contd.).
Figure 3.43: The reverse graph of the CFG in Figure 3.42.
there is a single outgoing edge for node n. In this case, we can simply remove
the assume statement as shown in Line 3.
The main loop of the algorithm iterates over all the predicates present in
the assume statements until all assume statements have been replaced. At
Line 7, the algorithm looks for two edges (n, m1) and (n, m2) labelled with
assume statements which are the same except that (n, m1) has an assume(p)
while (n, m2) has an assume(!p), i.e. the assume statements differ only in
the predicate p. Though this might seem an expensive check at first glance,
note that since we are splitting at most k destructive merges, there can only
be k predicates. Furthermore, the assume statements can be represented in
the classic bit-vector notation, i.e. if an edge has an assume(pi) then bit i
in the vector is set, else it is reset. Using this notation, the check at Line 7
translates to finding two bit-vectors which differ in exactly one place, by
simply xor-ing the bit-vectors.
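As a small illustration of this bit-vector trick, assuming the assume sets are encoded as Python integer masks (bit i set iff assume(pi) is on the edge):

def differ_in_one(mask1, mask2):
    # The two assume sets differ in exactly one predicate iff the xor of
    # the two bit-vectors is a (non-zero) power of two.
    x = mask1 ^ mask2
    return x != 0 and (x & (x - 1)) == 0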
Having found two such edges, a new If node is created and placed into
the CFG, as seen in Line 9. Care is taken to connect the true (T) and false
(F) edges of the new If node depending on the assume statements.
Finally, at Line 10, the remaining assume statements are placed on the
edge between the original node n and the new If node n1. The algorithm
ConvertAssumes(n)
// Node n has outgoing edges with assume statements.
1 While assume statements exist
  do // If n has only one outgoing edge
2     If |succ(n)| = 1
      then // Simply remove the assume statement.
3         Edge (n, succ(n)) ← ∅
4         Return
5     Foreach p ∈ Predicates
      do // For all pairs of outgoing edges of the given node n.
6         Forall edges ( (n, m1), (n, m2) )
          do // If the assume statements differ only in predicate p.
7             If (n, m1) = assume(p) ∧ assume(q)
              and (n, m2) = assume(!p) ∧ assume(q)
              then
8                 Make new node n1 = If(p).
9                 Create new edges (n → n1), (n1 →T m1), (n1 →F m2).
                  // Add the remaining assume statements to the new edge.
10                Edge (n → n1) ← assume(q).
Figure 3.44: Algorithm to convert Assume statements to If statements for
node n.
Figure 3.45: Converting the assume statements on the outgoing edges of node n into If nodes.
continues until no more assume statements remain. The algorithm is sure to
terminate since at each stage one predicate p is eliminated.
Example. Figure 3.45(a) shows the original node n which has assume
statements on its outgoing edges. We see that edges (n, m1) and (n, m2)
satisfy the check at Line 7 of the algorithm. Figure 3.45(b) shows the resulting
program after the assume(p) is replaced with the If(p) node n1. Since
edge (n, m1) had the assume(p), edge (n1, m1) is the true branch of the If.
On the other hand, since edge (n, m2) had the assume(!p), edge (n1, m2) is
the false branch of the If.
In a similar manner, edges (n, m3) and (n, m4) in Figure 3.45(b) are
replaced with the If node n2 in Figure 3.46(c). Finally, edges (n, n1) and (n, n2)
are chosen and the assume(Q) is replaced by the If(Q) node n3. In this way,
all the assume statements are replaced by the corresponding If statements,
as shown in Figure 3.46(d).
□
Figure 3.46: Converting the assume statements on the outgoing edges of node n into If nodes (contd.).
Example. The algorithm is also used to convert the assume statements in
Figure 3.31 into If nodes as shown in Figure 3.32.
□
Using Slicing
Since the Hoist Region affects the extent to which our technique is applicable,
we use a restricted form of static backward slicing to hoist not only the
evaluation of the predicate p, but also the other statements in the backward
slice of p. The backward slice is restricted in the sense that we only consider
the part of the backward slice which is present in the same basic block
as the predicate evaluation p. This implies that only straight-line code
which does not contain complex control-flow is hoisted. Though theoretically
the extension for hoisting more complex backward slices is possible, the
implementation becomes unnecessarily complex. This is similar in flavour to
the slicing transformation described in [5]. The main difference is that our
method is extremely simple and clean: instead of placing just the predicates
on the edges of the Split Automaton, we place the restricted backward slice.
Thus, instead of placing only assume statements, we also have to
place this restricted backward slice.
Thus, the anticipability analysis which is carried out has to be done on
the upward-exposed expressions [15] in the backward slice of the predicate
p.
Example. Consider the basic block
1. b = c;
2. z = 2;
3. y = a + b;
4. z = z + 1;
5. If(y > 0)
The predicate is y > 0. The backward slice for the expression y > 0
contains the statements
1. b = c;
3. y = a + b;
5. If(y > 0)
The expressions which are upward exposed are a and c. y and b are not
upward exposed since there are definitions of y and b in the same basic block.
Note that since y is modified in the same basic block, it is not possible
to hoist the evaluation of the predicate y > 0 over this basic block. Using
the backward slice of the predicate and hoisting all the statements in it, the
hoist region increases in size. Thus, the scope for optimization increases.
□
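A rough Python sketch of this upward-exposed computation on the restricted, single-block slice; statements are assumed to be encoded as (defined variable, set of used variables) pairs in program order, with None for the predicate's defined variable:

def upward_exposed(slice_stmts):
    # A variable is upward exposed if it is used before any definition
    # of it within the slice.
    defined, exposed = set(), set()
    for d, uses in slice_stmts:
        exposed |= {u for u in uses if u not in defined}
        if d is not None:
            defined.add(d)
    return exposed

# The example slice: b = c; y = a + b; If(y > 0)
stmts = [("b", {"c"}), ("y", {"a", "b"}), (None, {"y"})]
assert upward_exposed(stmts) == {"a", "c"}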
Multiple Destructive Merges
Similar to forward analysis, it is easy to extend the above approach to
eliminate multiple destructive merges. Consider the set of destructive merges
M = {m1, m2, . . . , mk} which are to be eliminated. Let A1, A2, . . . , Ak be
the corresponding Split Automata (Definition 44).
Definition 46. (Split Graph for Multiple Destructive Merges
for Backward Analysis) Given a CFG P and a set of Split Automata
A1, A2, . . . , Ak, we define the Split Graph S to be the reverse graph of
Pr × A1 × A2 × . . . × Ak, where × is the product automaton operator [13].
Additionally, whenever a Split Automaton transitions from state i to
state j on edge a, add an assume(p) on the corresponding edge in the Split
Graph, where p = predicate(i, a).
It is simple to extend the algorithm described in Figure 3.37 to handle
multiple destructive merges. The efficacy and equivalence results also hold
when eliminating multiple destructive merges; the proofs have the same
flavour as those for forward analysis. As for forward analysis, our approach for
backward analysis does not restrict the structure of the CFG. The approach
seamlessly handles loops as well.
Chapter 4
Tradeoff
Economy is not how little one can spend,
but how wisely one can spend it.
Origin unknown
I think you can achieve anything, if you are
willing to pay the price
Vince Lombardi
The discussion till now has not dealt with the trade-off between the data-flow
precision achieved and the increase in code size due to the restructuring
inherent to this approach. We have seen that the benefit of eliminating a
destructive merge m is related to the number of nodes influenced by node m.
Also, the size of the Region of Influence of the destructive merge determines
the resulting cost in terms of increased code size. This trade-off thus depends
directly on the set of destructive merges we choose to eliminate to form the
Split Graph: we need to pick the set of destructive merges which maximizes
the benefit in terms of data-flow precision, for a given cost in terms of code
size. In this chapter, we prove that this problem is NP-Hard. We then
proceed to describe certain heuristics for the same.
4.1 Theoretical Analysis
Before we move to the NP-Hardness result, we define the notion of independence
among destructive merges.
Figure 4.1: Schematic diagram illustrating two independent destructive
merges.
Definition 47. (Independent Destructive Merges) Two destructive
merges m1 and m2 are said to be independent iff
RoI(m1) ∩ RoI(m2) = ∅.
Example. Consider Figure 4.1. Destructive merges m1 and m2 are independent
since their respective Regions of Influence do not share any node in
common.
□
Independent destructive merges are independent in the sense that we can
analyze the code size increase due to restructuring for each merge separately,
and combine the results using simple addition. This notion is formalised as
the Additive Property of independent destructive merges.
Theorem 6. (Additive Property) Consider a CFG G and two independent
destructive merges m1 and m2 with their respective split automata
A1 and A2. Then
|N(G)| + |N(G × A1 × A2)| = |N(G × A1)| + |N(G × A2)|.
Figure 4.2: Schematic diagram illustrating the LHS of Equation 4.1.
(a) CFG G, (b) Split Graph S := G × A1 × A2.
Figure 4.3: Schematic diagram illustrating the RHS of Equation 4.1.
(a) Split Graph S1 := G × A1, (b) Split Graph S2 := G × A2.
Proof. Consider the equation
|N(G)| + |N(G × A1 × A2)| = |N(G × A1)| + |N(G × A2)|. (4.1)
Let
S := G × A1 × A2,
S1 := G × A1,
S2 := G × A2,
R1 := RoI(m1),
R2 := RoI(m2).
Since the two destructive merges are independent, in Split Graph S1 no
node in R2 will be split, while the nodes in R1 will be duplicated to form the
nodes R′1. Similarly, in Split Graph S2 no node in R1 will be split, while the
nodes in R2 will be duplicated to form R′2. This is illustrated in Figure 4.3.
Further, in Split Graph S the nodes in R1 and R2 will be duplicated, due to
the elimination of destructive merges m1 and m2 respectively, to form the
same sets of nodes R′1 and R′2, as shown in Figure 4.2. Let C represent those
nodes which lie outside both R1 and R2 and hence are unaffected by any
restructuring.
|N(G)| = |C| + |R1| + |R2| (4.2)
|N(G × A1 × A2)| = |C| + |R′1| + |R′2| (4.3)
Using Equations 4.2 and 4.3, the LHS of Equation 4.1 can be written as
|N(G)| + |N(G × A1 × A2)| = (|C| + |R1| + |R2|) + (|C| + |R′1| + |R′2|). (4.4)
Similarly, we have
|N(G × A1)| = |C| + |R′1| + |R2| (4.5)
|N(G × A2)| = |C| + |R1| + |R′2| (4.6)
Using Equations 4.5 and 4.6, the RHS of Equation 4.1 can be written as
|N(G × A1)| + |N(G × A2)| = (|C| + |R′1| + |R2|) + (|C| + |R1| + |R′2|). (4.7)
Figure 4.4: Schematic diagram illustrating the sufficiency condition for independent
destructive merges.
Rearranging Equation 4.7, we get
|N(G × A1)| + |N(G × A2)| = (|C| + |R1| + |R2|) + (|C| + |R′1| + |R′2|). (4.8)
From Equations 4.4 and 4.8, we see that
|N(G)| + |N(G × A1 × A2)| = |N(G × A1)| + |N(G × A2)|.
This proves the Additive Property for independent destructive merges.
What independent destructive merges allow us to do is to calculate the
increase in code size caused by restructuring for each destructive merge
separately; by simply adding these up we get the total increase in code size
due to all the destructive merges used together.
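As a quick sanity check of the Additive Property, consider hypothetical sizes: |C| = 10 nodes outside both regions, R1 growing from 4 to 7 nodes when split, and R2 growing from 3 to 5. A few lines of Python confirm that both sides of Equation 4.1 agree:

# Hypothetical sizes for illustration only.
C, R1, R1s, R2, R2s = 10, 4, 7, 3, 5
lhs = (C + R1 + R2) + (C + R1s + R2s)   # |N(G)| + |N(G x A1 x A2)|
rhs = (C + R1s + R2) + (C + R1 + R2s)   # |N(G x A1)| + |N(G x A2)|
assert lhs == rhs == 39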
We now state a sufficient, but not necessary, condition for two destructive
merges to be independent.
Lemma 5. (Sufficient Condition for Independence) Two destructive
merges m1 and m2 are independent if
reachable nodes(m1) ∩ reachable nodes(m2) = {end}.
Proof. From the definition of the Region of Influence of a destructive merge
(Definition 28) it follows that
RoI(m1) ⊆ reachable nodes(m1) (4.9)
RoI(m2) ⊆ reachable nodes(m2) (4.10)
We are given that
reachable nodes(m1) ∩ reachable nodes(m2) = {end}. (4.11)
Note that the end node cannot belong to the Region of Influence: since
it has no statements, it cannot be influenced by any destructive merge, and
since it has no successors it cannot reach any influenced nodes.
Thus, using Equations 4.9 and 4.10 in Equation 4.11,
RoI(m1) ∩ RoI(m2) = ∅.
This implies that destructive merges m1 and m2 are independent.
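Under the set encoding of the earlier sketches, the sufficient condition of Lemma 5 is a one-line check; reach1 and reach2 stand for reachable nodes(m1) and reachable nodes(m2) and are assumed to be precomputed:

def independent_by_lemma5(reach1, reach2, end):
    # Sufficient (not necessary) condition: the two reachable sets
    # intersect only in the end node.
    return reach1 & reach2 == {end}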
Definition 48. The problem Split is defined by the triple (P, A, C) where:
P is a program,
A is a set of split automata corresponding to the various destructive
merges, and
C is the maximum increase in code size that is permitted due to restructuring.
A solution to Split is a subset B ⊆ A such that applying B to the program
P does not increase the code size by more than C and which maximizes the
number of influenced nodes which can be optimized in the resulting program
P′.
Theorem 7. (Split Theorem) Split is NP-Hard.
Proof. We shall reduce Knapsack [17] to it. We are given a set I of n
items, each item i having a specific weight wi and a profit pi. The goal of
Knapsack is to pick a subset J ⊆ I of the items so as to maximize the total
profit, subject to the condition that the total weight is less than a specified
weight W.
Intuitively, in our reduction, picking an item i in Knapsack will correspond
to selecting a split automaton in the solution of Split. Thus, we construct
a program P in which, for each item i in Knapsack, there exists a destructive
merge Di and a split automaton ai so that |influenced nodes(Di)| = pi
and |RoI(Di)| = wi.
Figure 4.5: The program fragment Pi constructed corresponding to the
Knapsack item i. Di is a destructive merge. The number of influenced nodes
is pi and the size of the region of influence is wi.
Figure 4.6: The structure of program P constructed from the Knapsack
instance. Program P is composed of fragments Pi as shown in Figure 4.5.
Such a program fragment Pi corresponding to an item i is shown in Figure 4.5.
Further, the profits and costs of items in Knapsack are independent of
each other, i.e. the cost of picking an item i does not depend on whether
item j has been placed in the knapsack. To ensure this, we construct
program P so that any two destructive merges Di and Dj are
independent (Definition 47). The structure of program P is illustrated in
Figure 4.6. It is easy to see that for any two destructive merges Di and Dj,
reachable nodes(Di) ∩ reachable nodes(Dj) = {end}.
Thus, using the sufficiency condition for independence (Lemma 5), any two
destructive merges Di and Dj are independent.
The profit pi obtained by picking the item i is mapped to the number
of nodes which are influenced by the destructive merge Di in P. Also, the
weight of the item i is mapped to the size of the Region of Influence of
destructive merge Di. Thus, the weight of item i corresponds to the increase
in code size which occurs when the corresponding split automaton ai is applied.
The constraint on the total weight W of the knapsack is mapped to the
increase in code size which we are allowed in Split. Using the Additive
Property (Theorem 6) for independent destructive merges, the total increase
in code size is the sum of the increases in code size due to the individual
destructive merges. In particular, the increase in code size due to destructive
merge Di is |RoI(Di)|.
It can be shown that split automaton ai is in the optimal solution of
Split if and only if item i is selected in the optimal solution of Knapsack.
It is interesting to note that this hardness result does not rely on the
complexity of the underlying data-flow analysis used, since we are already
given the set of influenced nodes and the Region of Influence for each destructive
merge. Further, the program P does not even contain any loops, and is
acyclic. Thus, restricting the problem Split any further does not result in a
computationally tractable problem.
Example. Consider an instance of Knapsack with three items with the
weights and profits as follows:
w0 = 5, p0 = 2,
w1 = 3, p1 = 1,
w2 = 5, p2 = 3.
Figure 4.7: Program constructed for Constant Propagation corresponding to
a particular instance of Knapsack. D0, D1 and D2 are destructive merges.
Fitness(m)
1 profit ← count(m) × |influenced nodes(m)|
2 cost ← |RoI(m)|
3 fitness ← profit / cost
4 return fitness
Figure 4.8: Computing the fitness of a destructive merge.
The corresponding program constructed is shown in Figure 4.7. We
see that, corresponding to each item i in Knapsack, we have a destructive
merge Di with
|RoI(D0)| = 5, |influenced nodes(D0)| = 2,
|RoI(D1)| = 3, |influenced nodes(D1)| = 1,
|RoI(D2)| = 5, |influenced nodes(D2)| = 3.
Further, destructive merges D0, D1 and D2 are all independent of each other.
□
4.2 Heuristic Solution
This leads us to devise an aggressive greedy heuristic to solve this problem.
Our approach is based on estimating the benefit obtained and the cost incurred
by eliminating a destructive merge. In the absence of profile information, we
define the fitness of a destructive merge m to be
fitness(m) = |influenced nodes(m)| / |RoI(m)|.
Otherwise, we can make use of a low-cost basic-block profile to estimate the
potential run-time benefit of eliminating a destructive merge. Let count(m)
be the number of times the destructive merge was executed in the profile
run. We then define the fitness to be
fitness(m) = count(m) × |influenced nodes(m)| / |RoI(m)|.
In this way, frequently executed destructive merges are more likely to be
eliminated, and our approach can concentrate on the hot regions of code.
Finally, we choose the k fittest destructive merges to be eliminated. It should
be noted that while this heuristic method does not guarantee that the code
size increase is within some bound (C), it works well in practice.
Figure 4.8 describes the algorithm for computing the fitness of a destructive
merge m. Note that this algorithm applies to backward analysis as well.
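A minimal Python sketch of this greedy selection, assuming influenced and roi map each destructive merge to the corresponding node sets, and count (if available) maps it to its basic-block profile frequency; all argument names are illustrative:

def pick_fittest(merges, influenced, roi, count=None, k=5):
    def fitness(m):
        # Without a profile, fall back to the static fitness ratio.
        c = count[m] if count else 1
        return c * len(influenced[m]) / len(roi[m])
    # Keep the k fittest destructive merges for elimination.
    return sorted(merges, key=fitness, reverse=True)[:k]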
Chapter 5
Related Work
Seek not to follow in the footsteps of wise men,
Seek what they sought.
Basho
5.1 Hot Path Graph Approach
An earlier proposal by Ammons and Larus [2] uses an acyclic path profile to
try and improve the precision of the data-flow solution along hot paths. The
approach consists of first using a Ball-Larus path profile [3] to determine the
hot acyclic paths in the program. The next step in [2] consists of constructing
a new CFG, called the Hot Path Graph (HPG), in which each hot path is
duplicated. This duplication is carried out in such a way that data-flow facts
along a hot acyclic path do not get destroyed due to merging with facts
along other overlapping acyclic paths. Conventional data-flow analysis is
then carried out on the HPG. This approach relies on the assumption that
removing control-flow merges along hot acyclic paths improves the precision of
data-flow analysis on hot paths.
Consider the example code in Figure 5.1(a). Assume a path profile as
shown in Table 5.1. Figure 5.1(b) shows the resulting HPG constructed
assuming 100% coverage, i.e. all taken paths are considered. Notice that in
the HPG there are no control-flow merges along any of the acyclic paths listed
in Table 5.1. For example, the two overlapping acyclic paths B → C → E
and B → D → E in Figure 5.1(a) are separated into two separate paths
B1 → C2 → E2 and B1 → D3 → E3 in the HPG. After performing
conventional data-flow analysis on the HPG, the use of the variable a at
node B0 can be replaced by the constant 0. However, the restructuring
failed to optimize the two hot paths B → C → E and B → D → E, and
Figure 5.1: (a) Node E is a destructive merge. Use of variable a at node B
cannot be replaced with a constant. (b) The Hot Path Graph corresponding
to program P1. Node B1 is a destructive merge, and the use of variable a still
cannot be replaced by a constant.
Ball-Larus Acyclic Path | Frequency
A → B → C → E | 10
B → C → E | 60
B → D → E | 20
B → D → E → F | 10
Table 5.1: A path profile for the example in Figure 5.1. The frequency of
an acyclic path denotes the number of times it was taken at run-time.
Figure 5.2: The Split Graph constructed from program P1 in which the uses
of variable a at nodes B0, B1 and B2 can be replaced by constants 0, 1 and
2 respectively.
could not replace the use at node B1 with a constant value. The destructive
merge E in the original CFG is removed in the HPG by duplicating code and
creating two copies, E2 and E3. But the effect of the destructive merge has
shifted to node B1, which is now a destructive merge since the data-flow facts
a = 1 and a = 2 flowing along the incoming edges are merged at node B1.
Thus, we see that simply duplicating acyclic paths does not always guarantee
an increase in data-flow precision. Also, concentrating only on acyclic paths
implies that all loop-back edges ((E2, B1) and (E3, B1) in the HPG) merge at
a common loop-header (node B1 in the HPG). Thus, loop-headers which
are destructive merges cannot be eliminated by the Ammons-Larus approach,
and data-flow precision is lost in these cases.
In comparison, the Split Graph constructed by our approach is shown in
Figure 5.2. The destructive merge at node E is completely eliminated. In the
Split Graph, uses of variable a at nodes B0, B1 and B2 can be replaced by
constants 0, 1 and 2 respectively. Thus, we see that our approach effectively
handles loop structures, guarantees additional optimization opportunities,
and does not rely on expensive path profile information.¹ We compare the
HPG method with our approach quantitatively in Chapter 6.
¹ For constructing the HPG, the Ammons-Larus approach relies on path profile
information, which is relatively more expensive than the simple basic-block profile used in
our Split Graph construction.
5.2 Other related approaches
In [7], an approach for complete removal of partial redundancy is described.
Data-flow analysis is used to identify those regions of code which obstruct
code motion. Code duplication and code motion are then used to eliminate
the partial redundancy. Another approach targeted at PRE is discussed
in [23]. Our approach is applicable to a more general class of data-flow
problems as compared to these.
Code restructuring need not necessarily be limited to within a procedure.
An extension of the Ammons-Larus approach to the interprocedural case
is described in [14]. A more recent framework [24] for whole-program optimization
also considers code duplication to perform area specialization, which
is purely profile-driven.
There have also been several other approaches which do not restructure
the CFG in order to improve data-flow analysis precision. Holley and Rosen
presented a general approach to improve data-flow precision by adding a finite
set of predicates [12]. In [6], the precision of def-use analysis is improved by
determining infeasible paths using a low-overhead technique based on the
detection of static branch correlations. Interestingly, path-sensitivity can also
be obtained by synthesizing the name space of the data-flow analysis [4].
Property simulation is introduced in ESP [8] and is used to verify temporal
safety properties. This approach keeps track of the correlation between the
property state and certain execution states. In [9], data-flow analysis is
performed over a predicated lattice. The predicates used are determined
automatically using a counterexample refinement technique. In [10], the context-sensitivity
of the pointer analysis is adjusted based on the requirements of
the client application. These approaches are complementary to the approach
described in this thesis.
Chapter 6
Experimental Results
I expected results.
Bernie Ebbers, former CEO, WorldCom
In this chapter, we evaluate our approach for improving data-flow analysis
precision. We have instantiated our framework for Constant Propagation and
Liveness Analysis, and have implemented our approach in the Scale research
compiler framework [21]. The framework is parameterised with the definition
of a destructive merge, which depends on the data-flow analysis used, and
with the definition of influenced nodes, which captures the interaction between
the specific optimization and the analysis.
6.1 Forward Analysis
We present experimental results for the specific problem of Constant Propagation [1].
We compare our approach (Split) with the Wegman-Zadeck conditional
constant propagation algorithm [25] (Base) and the Hot Path Graph
approach (HPG) [2] using the SPECINT 2000 benchmark suite [22].
6.1.1 Benefits of Split Approach
We instantiate the constant propagation phase of the O1 pass of the Scale
compiler with the default approach (Base), the HPG approach, and Split.
The HPG approach uses a path profile generated using the train inputs for
the respective programs, while our Split approach uses a basic block
profile from the same train inputs. The benchmarks are compiled for DEC
ALPHA and were run on a 500MHz 21264 Alpha workstation. Running
times were measured as averages over multiple runs using the larger ref inputs.
Benchmark | % speedup of Split over Base | % speedup of Split over HPG
175.vpr | 5 | 1
186.crafty | -2 | 2
197.parser | 2 | 3
256.bzip2 | 0 | 3
300.twolf | 3 | -2
181.mcf | 12 | 4
164.gzip | 3 | 2
average | 4 | 2
Table 6.1: Percentage speedup in the running times using Split in comparison
to Base and HPG.
Benchmark | Split / HPG
175.vpr | 1.15
186.crafty | 1.10
197.parser | 1.27
256.bzip2 | 1.11
300.twolf | 0.93
181.mcf | 13.75
164.gzip | 2.32
average | 3.5
Table 6.2: Ratio of the number of dynamic instructions with constant uses
in Split over HPG.
Table 6.1 shows the speedup obtained by our Split approach over the
Base approach and over the HPG approach. Split gives an average speedup
of 4% over the Base case, and an average speedup of 2% over the
HPG approach.
To understand where the speedup comes from, we calculate the number
of dynamic instructions which have constant uses introduced by the restructuring
transformation. This is computed by first performing constant propagation
and replacing all constant uses in the original program. Restructuring
(HPG or Split) is then carried out. The constant uses then discovered can be
attributed only to the restructuring. Thus, each instruction is weighted by the
product of its execution count (using the ref inputs) and the number of new
constant uses. The sum over all instructions gives us the number of dynamic
instructions which have constant uses only because of the restructuring. This
metric has also been used in [2]. Table 6.2 shows the ratio of these instruc-
Benchmark | Split / Base | Split / HPG
175.vpr | 1.5 | 0.7
186.crafty | 2.0 | 0.7
197.parser | 1.8 | 1.1
256.bzip2 | 1.9 | 0.5
300.twolf | 2.0 | 0.7
181.mcf | 1.9 | 1.0
164.gzip | 1.5 | 0.8
average | 1.8 | 0.65
Table 6.3: Ratio of code size increase of Split over Base, and of Split over
HPG.
tions for Split over that of HPG. We observe an average of 3.5 times more
dynamic instructions with constant uses in Split as compared to HPG. For
181.mcf, Split results in as many as 13.75 times more dynamic constant-use
instructions. This is because Split can handle cyclic structures effectively.
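The metric itself is just a weighted sum. A sketch, assuming each instruction has been summarised as an (execution count, newly discovered constant uses) pair:

def dynamic_constant_uses(instructions):
    # Each instruction contributes its ref-input execution count times
    # the number of constant uses attributable only to restructuring.
    return sum(n * u for n, u in instructions)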
6.1.2 Cost of Split Approach
As mentioned earlier, the increase in precision comes at the cost of code
duplication. We measured the code size in terms of the number of Scale
intermediate instructions. Table 6.3 shows the ratio of the code size of Split
over that of Base. We observe an average increase of 1.8x (80%) due to
Split. The table also shows the ratio of the code size of Split over that of HPG.
We notice that Split incurs a smaller code size increase in comparison to HPG:
Split shows an average ratio of 0.65 (a 35% decrease) in code size as compared to
HPG.
6.2 Backward Analysis
We present experimental results for the specific problem of Liveness Analysis [1].
The experimental methodology used is the same as that for Constant
Propagation. As before, we compare our approach, Split, with the baseline
Base.
6.2.1 Benefits of Split Approach
We measure the benefits of our approach in terms of the percentage speedup
obtained in comparison with Base.
Benchmark | % speedup of Split over Base
175.vpr | 2
186.crafty | 0
197.parser | 1
256.bzip2 | 0
300.twolf | 1
181.mcf | 0
164.gzip | 2
average | 0.8
Table 6.4: Percentage speedup in the running times using Split in comparison
to Base.
Benchmark | Split / Base
175.vpr | 1.1
186.crafty | 1.2
197.parser | 1.1
256.bzip2 | 1.1
300.twolf | 1.3
181.mcf | 1.15
164.gzip | 1.2
average | 1.15
Table 6.5: Ratio of code size increase of Split over Base.
Table 6.4 shows the percentage speedup obtained. Improving the precision
of liveness analysis causes code to become dead. We noticed that a
majority of the code which became dead was not in loops. This is reflected
in the lack of speedup we get for most benchmarks, such as 186.crafty. On
average, we get a speedup of 0.8%.
6.2.2 Cost of Split Approach
As before, we measure the cost of the Split approach in terms of the increase
in code size due to code duplication.
Table 6.5 shows the increase in code size due to our code duplication. Due to
the nature of the backward algorithm, the size of the Region of Influence is
comparatively small. This is evidenced in the comparatively small increase in
code size: on average we see a 15% increase in code size.
Chapter 7
Conclusions and Future
Directions
A conclusion is the place where you get tired of
thinking.
Arthur McBride Bloch
In this day, a man who says that something
cannot be done, is apt to be interrupted by some
idiot doing it.
Elbert Hubbard
We proposed a general framework to improve data-flow analysis precision
based on restructuring the CFG of the program. The framework can
be instantiated for any data-flow analysis. The actual transformation uses the
known concept of the product automaton. We have proved that the transformation
guarantees an increase in optimization opportunities. Further, we showed
that finding the optimal restructuring is NP-hard, and proposed and evaluated
a greedy heuristic. Our results indicate that our approach performs
better than the existing path-profile-driven approach [2].
7.1 Future Directions
7.1.1 Interprocedural Analysis
Our technique is currently intra-procedural. It would be
worthwhile to explore an inter-procedural extension to our approach along
the lines of [14, 24], one which retains the simplicity and guarantees of our
approach.
Figure 7.1: Figure illustrating a demand-driven version of our approach.
7.1.2 Points-to Analysis
Though our technique is, as is, applicable to points-to analysis, there are some
interesting issues which arise when trying to make points-to analysis path-sensitive.
Specifically, the problem of estimating the benefit of improving
precision at a program point for a given variable is not easy, since the benefit
depends a lot on the client of the analysis.
7.1.3 Demand-driven Analysis
Currently, we target destructive merges where information is lost. We could
also be given program statements at which we want better precision and
carry out the restructuring in a demand-driven fashion. For example,
by following the use-def chains of the variables at a statement, we can trace
back to the particular destructive merge which was responsible for the loss
in precision. Further, the only influenced node for this destructive merge
would be the one provided. The rest follows as described.
Example. Figure 7.1 illustrates this concept. Suppose we are told to
optimize node H : y = a;. By following the use-def chains H → G → D,
we realise that the destructive merge D needs to be targeted. Setting the
Figure 7.2: One possible way of handling restructuring for multiple analyses.
Figure 7.3: A composite approach to handle multiple analyses.
influenced nodes of D to the single node H, we can carry out our restructuring
to optimize H. Note that even though node J is also influenced by
destructive merge D, we do not include it in the influenced nodes.
□
7.1.4 Combined Restructuring for Multiple Analysis
Our approach can also be extended to handle multiple data-flow analyses
in the same pass: for example, restructuring the control-flow graph only
once to improve the precision of both constant propagation and availability
analysis, as shown in Figure 7.3, as opposed to a separate pass for each
analysis, as shown in Figure 7.2.
To achieve this, we have to define the notion of a composite destructive
merge taking into consideration both of the analyses. This can be done in the
following two ways.
Definition 49. (And-Composite Destructive Merge) Given two analyses
A1 and A2, a control-flow merge m is said to be an and-composite destructive
merge if it is a destructive merge in A1 and in A2.
Alternatively, we could define a composite destructive merge as:
Definition 50. (Or-Composite Destructive Merge) Given two analyses
A1 and A2, a control-flow merge m is said to be an or-composite destructive
merge if it is a destructive merge in A1 or in A2.
Figure 7.4: Program P2 used to illustrate composite destructive merge for
constant propagation and availability analyses.
Figure 7.5: Program P1 used to illustrate composite destructive merge for
constant propagation and availability analyses.
102
Example. Consider constant propagation and availability analyses. In Figure
7.4, merge D is not an and-composite destructive merge, since it is not
a destructive merge for availability analysis. But in Figure 7.5, merge D is
an and-composite destructive merge.
Note that in both the figures, merge D is an or-composite destructive
merge.
□
In a similar vein, we need to define the notion of an influenced node in
this composite analysis.
Definition 51. (And-Influenced Node) Given a composite destructive
merge m, a node n is said to be an and-influenced node if it is an influenced
node according to analysis A1 and analysis A2.
Definition 52. (Or-Influenced Node) Given a composite destructive merge
m, a node n is said to be an or-influenced node if it is an influenced node
according to analysis A1 or analysis A2.
Having defined composite destructive merges and influenced nodes, we
can perform the restructuring as described earlier by constructing a split
automaton and performing a product. If we consider and-composite destructive
merges and and-influenced nodes, then all the influenced nodes will
be optimized in the restructured program. This can be seen as a generalisation
of the Efficacy Theorem to this composite restructuring.
It is not clear whether such a combined restructuring will always
give the same optimized program as the one obtained by performing separate
passes. This is because the increase in precision of one analysis might result in
a different restructuring for the other. More experimental and analytical
evaluation needs to be carried out to better understand this.
7.2 Conclusions
In this thesis, we studied the problem of improving the precision of data-flow
analysis. The approach we took was to restructure the control-flow graph
of the program. Having studied the related literature, we found no uniform
approach that tackles all kinds of data-flow analyses commonly used in compiler
optimizations. We believe that the framework developed in this thesis conforms
to the simplicity and generality constraints we imposed. We have
shown how to apply it to any forward or backward analysis; specifically,
we illustrated our approach using constant propagation and liveness analysis.
Our claim of simplicity stems from the machinery we used to tackle this
challenging problem. By using key ideas from automata theory, we could easily
express our approach and also rigorously prove useful properties. Specifically,
we used the product automaton to compute the restructured graph. We
have methodically proved that our restructuring transformation is correct, in
that it preserves the semantics of the original program, and that it is guaranteed
to increase optimization opportunities in the restructured program.
Interestingly, having developed a framework for forward analysis, extending
it to backward analysis posed interesting challenges. The approach described
minimally changes the basic approach applicable to forward analysis
and only appends the necessary changes.
The most difficult problem was to control the code explosion caused as a
result of the code duplication. The major result here was to prove that finding
the optimal restructured control-flow graph is NP-Hard. Furthermore, the
proof also showed that the problem could not be simplified in order to make it
tractable. This is the reason we developed a greedy algorithm. The algorithm
makes use of profile information, though it does not completely rely on it.
Finally, we have implemented the framework in the Scale research compiler.
Our results show promise, but as described in Section 7.1, there are a
lot of interesting problems yet to be solved.
Bibliography
[1] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques and Tools. Addison Wesley, 1986.
[2] Glenn Ammons and James R. Larus. Improving data-flow analysis with path profiles. In PLDI, pages 72-84, 1998.
[3] Thomas Ball and James R. Larus. Efficient path profiling. In International Symposium on Microarchitecture, pages 46-57, 1996.
[4] Rastislav Bodík and Sadun Anik. Path-sensitive value-flow analysis. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 237-251, 1998.
[5] Rastislav Bodík and Rajiv Gupta. Partial dead code elimination using slicing transformations. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 1997.
[6] Rastislav Bodík, Rajiv Gupta, and Mary Lou Soffa. Refining data flow information using infeasible paths. In M. Jazayeri and H. Schauer, editors, Proceedings of the Sixth European Software Engineering Conference (ESEC/FSE 97), pages 361-377. Springer-Verlag, 1997.
[7] Rastislav Bodík, Rajiv Gupta, and Mary Lou Soffa. Complete removal of redundant expressions. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 1-14, 1998.
[8] Manuvir Das, Sorin Lerner, and Mark Seigle. ESP: Path-sensitive program verification in polynomial time. In ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 57-68, 2002.
[9] Jeffrey Fischer, Ranjit Jhala, and Rupak Majumdar. Joining data flow with predicates. In Foundations of Software Engineering, pages 227-236, 2005.
[10] S. Guyer and C. Lin. Client-driven pointer analysis. In International Static Analysis Symposium, 2003.
[11] Matthew S. Hecht. Flow Analysis of Computer Programs. Elsevier Science Inc., New York, NY, USA, 1977.
[12] L. Howard Holley and Barry K. Rosen. Qualified data flow problems. IEEE Transactions on Software Engineering (TSE), 7(1):60-78, 1981.
[13] Dexter C. Kozen. Automata and Computability. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1997.
[14] David Melski and Thomas Reps. The interprocedural express-lane transformation. In Compiler Construction, 2003.
[15] Steven S. Muchnick. Advanced Compiler Design and Implementation. Morgan-Kaufmann, San Francisco, CA, 1997.
[16] Markus Müller-Olm and Oliver Rüthing. On the complexity of constant propagation. In ESOP '01, pages 190-205, London, UK, 2001. Springer-Verlag.
[17] Christos H. Papadimitriou. Computational Complexity. Addison Wesley, 1994.
[18] T. A. Proebsting. Proebsting's Law: Compiler Advances Double Computing Power Every 18 Years. https://research.microsoft.com/~toddpro/papers/law.htm, 1998.
[19] W. W. Pugh. Is Code Optimization (Research) Relevant?. http://www.cs.umd.edu/~pugh/IsCodeOptimizationRelevant.pdf.
[20] John H. Reif and Harry R. Lewis. Symbolic evaluation and the global value graph. In POPL '77, pages 104-118, New York, NY, USA, 1977. ACM Press.
[21] Scale. A scalable compiler for analytical experiments. www-ali.cs.umass.edu/Scale/, 2006.
[22] SPEC. Standard Performance Evaluation Corporation. http://www.spec.org.
[23] Bernhard Steffen. Property-oriented expansion. In Third Static Analysis Symposium, pages 22-41, 1996.
[24] S. Triantafyllis, M. J. Bridges, E. Raman, G. Ottoni, and D. I. August. A framework for unrestricted whole-program optimization. In PLDI, June 2006.
[25] Mark N. Wegman and F. Kenneth Zadeck. Constant propagation with conditional branches. In ACM Transactions on Programming Languages and Systems, pages 181-210, 1981.
Index
Additive Property, 80
And-Composite Destructive Merge, 101
And-Influenced Node, 103
Anticipatability Analysis, 16
Assume Statements, 12
CFG, see Control Flow Graph
CFG Equivalence, see Equivalence, of Control Flow Graphs
Constant propagation, 15
Control Flow Automaton, 22
Control Flow Graph, 9
Control Flow Paths, 10
Data-flow Analysis, 13
Data-flow Analysis Framework, 14
Data-flow Solution, 14
Destroyed Data-flow Facts, 16
Destroyed Data-flow Facts, Backward Analysis, 18
Destructive Merge, 16
Destructive Merge, Backward Analysis, 17
Deterministic Finite Automaton, 19
Edge Predicate, 12
Efficacy Theorem
  for a single Destructive Merge, 49
  for Multiple Destructive Merges, 54
Equivalence Theorem
  for a single Destructive Merge, 49
  for multiple Destructive Merges, 51
Equivalence, of Control Flow Graphs, 22
Fixed Point Solution, 15
Generalized Post-dominator Set, 10
Hoist Region, 62
in, 14
in dff, 14
Independent Destructive Merges, 80
  Additive Property, 80
  Sufficient Condition, 84
Influence Theorem, 35
Influenced Nodes
  for a single Destructive Merge, 34
  for multiple Destructive Merges, 38
Influenced Nodes, Realisable, 63
  for multiple Destructive Merges, 64
influenced nodes, see Influenced Nodes
Intersection Lemma, 20
Kill Edges
  Backward Analysis, 65
  Forward Analysis, 40
KNAPSACK problem, 85
Language of Finite Automaton, 19
Liveness Analysis, 15
Maximum Fixed Point Solution, 15
MFP, see Maximum Fixed Point Solution
Monotonicity Lemma, 53
Naïve solution, 26
NP-Hard, 79
Or-Composite Destructive Merge, 101
Or-Influenced Node, 103
out, 14
out dff, 14
PCP, see Post Correspondence Problem
Post Correspondence Problem, 36
Post-dominance Lemma, 41
pred, 10
Product Automaton, 19
Product Automaton, Generalized, 20
Reachable Nodes, 10
Reachable Nodes, Backward, 10
Region of Influence
  for a single Destructive Merge, 37
  for multiple Destructive Merges, 38, 64
Reverse Graph, 10
Revival Data-flow Facts, 34
Revival Edges
  Backward Analysis, 65
  Forward Analysis, 40
revival dff, see Revival Data-flow Facts
RoI, see Region of Influence
Simple Graph, 9
Split Automaton, 43
Split Automaton, Backward, 67
Split Graph
  for a single Destructive Merge, 45
  for multiple Destructive Merges, 51, 78
Split Graph, Backward
  for a single Destructive Merge, 68
SPLIT problem, 85
Split Region, 63
Split Theorem, 85
State Predicate, 67
Subsumption Lemma, 47
succ, 10
Useful Data-flow Facts, 33
useful dff, see Useful Data-flow Facts