Software Engineering
Testing Basics
CUHK Shenzhen
Pinjia He
1. What is Software Testing
2. Why Software Testing
3. Stages and Categories of Testing
4. Coverage Metrics
Pinjia He @ CUHK Shenzhen
What is Software Testing
Software testing is the process of evaluating and
verifying that a software product or application does
what it is supposed to do.
—— IBM
[Link]
Find as many bugs as possible.
Pinjia He @ CUHK Shenzhen
The Patriot Accident
• The Patriot missile air defense system
tracks and intercepts incoming missiles
• On February 25, 1991, a Patriot system
ignored an incoming Scud missile
• 28 soldiers died; 98 were injured
Pinjia He @ CUHK Shenzhen
History of Software Testing
• Debugging era
• Demonstration era
• Destruction era
• Evaluation era
• Prevention era
Pinjia He @ CUHK Shenzhen
Testing and Analysis
[Link]
Pinjia He @ CUHK Shenzhen
Testing Stages
Pinjia He @ CUHK Shenzhen
Creation of Test Harness
• Test driver
✓Apply tests to UUT, including setup & clean-up
• Test stub
✓Partial, temporary implementation of a component used
by UUT
✓Mock a missing component back fake data
Pinjia He @ CUHK Shenzhen
Unit Testing
• Testing individual units
• Goal: Confirm units are correct & meet intended
functionalities
Pinjia He @ CUHK Shenzhen
Unit Testing Example
Pinjia He @ CUHK Shenzhen
Test Execution
• Execute the tests
Pinjia He @ CUHK Shenzhen
Eight Rules of Testing [M. Fowler]
1. Make sure all tests are 5. Better to write and run
fully automatic and incomplete tests than not
check their own results run complete tests
2. A test suite is a powerful 6. Concentrate your tests on
bug detector that boundary conditions
reduces the time to find 7. Don’t forget to test raised
bugs exceptions when things
3. Run tests frequently – are expected to go wrong
every test at least once a 8. Do not let the fear that
day testing can’t catch all
4. When you get a bug bugs stop you from
report, start by writing a writing tests that will
unit test to expose the catch most bugs
bug
Pinjia He @ CUHK Shenzhen
Testing Stages
Pinjia He @ CUHK Shenzhen
Integration Testing
• Test groups of subsystem & eventually the entire
system
• Goal: Test interfaces between subsystems
Pinjia He @ CUHK Shenzhen
System Testing
• Test the entire system
• Goal: Determine if the system meets the requirements, both
functional and non-functional
Pinjia He @ CUHK Shenzhen
System Testing Stages
Pinjia He @ CUHK Shenzhen
Functional Testing
• Goal: Test functionality of system
System is treated as a black box
⚫ Test cases are design from requirements
Based on use cases
Alternative source: user manual
⚫ Test cases describe
Input data
Flow of events
Results to check
Pinjia He @ CUHK Shenzhen
Acceptance Testing
• Goal: Demonstrate system meets customer requirements
• Performed by the client, not by the developer
• Alpha test
Client uses the software at the developer’s site
Software used in controlled setting --- developers ready to fix bugs
⚫ Beta test
Conducted at client’s site (developer not present)
Software gets a realistic workout in target environments
Pinjia He @ CUHK Shenzhen
Independent Testing
• Hard for programmers to accept they made a mistake
Plus a vested interest in not finding mistakes
Often stick to the data that makes the program work
⚫ Designing and programming are constructive tasks
Testers must seek to break the software
⚫ Testing is done best by independent testers
Pinjia He @ CUHK Shenzhen
Testing Steps
Pinjia He @ CUHK Shenzhen
Testing Strategies
• Exhaustive testing: Check UUT for all possible inputs
Not feasible, even for trivial programs
⚫ Random testing: Select test data randomly & uniformly
⚫ Functional testing: Use requirements to decide test cases
⚫ Structural testing: Use design knowledge about system
structure, algorithms, data structures to determine test
cases that exercise a large portion of the code
Pinjia He @ CUHK Shenzhen
Testing Strategies
Pinjia He @ CUHK Shenzhen
Testing Steps
Pinjia He @ CUHK Shenzhen
Finding representative inputs
Pinjia He @ CUHK Shenzhen
Finding representative inputs
• Divide inputs into
equivalence classes
Each possible input belongs
to one equivalence class
Goal: some classes have
higher failure density
⚫ Choose tests for each
equivalence class
Pinjia He @ CUHK Shenzhen
Finding representative inputs
• Divide inputs into
equivalence classes
Each possible input belongs
to one equivalence class
Goal: some classes have
higher failure density
⚫ Choose tests for each
equivalence class
Pinjia He @ CUHK Shenzhen
Selecting representative values
• Once partitioned the inputs, need to select concrete values
for the tests for each equivalence class
• Input from a range of valid values
Below, within, and above the range
• Input from a discrete set of values
Valid and invalid discrete values
Instances of each subclass
Pinjia He @ CUHK Shenzhen
Boundary Testing
• Many errors occur at boundaries of the input domain
Overflows
Comparisons (‘<’ instead of ‘<=’, etc.)
Missing emptiness checks (e.g., collections)
Wrong number of iterations
Pinjia He @ CUHK Shenzhen
Combinatorial Testing
• Many input values: equiv. classes + boundary testing
• Testing all combinations leads to combinatorial explosion
• Reduce test cases to make effort feasible
Random selection
Semantic constraints
Combinatorial selection
Pinjia He @ CUHK Shenzhen
Semantic Constraints
Pinjia He @ CUHK Shenzhen
Pairwise Combinatorial Testing
• Focus on possible combinations of pairs of inputs
• Example: Consider a method with 4 Boolean parameters
Combinatorial testing requires 24 = 16 test cases
Pairwise-combinations testing requires 5 test cases:
TTTT, TFFF, FTFF, FFTF, FFFT
⚫ Can be generalized to k-tuples (k-way testing)
⚫ For n parameters with d values per parameter, the number
of test cases grow logarithmically in n and quadratic in d
⚫ Results hold for large n & d, and for all k in k-way testing
Pinjia He @ CUHK Shenzhen
Functional Testing: Summary
Pinjia He @ CUHK Shenzhen
1. What is Software Testing
2. Why Software Testing
3. Stages and Categories of Testing
4. Coverage Metrics
Pinjia He @ CUHK Shenzhen
Structural Testing
• White-box test a unit to cover a large portion of its code
Pinjia He @ CUHK Shenzhen
Basic Blocks
• Basic block: a sequence of statements having
- One entry point: no code within is a jump target
- One exit point: only the last instruction exits the block
- If the first instruction is executed, the rest are also executed
Pinjia He @ CUHK Shenzhen
Basic Blocks: Example
Pinjia He @ CUHK Shenzhen
Intraprocedural Control Flow Graphs
• An intraprocedural CFG of a proc. p is a graph (N, E) where
- Nodes
⚫ Basic blocks in p
⚫ Designated entry and exit blocks
- Edges
⚫ (a, b): an edge from a to b if there is direct control flow from a to b
⚫ (entry, a): a is the first basic block of p
⚫ (b, exit): for each b ending with a (possibly implicit) return statement
Pinjia He @ CUHK Shenzhen
Control Flow Graphs: Example
Pinjia He @ CUHK Shenzhen
Test Coverage
• The CFG serves as an
adequacy criterion for tests
• The more parts executed, the
higher chance to find bugs
• “parts” can be
- Nodes
- Edges
- Paths
- etc.
Pinjia He @ CUHK Shenzhen
Test Coverage
• Consider input a = {3, 7, 5}
Pinjia He @ CUHK Shenzhen
Test Coverage
• Consider input a = {3, 7, 5}
Pinjia He @ CUHK Shenzhen
Statement Coverage
• Judge a test suite’s quality by how much the CFG is covered
• Idea: Can detect a bug in a statement only by executing it
- Can also be defined on basic blocks
Pinjia He @ CUHK Shenzhen
Statement Coverage: Example
• Consider input a = {3, 7, 5}
• It executes 7/10 basic blocks
• Statement coverage: 70%
• How to achieve 100%
statement coverage?
Pinjia He @ CUHK Shenzhen
Statement Coverage: Example (cont.)
• We can achieve 100%
statement coverage with
three test cases
- a={1}
- a = { 5, 7 }
- a = { 7, 5 }
• The last detects the bug
Pinjia He @ CUHK Shenzhen
Statement Coverage: Example (cont.)
• We can achieve 100%
statement coverage with
three test cases
- a={1}
- a = { 5, 7 }
- a = { 7, 5 }
• The last detects the bug
Pinjia He @ CUHK Shenzhen
Statement Coverage: Example (cont.)
• We can achieve 100%
statement coverage with
three test cases
- a={1}
- a = { 5, 7 }
- a = { 7, 5 }
• The last detects the bug
Pinjia He @ CUHK Shenzhen
Statement Coverage: Discussion
Pinjia He @ CUHK Shenzhen
Statement Coverage: Discussion
• Achieve 100% statement
coverage with 2 tests
- a = null
- a = { 1, 2 }, x = 2
Pinjia He @ CUHK Shenzhen
Statement Coverage: Discussion
• Achieve 100% statement
coverage with 2 tests
- a = null
- a = { 1, 2 }, x = 2
• The tests do not detect the
bug
• More thorough testing is
needed
Pinjia He @ CUHK Shenzhen
Branch Coverage
• Idea: test all possible branches in the control flow
• Edge (m,n) is a branch iff there is edge (m, n’) with 𝑛 ≠ 𝑛′
• Branch coverage is 100% if code has no branches
• Most widely-used adequacy criterion in industry
• Branch coverage is more thorough than statement coverage
- Q: 100% branch coverage => 100% statement coverage ?
- Q: “>=n% branch cov.” => “>=n% statement cov.” ?
Pinjia He @ CUHK Shenzhen
Branch Coverage: Example 1
• Consider input a = {3, 7, 5}
• It executes 4/8 branches
• Branch coverage 50%
• What tests to include for 100%
branch coverage
Pinjia He @ CUHK Shenzhen
Branch Coverage: Example 2
• The two tests:
- a = null
- a = {1,2}, x = 2
execute 5/6 branches
• Branch coverage: 83%
• How to achieve 100%
branch coverage?
Pinjia He @ CUHK Shenzhen
Branch Coverage: Example 2
• To achieve 100% branch
coverage requires a test
that runs the loop to the
end
- a = null
- a = {1}, x = 1
- a = {1}, x = 3
• The last one detects the
bug
Pinjia He @ CUHK Shenzhen
Branch Coverage: Discussion
Pinjia He @ CUHK Shenzhen
Branch Coverage: Discussion
• Can obtain 100% branch coverage
with 1 test
- a = {1}
• The test doesn’t detect the bug
• More tests are needed
Pinjia He @ CUHK Shenzhen
Branch Coverage: Discussion
Pinjia He @ CUHK Shenzhen
Branch Coverage: Discussion
• We can achieve 100% branch
coverage with two tests
- a = true, b = false
- a = false, b = true
• The tests do not detect the bug
• More thorough testing is needed
Pinjia He @ CUHK Shenzhen
Path Coverage
• Idea: test all possible paths through the CFG
• A path is a sequence of nodes 𝑛1 , … 𝑛𝑘 such that
- 𝑛1 = 𝑒𝑛𝑡𝑟𝑦
- 𝑛𝑘 = 𝑒𝑥𝑖𝑡
- There is an edge (𝑛𝑖 , 𝑛𝑖+1 ) in the CFG
• Path coverage is more thorough than statement and branch coverage
- 100% path coverage => 100% complete statement and branch coverage
- But, “>=n% path coverage” does not generally imply “>=n% statement coverage” or
“>=n% branch coverage”
• Complete path coverage is not feasible for input-dependent loops
- Unbounded number of paths
Pinjia He @ CUHK Shenzhen
Path Coverage: Example 1
• The two tests
- a = true, b = false
- a = false, b = true
execute 2/4 paths
• Path coverage: 50%
Pinjia He @ CUHK Shenzhen
Path Coverage: Example 1
• We can achieve 100% path
coverage with four tests
- a = true, b = false
- a = false, b = true
- a = true, b = true
- a = false, b = false
• The two new tests detect the
bug
Pinjia He @ CUHK Shenzhen
Path Coverage: Example 2
Pinjia He @ CUHK Shenzhen
Path Coverage: Example 2
• Number of loop iterations is
unknown statically
• Arbitrarily many tests are
needed for complete path
coverage
Pinjia He @ CUHK Shenzhen
Loop Coverage
• Idea: for each loop, test 0,1,>=2 (consecutive) iterations
Pinjia He @ CUHK Shenzhen
Loop Coverage: Example
• We can achieve 100% loop
coverage with 3 tests
- a = {}
- a = {1}
- a = {1,2}
• The last detects the bug
Pinjia He @ CUHK Shenzhen
Many Others
• Condition coverage
• Multiple condition coverage
• Decision coverage
• Condition/decision coverage
• Modified condition/decision coverage
…
Pinjia He @ CUHK Shenzhen
Coverage Metrics for DL
• How about deep neural networks?
Pinjia He @ CUHK Shenzhen
[SOSP17] Pei et al. “DeepXplore: Automated Whitebox Testing of Deep Learning Systems”
Coverage Metrics for DL
Pinjia He @ CUHK Shenzhen
[SOSP17] Pei et al. “DeepXplore: Automated Whitebox Testing of Deep Learning Systems”
Neuron Coverage
• How many neurons were activated
• N: all neurons
• T: test inputs
• out: function returns the output value of neuron
• t: threshold
Pinjia He @ CUHK Shenzhen
Many others
• K-multisection neuron coverage
• Neuron boundary coverage
• Strong neuron activation coverage
• Top-k neuron patterns
…
Pinjia He @ CUHK Shenzhen
Useful?
• [ESEC/FSE20] Is neuron coverage a meaningful measure for testing deep
neural networks?
• [ESEC/FSE20] Correlations between deep neural network model
coverage criteria and model quality
Pinjia He @ CUHK Shenzhen
Compiler Source Code
Machine Yue
CodeLi @ Nanjing University
Compiler Source Code
Lexical Analysis
Scanner
You ϡ goouojd
Tokens
Machine Yue
CodeLi @ Nanjing University
Compiler Source Code
Lexical Analysis
Scanner
You ϡ goouojd
Tokens
Parser Syntax Analysis
Like your hair I
AST
Machine Yue
CodeLi @ Nanjing University
Compiler Source Code
Lexical Analysis
Scanner
You ϡ goouojd
Tokens
Parser Syntax Analysis
Like your hair I
AST
Type Checker Semantic Analysis
Decorated Apples eat you
AST
Machine Yue
CodeLi @ Nanjing University
Compiler Source Code
Lexical Analysis
Scanner
You ϡ goouojd
Tokens
Parser Syntax Analysis
Like your hair I
AST
Type Checker Semantic Analysis
Decorated Apples eat you
AST
Translator
IR
Machine Yue
CodeLi @ Nanjing University
Compiler Source Code
Lexical Analysis
Scanner
You ϡ goouojd
Tokens
Parser Syntax Analysis
Like your hair I
AST
Type Checker Semantic Analysis
Decorated Apples eat you
AST
Translator
IR
Code Generator
Machine Yue
CodeLi @ Nanjing University
Compiler Source Code
Lexical Analysis
Scanner
You ϡ goouojd
Tokens
Parser Syntax Analysis
Like your hair I
AST
Type Checker Semantic Analysis
Decorated Apples eat you
AST
Translator
IR Static Analysis
Code Generator e.g., code optimization
Machine Yue
CodeLi @ Nanjing University
AST vs. IR
<
AST
Yue Li @ Nanjing University
AST vs. IR
<
AST IR
(“3-address” form)
Yue Li @ Nanjing University
AST vs. IR
• high-level and closed to grammar structure
AST • usually language dependent
• suitable for fast type checking
• lack of control flow information
• low-level and closed to machine code
• usually language independent
IR • compact and uniform
• contains control flow information
• usually considered as the basis for static analysis
<
AST IR
(“3-address” form)
Yue Li @ Nanjing University
Intermediate Representation (IR)
• 3-Address Code (3AC)
There is at most one operator on the right side of an instruction.
t1 = a + b
c=a+b+3
c = t1 + 3
Yue Li @ Nanjing University
Intermediate Representation (IR)
• 3-Address Code (3AC)
There is at most one operator on the right side of an instruction.
t1 = a + b
c=a+b+3
c = t1 + 3
Yue Li @ Nanjing University
Intermediate Representation (IR)
• 3-Address Code (3AC)
There is at most one operator on the right side of an instruction.
t1 = a + b
c=a+b+3
c = t1 + 3
Yue Li @ Nanjing University
Intermediate Representation (IR)
• 3-Address Code (3AC)
There is at most one operator on the right side of an instruction.
t1 = a + b
c=a+b+3
c = t1 + 3
Address can be one of the following:
• Name: a, b, c
• Constant: 3
• Compiler-generated temporary: t1
Yue Li @ Nanjing University
Intermediate Representation (IR)
• 3-Address Code (3AC)
There is at most one operator on the right side of an instruction.
t1 = a + b
c=a+b+3
c = t1 + 3
Address can be one of the following:
• Name: a, b, c
• Constant: 3
• Compiler-generated temporary: t1
Each type of instructions has its own 3AC form
Yue Li @ Nanjing University
Some Common 3AC Forms
• x = y bop z
x, y, z: addresses
• x = uop y
bop: binary arithmetic or logical operation
• x=y uop: unary operation (minus, negation, casting)
L: a label to represent a program location
• goto L rop: relational operator (>, <, ==, >=, <=, etc.)
goto L: unconditional jump
• if x goto L if … goto L: conditional jump
• if x rop y goto L
Yue Li @ Nanjing University
Some Common 3AC Forms
• x = y bop z
x, y, z: addresses
• x = uop y
bop: binary arithmetic or logical operation
• x=y uop: unary operation (minus, negation, casting)
L: a label to represent a program location
• goto L rop: relational operator (>, <, ==, >=, <=, etc.)
goto L: unconditional jump
• if x goto L if … goto L: conditional jump
• if x rop y goto L
Let’s see some more real-world complicated forms
Yue Li @ Nanjing University
Soot and Its IR: Jimple
• Soot
Most popular static analysis framework for Java
[Link]
[Link]
Soot’s IR is Jimple: typed 3-address code
Yue Li @ Nanjing University
Java Src
Do-While Loop
Yue Li @ Nanjing University
Java Src
Do-While Loop
Yue Li @ Nanjing University 3AC(jimple)
Java Src
Method Call
Yue Li @ Nanjing University
Java Src
Method Call
Yue Li @ Nanjing University 3AC(jimple)
Java Src
Method Call
Yue Li @ Nanjing University 3AC(jimple)
Class
Java Src
Yue Li @ Nanjing University
Class
Java Src
3AC(jimple)
Yue Li @ Nanjing University
Control Flow Analysis
• Usually refer to building Control Flow Graph (CFG)
Yue Li @ Nanjing University
Control Flow Analysis
• Usually refer to building Control Flow Graph (CFG)
Input: 3AC of P
Output: CFG of P
Yue Li @ Nanjing University
Control Flow Analysis
• Usually refer to building Control Flow Graph (CFG)
• CFG serves as the basic structure for static analysis
Input: 3AC of P
Output: CFG of P
Yue Li @ Nanjing University
Control Flow Analysis
• Usually refer to building Control Flow Graph (CFG)
• CFG serves as the basic structure for static analysis
• The node in CFG can be an individual 3-address
instruction, or (usually) a Basic Block (BB)
Input: 3AC of P
Output: CFG of P
Yue Li @ Nanjing University
Control Flow Analysis
• Usually refer to building Control Flow Graph (CFG)
• CFG serves as the basic structure for static analysis
• The node in CFG can be an individual 3-address
instruction, or (usually) a Basic Block (BB)
Input: 3AC of P
Output: CFG of P
Yue Li @ Nanjing University
Basic Blocks (BB)
• Basic blocks (BB) are maximal sequences of consecutive
three-address instructions with the properties that
a = q
b = x + a
c = 2a - b
if p == q goto B6
Yue Li @ Nanjing University
Basic Blocks (BB)
• Basic blocks (BB) are maximal sequences of consecutive
three-address instructions with the properties that
- It can be entered only at the beginning, i.e., the first
instruction in the block
a = q
b = x + a
c = 2a - b
if p == q goto B6
Yue Li @ Nanjing University
Basic Blocks (BB)
• Basic blocks (BB) are maximal sequences of consecutive
three-address instructions with the properties that
- It can be entered only at the beginning, i.e., the first
instruction in the block
- It can be exited only at the end, i.e., the last instruction
in the block
a = q
b = x + a
c = 2a - b
if p == q goto B6
Yue Li @ Nanjing University
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y a = q
(6) q = p + y b = x + a
(7) a = q c = 2a - b
if p == q goto B6
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y a = q
(6) q = p + y b = x + a
(7) a = q c = 2a - b
if p == q goto B6
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y a = q
(6) q = p + y b = x + a
(7) a = q c = 2a - b
if p == q goto B6
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y a = q
(6) q = p + y b = x + a
(7) a = q c = 2a - b
if p == q goto B6
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y a = q
(6) q = p + y b = x + a
(7) a = q c = 2a - b
if p == q goto B6
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y a = q
(6) q = p + y b = x + a
(7) a = q c = 2a - b
if p == q goto B6
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y a = q
(6) q = p + y b = x + a
(7) a = q c = 2a - b
if p == q goto B6
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y a = q
(6) q = p + y b = x + a
(7) a = q c = 2a - b
if p == q goto B6
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
How to build Basic Blocks?
INPUT: A sequence of three-address instructions of P
OUTPUT: A list of basic blocks of P
METHOD: (1) Determine the leaders in P
• The first instruction in P is a leader
• Any target instruction of a conditional or
unconditional jump is a leader
• Any instruction that immediately follows a
conditional or unconditional jump is a leader
(2) Build BBs for P
• A BB consists of a leader and all its subsequent
instructions until the next leader
Yue Li @ Nanjing University
How to build Basic Blocks?
INPUT: A sequence of three-address instructions of P
OUTPUT: A list of basic blocks of P
METHOD: (1) Determine the leaders in P
• The first instruction in P is a leader
• Any target instruction of a conditional or
unconditional jump is a leader
• Any instruction that immediately follows a
conditional or unconditional jump is a leader
(2) Build BBs for P
• A BB consists of a leader and all its subsequent
instructions until the next leader
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• The first instruction in P is a leader
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• The first instruction in P is a leader
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1)
(2) y = x - 1
(3) z = x * y
(4) if z < x goto (7)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1)
(2) y = x - 1
(3) z = x * y • Any target instruction of a conditional
or unconditional jump is a leader
(4) if z < x goto (7)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1)
(2) y = x - 1
(3) z = x * y • Any target instruction of a conditional
or unconditional jump is a leader
(4) if z < x goto (7)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1)
(2) y = x - 1
(3) z = x * y • (3),(7),(12)
(4) if z < x goto (7)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1)
(2) y = x - 1
(3) z = x * y • (3),(7),(12)
(4) if z < x goto (7) • Any instruction that immediately
(5) p = x / y follows a conditional or unconditional
jump is a leader
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1)
(2) y = x - 1
(3) z = x * y • (3),(7),(12)
(4) if z < x goto (7) • Any instruction that immediately
(5) p = x / y follows a conditional or unconditional
jump is a leader
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1)
(2) y = x - 1
(3) z = x * y • (3),(7),(12)
(4) if z < x goto (7) • (5),(11),(12)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1) Leaders: (1), (3),
(2) y = x - 1
(3) z = x * y • (3),(7),(12) (5), (7), (11), (12)
(4) if z < x goto (7) • (5),(11),(12)
(5) p = x / y
(6) q = p + y
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1) Leaders: (1), (3),
(2) y = x - 1
(3) z = x * y • (3),(7),(12) (5), (7), (11), (12)
(4) if z < x goto (7) • (5),(11),(12)
(5) p = x / y (2) Build BBs for P
(6) q = p + y • A BB consists of a leader and all
(7) a = q its subsequent instructions until
the next leader
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1) Leaders: (1), (3),
(2) y = x - 1
(3) z = x * y • (3),(7),(12) (5), (7), (11), (12)
(4) if z < x goto (7) • (5),(11),(12)
(5) p = x / y (2) Build BBs for P
(6) q = p + y • A BB consists of a leader and all
(7) a = q its subsequent instructions until
the next leader
(8) b = x + a
(9) c = 2a - b • B1 {(1)}
(10) if p == q goto (12) • B2 {(3)}
(11) goto (3) • B3 {(5)}
(12) return • B4 {(7)}
• B5 {(11)}
• B6 {(12)}
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) Determine the leaders in P
(1) x = input
• (1) Leaders: (1), (3),
(2) y = x - 1
(3) z = x * y • (3),(7),(12) (5), (7), (11), (12)
(4) if z < x goto (7) • (5),(11),(12)
(5) p = x / y (2) Build BBs for P
(6) q = p + y • A BB consists of a leader and all
(7) a = q its subsequent instructions until
the next leader
(8) b = x + a
(9) c = 2a - b • B1 {(1),(2)}
(10) if p == q goto (12) • B2 {(3),(4)}
(11) goto (3) • B3 {(5),(6)}
(12) return • B4 {(7),(8),(9),(10)}
• B5 {(11)}
• B6 {(12)}
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) x = input • B1 {(1),(2)}
(2) y = x - 1 • B2 {(3),(4)}
(3) z = x * y • B3 {(5),(6)}
(4) if z < x goto (7) • B4 {(7),(8),(9),(10)}
(5) p = x / y • B5 {(11)}
(6) q = p + y • B6 {(12)}
(7) a = q
(8) b = x + a
(9) c = 2a - b
(10) if p == q goto (12)
(11) goto (3)
(12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) x = input (1) x = input
B1
(2) y = x - 1 (2) y = x - 1
(3) z = x * y (3) z = x * y
B2
(4) if z < x goto (7) (4) if z < x goto (7)
(5) p = x / y
(5) p = x / y
(6) q = p + y B3
(6) q = p + y
(7) a = q
(8) b = x + a (7) a = q
(9) c = 2a - b (8) b = x + a
B4
(10) if p == q goto (12) (9) c = 2a - b
(11) goto (3) (10) if p == q goto (12)
(12) return B5 (11) goto (3)
B6 (12) return
Yue Li @ Nanjing University
Input: 3AC of P Output: BBs of P
(1) x = input (1) x = input
B1
(2) y = x - 1 (2) y = x - 1
(3) z = x * y (3) z = x * y
B2
(4) if z < x goto (7) (4) if z < x goto (7)
(5) p = x / y
(5) p = x / y
(6) q = p + y B3
(6) q = p + y
(7) a = q
(8) b = x + a (7) a = q
(9) c = 2a - b (8) b = x + a
B4
(10) if p == q goto (12) (9) c = 2a - b
(11) goto (3) (10) if p == q goto (12)
(12) return B5 (11) goto (3)
B6 (12) return
Yue Li @ Nanjing University
Control Flow Graph (CFG)
• The nodes of CFG are basic blocks
• There is an edge from block A to block B if and only if
Yue Li @ Nanjing University
Control Flow Graph (CFG)
• The nodes of CFG are basic blocks
• There is an edge from block A to block B if and only if
… …
A (j) goto (i)
B (i)
… …
Yue Li @ Nanjing University
Control Flow Graph (CFG)
• The nodes of CFG are basic blocks
• There is an edge from block A to block B if and only if
… … … …
A (j) goto (i) A (j) if e goto (i)
… …
B
B (i)
… …
Yue Li @ Nanjing University
Control Flow Graph (CFG)
• The nodes of CFG are basic blocks
• There is an edge from block A to block B if and only if
- There is a conditional or unconditional jump from the end
of A to the beginning of B
… … … …
A (j) goto (i) A (j) if e goto (i)
… …
B
B (i)
… …
Yue Li @ Nanjing University
Control Flow Graph (CFG)
• The nodes of CFG are basic blocks
• There is an edge from block A to block B if and only if
- There is a conditional or unconditional jump from the end
of A to the beginning of B
… … … … … …
A (j) goto (i) A (j) if e goto (i) A … …
… … … …
B B
B (i)
… …
Yue Li @ Nanjing University
Control Flow Graph (CFG)
• The nodes of CFG are basic blocks
• There is an edge from block A to block B if and only if
- There is a conditional or unconditional jump from the end
of A to the beginning of B
- B immediately follows A in the original order of instructions
… … … … … …
A (j) goto (i) A (j) if e goto (i) A … …
… … … …
B B
B (i)
… …
Yue Li @ Nanjing University
Control Flow Graph (CFG)
• The nodes of CFG are basic blocks
• There is an edge from block A to block B if and only if
- There is a conditional or unconditional jump from the end
of A to the beginning of B
- B immediately follows A in the original order of instructions
… … … … … …
A (j) goto (i) A (j) if e goto (i) A (j) goto (i)
… … … …
B B
B (i)
… …
Yue Li @ Nanjing University
Control Flow Graph (CFG)
• The nodes of CFG are basic blocks
• There is an edge from block A to block B if and only if
- There is a conditional or unconditional jump from the end
of A to the beginning of B
- B immediately follows A in the original order of instructions
and A does not end in an unconditional jump
… … … … … …
A (j) goto (i) A (j) if e goto (i) A (j) goto (i)
… … … …
B B
B (i)
… …
Yue Li @ Nanjing University
Control Flow Graph (CFG)
• The nodes of CFG are basic blocks
• There is an edge from block A to block B if and only if
- There is a conditional or unconditional jump from the end
of A to the beginning of B
- B immediately follows A in the original order of instructions
and A does not end in an unconditional jump
• It is normal to replace the jumps to instruction labels by
jumps to basic blocks
… … … …
A (j) goto (i) A goto B
B (i) B
… … … …
Yue Li @ Nanjing University
(1) x = input x = input
B1 B1
(2) y = x - 1 y = x - 1
(3) z = x * y z = x * y
B2 B2
(4) if z < x goto (7) if z < x goto B4
(5) p = x / y
B3 B3 p = x / y
(6) q = p + y q = p + y
(7) a = q a = q
(8) b = x + a b = x + a
B4 B4
(9) c = 2a - b c = 2a - b
(10) if p == q goto (12) if p == q goto B6
B5 (11) goto (3) B5 goto B2
B6 (12) return B6 return
Yue Li @ Nanjing University
Add edges in CFG
x = input
B1
y = x - 1
z = x * y
B2
if z < x goto B4
B3 p = x / y
q = p + y
a = q
b = x + a
B4
c = 2a - b
if p == q goto B6
B5 goto B2
B6 return
Yue Li @ Nanjing University
Add edges in CFG
x = input
B1 There is a conditional or unconditional jump
y = x - 1
from the end of A to the beginning of B
z = x * y
B2
if z < x goto B4
B3 p = x / y
q = p + y
a = q
b = x + a
B4
c = 2a - b
if p == q goto B6
B5 goto B2
B6 return
Yue Li @ Nanjing University
Add edges in CFG
x = input
B1 There is a conditional or unconditional jump
y = x - 1
from the end of A to the beginning of B
z = x * y
B2
if z < x goto B4
B3 p = x / y
q = p + y
a = q
b = x + a
B4
c = 2a - b
if p == q goto B6
B5 goto B2
B6 return
Yue Li @ Nanjing University
Add edges in CFG
x = input
B1 There is a conditional or unconditional jump
y = x - 1
from the end of A to the beginning of B
z = x * y
B2 B immediately follows A in the original
if z < x goto B4 order of instructions and A does not end in
an unconditional jump
B3 p = x / y
q = p + y
a = q
b = x + a
B4
c = 2a - b
if p == q goto B6
B5 goto B2
B6 return
Yue Li @ Nanjing University
Add edges in CFG
x = input
B1 There is a conditional or unconditional jump
y = x - 1
from the end of A to the beginning of B
z = x * y
B2 B immediately follows A in the original
if z < x goto B4 order of instructions and A does not end in
an unconditional jump
B3 p = x / y
q = p + y
a = q
b = x + a
B4
c = 2a - b
if p == q goto B6
B5 goto B2
B6 return
Yue Li @ Nanjing University
Add edges in CFG
x = input
B1 There is a conditional or unconditional jump
y = x - 1
from the end of A to the beginning of B
z = x * y
B2 B immediately follows A in the original
if z < x goto B4 order of instructions and A does not end in
an unconditional jump
B3 p = x / y We say that A is a predecessor of B, and
q = p + y B is a successor of A
a = q
b = x + a
B4
c = 2a - b
if p == q goto B6
B5 goto B2
B6 return
Yue Li @ Nanjing University
Add edges in CFG
x = input
B1 There is a conditional or unconditional jump
y = x - 1
from the end of A to the beginning of B
z = x * y
B2 B immediately follows A in the original
if z < x goto B4 order of instructions and A does not end in
an unconditional jump
B3 p = x / y We say that A is a predecessor of B, and
q = p + y B is a successor of A
a = q Usually we add two nodes, Entry and Exit.
b = x + a - They do not correspond to executable IR
B4
c = 2a - b - An edge from Entry to the BB containing the first
if p == q goto B6 instruction of IR
- An edge to Exit from any BB containing an
B5 goto B2 instruction that could be the last instruction of IR
B6 return
Yue Li @ Nanjing University
Entry
Add edges in CFG
x = input
B1 There is a conditional or unconditional jump
y = x - 1
from the end of A to the beginning of B
z = x * y
B2 B immediately follows A in the original
if z < x goto B4 order of instructions and A does not end in
an unconditional jump
B3 p = x / y We say that A is a predecessor of B, and
q = p + y B is a successor of A
a = q Usually we add two nodes, Entry and Exit.
b = x + a - They do not correspond to executable IR
B4
c = 2a - b - An edge from Entry to the BB containing the first
if p == q goto B6 instruction of IR
- An edge to Exit from any BB containing an
B5 goto B2 instruction that could be the last instruction of IR
B6 return
Exit
Yue Li @ Nanjing University
Entry
x = input
B1
y = x - 1
(1) x = input
(2) y = x - 1 z = x * y
B2
(3) z = x * y if z < x goto B4
(4) if z < x goto (7)
(5) p = x / y B3 p = x / y
(6) q = p + y q = p + y
Input: 3AC of P Output: CFG of P
(7) a = q
a = q
(8) b = x + a
b = x + a
(9) c = 2a - b B4
c = 2a - b
(10) if p == q goto (12)
if p == q goto B6
(11) goto (3)
(12) return B5 goto B2
B6 return
Yue Li @ Nanjing University Exit
Recommended Reading
Pinjia He @ CUHK Shenzhen