Control Flow Integrity
Lujo Bauer
18-732
Spring 2015
Control Hijacking Arms Race
Control
Flow
Integrity
Attacks
2
[Link]
CFI: Goal
Provably correct mechanisms that prevent
powerful attackers from succeeding
by protecting against all control flow
integrity attacks
4
CFI: Idea
During program execution, whenever a
machine-code instruction transfers control,
it targets a valid destination, as determined
by a Control Flow Graph (CFG) created
ahead of time
5
Attack Model
Powerful Attacker: Can at any time arbitrarily
overwrite any data memory and (most)
registers
– Attacker cannot directly modify the PC
– Attacker cannot modify reserved registers
Assumptions:
Data memory is Non-Executable
Code memory is Non-Writable
6
Lecture Outline
CFI: Goal
Background: Control Flow Graph
CFI: Approach
Building on CFI
– IRM, SFI, SMAC, Protected Shadow Call Stack
Formal Study
7
Basic Block
A consecutive sequence of instructions / code such that
the instruction in each position always executes
before (dominates) all those in later positions, and
no outside instruction can execute between two
instructions in the sequence
control is “straight”
(no jump targets except at the beginning,
1. x = y + z no jumps except at the end)
2. z = t + i 1. x = y + z
2. z = t + i
3. x = y + z 3. x = y + z
4. z = t + i 4. z = t + i
5. jmp 1 5. jmp 1
6. jmp 3
8
CFG Definition
A static Control Flow Graph is a graph where
– each vertex vi is a basic block, and
– there is an edge (vi, vj) if there may be a transfer of
control from block vi to block vj
Historically, the scope of a “CFG” is limited to
a function or procedure, i.e., intra-procedural
9
Call Graph
Nodes are functions
There is an edge (vi, vj) if function vi calls
function vj
void orange() void red(int x) void green()
{ { {
1. red(1); .. green();
2. red(2); } orange();
3. green(); }
}
orange red green
10
Super Graph
Superimpose CFGs of all procedures over
the call graph
void orange() void red(int x) void green()
{ { {
1. red(1); .. green();
2. red(2); } orange();
3. green(); }
}
1 A context sensitive
1: red
2 super‐graph for orange
3 2: red lines 1 and 2
11
Precision
The more precise the analysis, the more
accurately it reflects the “real” program
behavior
– Limited by soundness/completeness
tradeoff
– Depends on forms of sensitivity of analysis
12
Soundness Completeness
If analysis says X is If X is true, then
true, then X is true analysis says X is true
Things I say
True things
True things
Things I say
Trivially sound: Say nothing Trivially complete: Say everything
Sound and Complete: Say exactly the set of true things!
13
Context Sensitivity
Different calling contexts are distinguished
void orange() void red(int x) void green()
{ { {
1. red(1); .. green();
2. red(2); } orange();
3. green(); }
}
Context sensitive
distinguishes 2 different
calls to red()
14
Context Sensitive Example
void id(int z) Context-Sensitive
a = id(4); 4
{ return z; }
b = id(5);
5
a = id(4); void id(int z) Context-Insensitive
4,5 { return z; } (note merging)
b = id(5);
4,5
15
Lecture Outline
CFI: Goal
Background: Control Flow Graph
CFI: Approach
Building on CFI
– IRM, SFI, SMAC, Protected Shadow Call Stack
Formal Study
16
CFI Overview
Invariant: Execution must follow a path in a control flow
graph (CFG) created ahead of run time
Method:
Build CFG statically, e.g., at compile time
Instrument (rewrite) binary, e.g., at install time
– Add IDs and ID checks; maintain ID uniqueness
Verify CFI instrumentation at load time
– Direct jump targets, presence of IDs and ID checks, ID uniqueness
Perform ID checks at run time
– Indirect jumps have matching IDs
Security Principle: Minimal Trusted Computing Base ̶
Trust simple verifier, not complex rewriter
17
Build CFG
direct calls
indirect calls
Two possible
return sites due to
context insensitivity
18
Instrument Binary
call 17, R: transfer control to R
only when R has label 17
Insert a unique number at each destination
Two destinations are equivalent if CFG contains edges
to each from the same source
19
Example of Instrumentation
Original code
Instrumented code
Abuse an x86 assembly instruction to
Jump to the destination only if insert “12345678” tag into the binary
the tag is equal to “12345678”
20
Verify CFI Instrumentation
Direct jump targets (e.g., call 0x12345678)
– Are all targets valid according to CFG?
IDs
– Is there an ID right after every entry point?
– Does any ID appear in the binary by accident?
ID checks
– Is there a check before every control transfer?
– Does each check respect the CFG?
Trust simple verifier, not complex rewriter
21
Revisiting Assumptions
UNQ: Unique IDs
– Required to preserve CFG semantics
NWC: Non-Writable Code
– Otherwise attacker can overwrite CFI dynamic check
– Not true if code dynamically loaded or generated
NXD: Non-Executable Data
– Otherwise attacker could cause the execution of data
labeled with expected ID
22
Security Guarantees
Effective against attacks based on
illegitimate control-flow transfer
– Stack-based buffer overflow, return-to-libc exploits,
pointer subterfuge
Does not protect against attacks that do not
violate the programʼs original CFG
– Data-only attacks
– Incorrect arguments to system calls
– Substitution of file names
– Incorrect logic in implementation
23
Evaluation
x86 Pentium 4, 1.8 GHz, 512MB RAM; average overhead:
16%; range: 0-45%
24
Evaluation
CFG construction + CFI instrumentation: ~10s
Increase in binary size: ~8%
Relative execution overhead:
– crafty: CFI – 45%
– gcc: CFI < 10%
Security-related experiments
– CFI protects against various specific attacks
(read Section 4.3)
25
Lecture Outline
CFI: Goal
Background: Control Flow Graph
CFI: Approach
Building on CFI
– IRM, SFI, SMAC, Protected Shadow Call Stack
Formal Study
26
SFI
CFI implies non-circumventable sandboxing
(i.e., safety checks inserted by
instrumentation before instruction X will
always be executed before reaching X)
SFI: Dynamic checks to ensure that target
memory accesses lie within a certain range
– CFI makes these checks non-circumventable
27
SMAC: Generalized SFI
SMAC: Different access checks at different
instructions in the program
– Isolated data memory regions that are only accessible
by specific pieces of program code (e.g., library
function)
– SMAC can remove NX data and NW code assumptions
of CFI
– CFI makes these checks non-circumventable
28
Example: CFI + SMAC
Non-executable data assumption no longer
needed since SMAC ensures target address
is pointing to code
29
CFI as a Foundation for
Non-circumventable IRMs
Inlined Reference Monitors (IRM) work
correctly assuming:
– Inserted dynamic checks cannot be circumvented by
changing control flow – enforced using CFI
– IRM state cannot be modified by attacker – enforced
by SMAC
30
CFI with Context Sensitivity
Function F is called first from A, then from
B; whatʼs a valid destination for its return?
– CFI will use the same tag for both call sites, but this
allows F to return to B after being called from A
– Solution 1: duplicate code (or even inline everything)
– Solution 2: use a shadow call stack
• place stack in SMAC-protected memory region
• only SMAC instrumentation code at call and return
sites modify stack by pushing and popping values
• Statically verify that instrumentation code is correct
31
Lecture Outline
CFI: Goal
Background: Control Flow Graph
CFI: Approach
Building on CFI
– IRM, SFI, SMAC, Protected Shadow Call Stack
Formal Study
32
Security Proof Outline
1. Define machine code semantics
2. Model a powerful attacker
3. Define instrumentation algorithm
4. Prove security theorem
Weakness of Abadi et al. work:
Formal study uses a simple RISC‐style assembly
language, not the x86 ISA
(cf. McCamant and Morrisett’s PittSFIeld 2006)
33
Machine Model
Execution State:
Mc (code memory): maps addresses to
words
Md (data memory): maps addresses to
words
R (registers): maps register nos. to words
pc (program counter): a word
34
Operational Semantics
For each instruction, operational semantics
defines how the instruction affects state
35
Operational Semantics (normal)
Semantics of add rd, rs, rt
: Binary relation on states that expresses normal execution steps
36
Operational Semantics (attacker)
Idea: Attacker may arbitrarily modify data
memory and most registers at any time
Formally, attacker transition captured by
binary relation on states
Transitions are either normal transitions n or
attacker transitions a
37
Instrumentation Algorithm
I(Mc): Code memory Mc is well-
instrumented according to the CFI-criteria
Example:
– Every computed jump instruction is preceded by a
particular sequence of instructions, which depends on
a given CFG
Definition of CFG and instrumentation algorithm in paper
38
CFI Security Theorem
Requires definition of transition relation ,
instrumentation algorithm I(Mc), and CFG.
Property holds in the presence of attacker
steps
Proof is by induction on execution
sequences
39
CFI Summary
Small
Invariant: Trusted
Execution Computing
must Base:
follow a path in a control
flow graph
Trust (CFG)
simple created not
verifier, ahead of run time.
complex rewriter
Method:
Build CFG statically, e.g., at compile time
Instrument (rewrite) binary, e.g., at install time
– Add IDs and ID checks; maintain ID uniqueness
Verify CFI instrumentation at load time
– Direct jump targets, presence of IDs and ID checks, ID
uniqueness
Perform ID checks at run time
– Indirect jumps have matching IDs
40
Connections to Other Lectures
Software analysis methods assume CFG accurately
reflects possible executions of program
– Software model checking (ASPIER, MOPS)
– Static analysis (Coverity Prevent)
Language-based methods
– Type systems guarantee memory and control flow safety for
programs written in that language (PCC, TAL)
– No guarantees if data memory corrupted by another entity or
flaw
Run-time enforcement methods can be circumvented if
CFG not respected
– Software-based Fault Isolation (SFI)
– Inlined Reference Monitors (IRMs)
41
Sources
Abadi et al., Control-Flow Integrity:
Principles, Implementations, and
Applications, TISSEC 2009.
Some slides from J. Ligatti, D. Brumley, A.
Datta.
42