CSE543 Computer Security
Module: Return-Oriented
Programming
Asst. Prof. Syed Rafiul Hussain
CSE543 - Introduction to Computer and Network Security Page 1
Anatomy of Control-Flow Exploits
• Two steps in control-flow exploitation
• First -- attacker gets control of program flow
(return address, function pointer)
‣ Stack (buffer), heap, format string vulnerability, …
• Second -- attacker uses control of program flow
to launch attacks
‣ E.g., Code injection
• Adversary injects malcode into victim
• E.g., onto stack or into other data region
‣ How is code injection done?
CSE543 - Introduction to Computer and Network Security Page 2
Code Injection
• Advantage
• Adversary can install any code they want
• What code do adversaries want?
‣ Defenses
• NX bit - set memory as non-executable (stack)
• W (xor) X - set memory as either writeable or
executable, but not both
• What can adversary do to circumvent these
defenses and still execute useful code (for
them)?
CSE543 - Introduction to Computer and Network Security Page 3
Return-Oriented Programming
• Arbitrary exploitation without code injection
CSE543 - Introduction to Computer and Network Security Page 4
Return-Oriented Programming
CSE543 - Introduction to Computer and Network Security Page 5
ROP Thesis
CSE543 - Introduction to Computer and Network Security Page 6
Return-to-libc
CSE543 - Introduction to Computer and Network Security Page 7
ROP vs return-to-libc
CSE543 - Introduction to Computer and Network Security Page 8
ROP Example
• Use ESP as program counter
– E.g., Store 5 at address 0x8048000 (without introducing
new code)
Code Stack
G1: pop %eax G1 Return Address
ret
5
G2: pop %ebx
G2G2
jmp
ret buf
movl %eax, (%ebx) 0x8048000
G3:
ret G3G3
jump
...
Registers Memory
%eax = 0x8048000 =
%ebx =
ROP Example
• Use ESP as program counter
– E.g., Store 5 at address 0x8048000 (without introducing
new code)
Code Stack
G1: pop %eax G1 Return Address
ret
5
G2: pop %ebx
G2G2
jmp
ret buf
movl %eax, (%ebx) 0x8048000
G3:
ret G3G3
jump
...
Registers Memory
%eax = 5 0x8048000 =
%ebx =
ROP Example
• Use ESP as program counter
– E.g., Store 5 at address 0x8048000 (without introducing
new code)
Code Stack
G1: pop %eax G1 Return Address
ret
5
G2: pop %ebx
G2G2
jmp
ret buf
movl %eax, (%ebx) 0x8048000
G3:
ret G3G3
jump
...
Registers Memory
%eax = 5 0x8048000 =
%ebx =
ROP Example
• Use ESP as program counter
– E.g., Store 5 at address 0x8048000 (without introducing
new code)
Code Stack
G1: pop %eax G1 Return Address
ret
5
G2: pop %ebx
G2G2
jmp
ret buf
movl %eax, (%ebx) 0x8048000
G3:
ret G3G3
jump
...
Registers Memory
%eax = 5 0x8048000 =
%ebx =
ROP Example
• Use ESP as program counter
– E.g., Store 5 at address 0x8048000 (without introducing
new code)
Code Stack
G1: pop %eax G1 Return Address
ret
5
G2: pop %ebx
G2G2
jmp
ret buf
movl %eax, (%ebx) 0x8048000
G3:
ret G3G3
jump
...
Registers Memory
%eax = 5 0x8048000 =
%ebx = 0x8048000
ROP Example
• Use ESP as program counter
– E.g., Store 5 at address 0x8048000 (without introducing
new code)
Code Stack
G1: pop %eax G1 Return Address
ret
5
G2: pop %ebx
G2G2
jmp
ret buf
movl %eax, (%ebx) 0x8048000
G3:
ret G3G3
jump
...
Registers Memory
%eax = 5 0x8048000 =
%ebx = 0x8048000
ROP Example
• Use ESP as program counter
– E.g., Store 5 at address 0x8048000 (without introducing
new code)
Code Stack
G1: pop %eax G1 Return Address
ret
5
G2: pop %ebx
G2G2
jmp
ret buf
movl %eax, (%ebx) 0x8048000
G3:
ret G3G3
jump
...
Registers Memory
%eax = 5 0x8048000 =
%ebx = 0x8048000
ROP Example
• Use ESP as program counter
– E.g., Store 5 at address 0x8048000 (without introducing
new code)
Code Stack
G1: pop %eax G1 Return Address
ret
5
G2: pop %ebx
G2G2
jmp
ret buf
movl %eax, (%ebx) 0x8048000
G3:
ret G3G3
jump
...
Registers Memory
%eax = 5 0x8048000 =
%ebx = 0x8048000
ROP Example
• Use ESP as program counter
– E.g., Store 5 at address 0x8048000 (without introducing
new code)
Code Stack
G1: pop %eax G1 Return Address
ret
5
G2: pop %ebx
G2G2
jmp
ret buf
movl %eax, (%ebx) 0x8048000
G3:
ret G3G3
jump
...
Registers Memory
%eax = 5 0x8048000 = 5
%ebx = 0x8048000
Machine Instructions
CSE543 - Introduction to Computer and Network Security Page 18
ROP Execution
CSE543 - Introduction to Computer and Network Security Page 19
Building ROP Functionality
CSE543 - Introduction to Computer and Network Security Page 20
Building ROP Functionality
CSE543 - Introduction to Computer and Network Security Page 21
Building ROP Functionality
CSE543 - Introduction to Computer and Network Security Page 22
Creating Programs
CSE543 - Introduction to Computer and Network Security Page 23
Finding Gadgets
CSE543 - Introduction to Computer and Network Security Page 24
ROP Conclusions
CSE543 - Introduction to Computer and Network Security Page 25
Control-Flow Integrity
• Goal: Ensure that process control follows source code
‣ Adversary can only choose authorized control-flow
sequences
• Build a model from source code that describes legal
control flows
‣ E.g., control-flow graph
• Enforce the model on program execution
‣ Instrument indirect control transfers
• Jumps, calls, returns, ...
• Challenges
‣ Building accurate model
‣ Efficient enforcement
CSE543 - Introduction to Computer and Network Security Page 26
Software Control Flow Integrity
Techniques, Proofs, & Security Applications
Jay Ligatti summer 2004 intern work with:
Úlfar Erlingsson and Martín Abadi
27
Our Mechanism
FA FB
nop IMM1
if(*fp != nop IMM1) halt
if(**esp != nop IMM2) halt
call fp
nop IMM2 return
CFG excerpt
Acall B1
NB: Need to ensure bit patterns for nops
Acall+1 Bret
appear nowhere else in code memory
28
More Complex CFGs
Maybe statically all we know is that CFG excerpt
FA can call any int int function
Acall B1
FA C1
FB succ(Acall) = {B1, C1}
nop IMM1
if(*fp != nop IMM1) halt
call fp
FC
nop IMM1
Construction: All targets of a computed jump must have
the same destination id (IMM) in their nop instruction 29
Imprecise Return Information
Q: What if FB can return
CFG excerpt
FA to many functions ?
Acall+1
A: Imprecise CFG
Bret
Dcall+1
call FB FB
nop IMM2 succ(Bret) = {Acall+1, Dcall+1}
CFG Integrity:
FD if(**esp != nop IMM2) halt
Changes to the
return PC are only to
valid successor
call FB PCs, per succ().
nop IMM2
30
No “Zig-Zag” Imprecision
Solution I: Allow the imprecision Solution II: Duplicate code
to remove zig-zags
CFG excerpt CFG excerpt
Acall B1 Acall B1
C1 C1A
Ecall Ecall C1E
31
CFG Imprecision
• Best reduced by a technique developed in the
“HyperSafe” system
‣ “HyperSafe: A Lightweight Approach to Provide Lifetime
Hypervisor Control-Flow Integrity” IEEE Symposium on
Security and Privacy, 2010
• On indirect call (forward edge)
‣ Check the proposed target against the set of legal targets
from the CFG
• On return (backward edge)
‣ Check the proposed return location against the set of legal
return locations from the CFG
• Tricky to make that efficient (see the paper)
CSE543 - Introduction to Computer and Network Security Page 32
Shadow Stack
• What should be the target of a return instruction?
‣ Return to caller
‣ But, need a way to protect return value
• Shadow stack
‣ Stack that can only be accessed by trusted code (e.g.,
software fault isolation)
‣ Off limits to overflows
CSE543 - Introduction to Computer and Network Security Page 33
CFG Computation
• What should be the target of a call instruction?
‣ Direct call - hard coded, so no problem
‣ Indirect call (function pointer) - would be any legal value for
the function pointer
• That is, anywhere it can point
• The “points-to” problem in general, which is undecidable
• So, there are various techniques to over-approximate
the target set for each indirect call
CSE543 - Introduction to Computer and Network Security Page 34
More Challenges
• Predicting return targets can be hard
‣ Exceptions, signals, and setjmp/longjmp
• Runtime generation of indirect jumps
‣ E.g., dynamically linked libraries
• Indirect jumps using arithmetic operators
‣ E.g., assembly
• Is enforcing fine-grained CFI sufficient
to prevent exploits?
CSE543 - Introduction to Computer and Network Security Page 35
Recent Result
• Suppose a program is protected by fine-grained CFG on
calls and a shadow stack on returns
• Further suppose that the program contains an
“arbitrary write primitive” (e.g., based on a memory
error)
• For these programs, exploits can be generated over
80% of the time, even against CFI defenses
‣ “Block Oriented Programming: Automating Data-Only
Attacks”, ACM CCS 2018
• Exploits follow CFG, but manipulate memory to
complete exploit
‣ Called “data-oriented programming”
CSE543 - Introduction to Computer and Network Security Page 36
Alternatives to CFI?
• What are the fundamental enablers of ROP attacks?
• (1) CFI: violate control flow
• (2) Adversary can choose gadgets
• Can we prevent adversaries from
choosing useful gadgets?
• In general, adversaries can
create/obtain the same binary as
is run by the victim
• But, that need not be the case
CSE543 - Introduction to Computer and Network Security Page 37
Apply Crypto to Code?
• Can we randomize the program’s execution in such a
way that an adversary cannot select gadgets?
• Given a secret key and a program address space,
encrypt the address space such that
• the probability that an adversary can locate a
particular instruction (start of gadget) is sufficiently
low
• and the program still runs correctly and efficiently
• Called address space randomization
CSE543 - Introduction to Computer and Network Security Page 38
ASLR
• For control-flow attacks, attacker needs absolute
addresses Stack
• Address-space Layout Randomization ???
(ASLR) randomizes base addresses
of memory segments on each
invocation of the program
‣ Attacker cannot predict absolute
addresses ??? Heap
• Heap, stack, data, text, mmap, ...
??? Data
??? Text
CSE543 - Introduction to Computer and Network Security Page 39
ASLR Implementations
• Linux
‣ Introduced in Linux 2.6.12 (June 2005)
‣ Shacham et al. [2004]:16 bits of randomization
defeated by a (remote) brute force attack in
minutes
‣ Reality: ASLR for text segment (PIE) is rarely used
• Only few programs in Linux use PIE
• Enough gadgets for ROP can be found in
unrandomized code [Schwartz 2011]
CSE543 - Introduction to Computer and Network Security Page 40
ASLR Limitations
• Attacks may leak randomization information
• Disclosure attacks
• Use buffer over-read to read unauthorized program
memory (extract code or randomizing state)
• ASLR can be bypassed by information leaks about
memory layout
‣ E.g., format string vulnerabilities
• So, what can we do?
‣ How do we avoid leaking the “key”?
CSE543 - Introduction to Computer and Network Security Page 41
Conclusion
• Control-flow attack defenses operate at two stages
‣ Prevent attacker from getting control
• StackGuard, heap sanity checks, ASLR, shadow stacks, ...
‣ Prevent attacker from using control for malice
• NX, W (xor) X, ASLR, Control Flow Integrity (CFI), ...
• For maximum security, a system may need to use a
combination of these defenses
• Q. Is subverting control-flow the only goal of an attacker?
CSE543 - Introduction to Computer and Network Security Page 42