2 Dynamic Reverse Engineering
Key Terminology and Their Meanings (GDB and Ghidra)
Debugging Terminology
Software Breakpoints
A debugger overwrites the first byte of an instruction with a trap instruction (for example INT3,
opcode 0xCC, on x86), causing the CPU to raise a trap when execution reaches it. The debugger
restores the original byte once the breakpoint is handled.
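The save, patch, and restore cycle can be sketched on a plain byte buffer. This is a toy model only: a real debugger writes into the target's memory through an OS interface such as ptrace, and the ToyDebugger class and the sample bytes are made up for illustration.

```python
INT3 = 0xCC  # x86 one-byte breakpoint opcode


class ToyDebugger:
    """Tracks software breakpoints planted in a mutable code buffer."""

    def __init__(self, code: bytearray):
        self.code = code
        self.saved = {}  # address -> original byte

    def set_breakpoint(self, addr: int) -> None:
        if addr in self.saved:
            return  # already planted; keep the true original byte
        self.saved[addr] = self.code[addr]
        self.code[addr] = INT3

    def clear_breakpoint(self, addr: int) -> None:
        # Put the original byte back so the instruction can execute normally.
        self.code[addr] = self.saved.pop(addr)


# Illustrative x86-64 bytes: push rbp; mov rbp, rsp; ret
original = b"\x55\x48\x89\xe5\xc3"
code = bytearray(original)
dbg = ToyDebugger(code)

dbg.set_breakpoint(1)
patched = bytes(code)   # first byte of "mov rbp, rsp" is now 0xCC

dbg.clear_breakpoint(1)
restored = bytes(code)  # identical to the original bytes again
```

Because the patch changes the code bytes, a program that checksums its own code can notice it, which is one reason hardware breakpoints are considered stealthier.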
Hardware Breakpoints
These breakpoints are set using CPU debug registers and allow monitoring of memory or
instruction execution without modifying the code. Only a few are available (four on x86), but they
are stealthier because the code in memory stays unchanged.
Data Watchpoints
Used to track changes to memory locations and notify the debugger when specific data is
accessed or modified.
Stepping
Allows controlled execution of a program, moving step-by-step through machine instructions or
source code lines.
Process Control
Provides the ability to start, stop, suspend, and resume execution of the debugged program or its
threads.
Inspection & Modification
Debuggers can inspect and modify the program’s memory and registers, enabling runtime
manipulation of values.
Signal/Exception Interception
The debugger can catch signals or exceptions generated by the program, helping to analyze
faults or unexpected behaviors.
Checkpointing
Saves the process state at a specific point, allowing rollback and analysis without restarting from
scratch.
Anti-Debugging Techniques
1. Direct Evidence Detection
The program checks for debugger-related artifacts in its memory or environment, such as
querying OS structures (e.g., the BeingDebugged flag in the Windows PEB, or the TracerPid
field in Linux /proc/self/status ) or searching for known debugger processes.
2. Behavior and Timing-Based Detection
Debuggers introduce execution delays. Programs measure execution times of specific
operations or exceptions to detect interference.
3. Debugger Blocking Techniques
Some programs prevent debugger attachment by self-debugging or using a circular
debugging setup, where multiple processes monitor each other to prevent intrusion.
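The first two techniques can be sketched for Linux as follows. The function names and the threshold parameter are illustrative assumptions, and real checks are usually written in native code and obfuscated; this only shows the idea.

```python
import time


def tracer_pid(status_text: str) -> int:
    """Return the TracerPid value from /proc/<pid>/status text (0 = not traced)."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1].strip())
    return 0


def is_traced() -> bool:
    """Direct evidence check; returns False where /proc is unavailable."""
    try:
        with open("/proc/self/status") as f:
            return tracer_pid(f.read()) != 0
    except OSError:
        return False


def looks_slowed(work, threshold_s: float) -> bool:
    """Timing check: flag the run if `work` takes longer than the threshold."""
    start = time.perf_counter()
    work()
    return time.perf_counter() - start > threshold_s
```

A non-zero TracerPid means some process is attached through ptrace, which is also why a program that traces itself blocks a second debugger from attaching.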
Sniffing with strace and ltrace
Overview
Both strace and ltrace are command-line tools used in reverse engineering to monitor an
application's interactions with the system. They help in understanding how a program behaves
without modifying its execution.
strace – System Call Monitoring
strace captures and displays system calls made by a running process, including their
arguments and return values. It provides insights into how a program interacts with the operating
system.
Use Cases
Tracking file operations ( open , read , write ).
Monitoring memory allocation ( mmap , brk ).
Analyzing process management ( fork , execve ).
Example Usage
strace ls
This command traces all system calls made by the ls command.
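Trace output gets long quickly, so it is often saved to a file (strace has an -o option for this) and summarized with a script. The sketch below assumes strace's default one-call-per-line format; the sample lines are hand-written for illustration, not captured output, and the summarize helper is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as: openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
SYSCALL_LINE = re.compile(
    r"^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>0x[0-9a-f]+|-?\d+|\?)"
)


def summarize(trace_text: str):
    """Count calls per syscall name and collect the lines that returned -1."""
    counts = Counter()
    failures = []
    for line in trace_text.splitlines():
        m = SYSCALL_LINE.match(line)
        if not m:
            continue  # skip signals, exit notices, and unfinished calls
        counts[m["name"]] += 1
        if m["ret"] == "-1":
            failures.append(line)
    return counts, failures


SAMPLE = '''\
execve("/usr/bin/ls", ["ls"], 0x7ffd1c2a3b40 /* 24 vars */) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/nonexistent", O_RDONLY) = -1 ENOENT (No such file or directory)
write(1, "a.txt\\n", 6) = 6
'''

counts, failures = summarize(SAMPLE)
```

Failed calls are often the most informative lines in a trace, since they show which files or resources the program probes for.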
ltrace – Library Call Monitoring
ltrace monitors library function calls made by a program to dynamically linked libraries (e.g.,
libc ). It helps to identify how an application interacts with external libraries.
Use Cases
Observing calls to standard C library functions ( printf , malloc ).
Understanding program logic based on API usage.
Debugging dynamically linked binaries.
Example Usage
ltrace ls
This command traces all library function calls made by the ls command.
Comparison of strace vs ltrace
Feature      strace                                  ltrace
Focus        System calls                            Library function calls
Purpose      OS interaction monitoring               API usage tracking
Dependency   None (also works on static binaries)    Dynamically linked binary (calls go through the PLT)
Q&A part
Non-Determinism and Record-Replay (Slide 6)
Common Sources of Non-Deterministic Behavior
1. Environment dependencies:
Factors like disk space, network availability, time, and random number generators affect
execution.
2. Security mechanisms:
Techniques like Address Space Layout Randomization (ASLR) change memory
addresses on each execution.
3. Multithreading behavior:
Thread scheduling and synchronization introduce timing variations, so threads can
interleave differently on each run.
Impact on Dynamic Analysis
Non-determinism hampers debugging by making program behavior inconsistent across runs,
leading to difficulties in reproducing issues.
Role of Record-Replay
Record-replay captures execution traces during a run, allowing reverse engineers to
deterministically reproduce the exact same execution, enabling effective debugging and script
development.
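The principle can be illustrated with a toy in which the only source of non-determinism is a random number generator. The Recorder and Replayer classes are hypothetical; real record-replay tools log far more (system call results, signals, thread scheduling).

```python
import random


class Recorder:
    """Runs live and logs every non-deterministic value handed to the program."""

    def __init__(self):
        self.log = []

    def rand_byte(self) -> int:
        value = random.randrange(256)
        self.log.append(value)
        return value


class Replayer:
    """Feeds the logged values back in the same order instead of sampling again."""

    def __init__(self, log):
        self._values = iter(log)

    def rand_byte(self) -> int:
        return next(self._values)


def program(env) -> int:
    # Stand-in for the analyzed program: its result depends on "random" input.
    return sum(env.rand_byte() for _ in range(4))


rec = Recorder()
first = program(rec)                 # differs from run to run
second = program(Replayer(rec.log))  # always equals `first`
```

Once a run is recorded, analysis scripts can be developed against the replay without worrying that the behavior of interest disappears on the next execution.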
Instrumentation (Slides 7-14)
What is Instrumentation?
Instrumentation is the insertion of code snippets into a program to collect execution data.
Goals of Instrumentation
Tracing – Monitoring function calls and data flows.
Security enforcement – Detecting suspicious behavior.
Performance analysis – Collecting runtime statistics.
Different Software Representations for Injection
Instrumentation can be applied at:
Source code level (before compilation).
Binary level (modifying compiled executables).
Intermediate representations (during compilation).
Abstraction Levels of Reporting
Instrumentation can report at different levels:
Instruction-level: Individual operations.
Basic block-level: Straight-line code sequences with a single entry and a single exit.
Function-level: Entry and exit of functions.
Configuration of PIN-Like Instrumentation
Instrumentation code: Specifies where and how code is injected.
Analysis code: Defines what data is collected at runtime.
PIN instruments code at runtime using a virtual machine, a JIT compiler, and a code cache. The
code cache lets already-instrumented code be reused instead of recompiled, which keeps the
performance overhead down.
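The split between instrumentation code and analysis code can be mimicked at the Python function level. This is an analogy only, not PIN's API: PIN injects calls into machine code through its JIT, whereas the hypothetical instrument helper below wraps Python callables.

```python
import functools
from collections import Counter

call_counts = Counter()


def count_call(name):
    """Analysis code: runs each time an instrumented function is entered."""
    call_counts[name] += 1


def _wrap(name, func, analysis):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        analysis(name)  # injected call, executed before the original code
        return func(*args, **kwargs)
    return wrapper


def instrument(namespace, predicate, analysis):
    """Instrumentation code: decides where analysis calls are injected."""
    for name, obj in list(namespace.items()):
        if callable(obj) and predicate(name):
            namespace[name] = _wrap(name, obj, analysis)


def encrypt(x):
    return x ^ 0x5A


def helper(x):
    return x + 1


# Hypothetical target: only functions named "encrypt" get instrumented.
target = {"encrypt": encrypt, "helper": helper}
instrument(target, lambda n: n == "encrypt", count_call)

for i in range(3):
    target["encrypt"](i)
target["helper"](0)
```

The instrumentation routine runs once per function, while the analysis routine runs on every call, so keeping the analysis routine small matters most for overhead.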
Dynamic Crypto Key Localization Attack (Slides 18, 19, 21, 22, 23)
Steps of the Attack
1. Identify Crypto Basic Blocks
Use heuristics like instruction patterns (bitwise operations) and execution count scaling.
2. Expand Detected Blocks
Analyze data dependencies to identify memory load operations.
3. Differentiate Key and Data Loading
Consider source (e.g., files vs. random number generators), buffer sizes, and value
consistency across runs.
4. Extract Key Values
Set breakpoints in a debugger and retrieve operand values during execution.
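Step 1 can be sketched as a filter over per-block profiles gathered by instrumentation. Everything concrete here is an assumption made for illustration: the BlockProfile layout, the mnemonic set, the thresholds, and the sample addresses and counts.

```python
from dataclasses import dataclass

BITWISE = {"xor", "and", "or", "not", "shl", "shr", "rol", "ror"}


@dataclass
class BlockProfile:
    address: int
    mnemonics: list   # instructions in the basic block
    count_small: int  # executions observed with a small input
    count_large: int  # executions observed with a larger input


def bitwise_ratio(block) -> float:
    """Fraction of the block's instructions that are bitwise operations."""
    if not block.mnemonics:
        return 0.0
    return sum(m in BITWISE for m in block.mnemonics) / len(block.mnemonics)


def crypto_candidates(blocks, size_factor, min_ratio=0.5, tolerance=0.25):
    """Keep bitwise-heavy blocks whose execution count scales with input size."""
    hits = []
    for b in blocks:
        if b.count_small == 0:
            continue
        growth = b.count_large / b.count_small
        scales = abs(growth - size_factor) <= tolerance * size_factor
        if bitwise_ratio(b) >= min_ratio and scales:
            hits.append(b.address)
    return hits


# Made-up profiles from two runs where the second input is 4x larger.
blocks = [
    BlockProfile(0x401000, ["mov", "xor", "rol", "xor", "add", "shr"], 16, 64),
    BlockProfile(0x401080, ["mov", "call", "cmp", "jne"], 1, 1),
    BlockProfile(0x4010C0, ["xor", "and", "mov"], 2, 2),
]
candidates = crypto_candidates(blocks, size_factor=4)
```

The third block is bitwise-heavy but runs a fixed number of times, so the scaling test rejects it; combining both heuristics reduces false positives from ordinary flag or mask handling.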
Emulation vs. Other Dynamic Analysis Techniques (Slide 27)
Emulation is preferred when:
The original environment is unavailable.
Fine-grained control over execution is required.
Detection by anti-debugging mechanisms must be avoided.
Differential Analysis (Slide 30)
Definition
Compares different executions of a program under varying conditions to detect differences.
Targets of Analysis
Input variations: Analyze how input changes affect behavior.
Execution environment: Observe effects of system-level differences.
Examples
Identifying control flow changes based on input.
Comparing binaries before and after patching.
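For the control flow example, a minimal sketch is to record the sequence of executed basic block addresses for two inputs and locate where the traces split. The traces below are invented, and first_divergence is a hypothetical helper.

```python
from itertools import zip_longest


def first_divergence(trace_a, trace_b):
    """Return (index, a, b) where two traces first differ, or None if identical."""
    for i, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return i, a, b
    return None


# Invented basic-block traces for an accepted and a rejected input.
accepted = [0x1000, 0x1010, 0x1040, 0x1080]
rejected = [0x1000, 0x1010, 0x1020, 0x10F0]

split = first_divergence(accepted, rejected)
only_accepted = set(accepted) - set(rejected)
```

The block executed just before the split (0x1010 here) is where the input-dependent decision is taken, which makes it a natural place to continue with a debugger.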
Inputs for Disassemblers (Slide 31)
Disassemblers can take the following inputs:
Executable binaries (e.g., ELF, PE).
Memory dumps captured from a running process.
Partial binary code extracted from firmware or disk images.
Fuzzing (Slide 32)
Definition
Fuzzing involves providing random or semi-random inputs to a program to discover
vulnerabilities.
Types of Fuzzing
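1. Black-box fuzzing: No prior knowledge of the application; inputs are generated blindly.
2. White-box fuzzing: Uses knowledge of the program's structure (e.g., source code) to derive inputs.
3. Grey-box fuzzing: A hybrid approach leveraging partial knowledge, typically lightweight
feedback such as code coverage.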
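A black-box fuzzer can be very small. In the sketch below the target function and its bug are hypothetical, and the random generator is seeded so that a run is repeatable.

```python
import random


def target(data: bytes) -> None:
    """Hypothetical program under test: mishandles inputs starting with 0x42."""
    if data and data[0] == 0x42:
        raise ValueError("parser crash")


def blackbox_fuzz(func, trials, seed=0, max_len=8):
    """Feed random byte strings to `func`; return the first input that raises."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randrange(1, max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            func(data)
        except Exception:
            return data  # crashing input, kept for later triage
    return None


crash = blackbox_fuzz(target, trials=100_000)
```

A grey-box fuzzer would extend this loop by keeping inputs that reach new code and mutating them, rather than generating every input from scratch.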
Symbolic and Concolic Execution (Slides 33-36)
Core Concepts
Path conditions: Constraints on the symbolic inputs that must hold for execution to follow a given path.
Path exploration: Attempting to cover all feasible execution paths.
Path Explosion Problem
The number of paths grows exponentially with the number of branches, making exhaustive analysis impractical.
Mitigating Path Explosion
Concolic execution uses concrete inputs to guide execution along selected paths while
maintaining symbolic tracking of the path conditions.
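These ideas can be tried on a toy with three independent branches. The branch predicates are invented, and the brute-force solve function merely stands in for the SMT solver a real symbolic execution engine would use.

```python
from itertools import product

# Branch predicates of a hypothetical function with three independent checks.
BRANCHES = [
    ("x > 10", lambda x: x > 10),
    ("x % 2 == 0", lambda x: x % 2 == 0),
    ("x < 100", lambda x: x < 100),
]


def path_conditions(branches):
    """Yield one path condition per combination of branch outcomes (2**n paths)."""
    for outcomes in product([True, False], repeat=len(branches)):
        yield [(text, pred, taken) for (text, pred), taken in zip(branches, outcomes)]


def solve(condition, domain=range(-200, 201)):
    """Brute-force stand-in for an SMT solver: find any input satisfying the path."""
    for x in domain:
        if all(pred(x) == taken for _, pred, taken in condition):
            return x
    return None  # no input in the domain follows this path


def concolic_step(x, branches):
    """Run concretely on x, then flip the last branch to get the next path to try."""
    condition = [(text, pred, pred(x)) for text, pred in branches]
    text, pred, taken = condition[-1]
    return solve(condition[:-1] + [(text, pred, not taken)])


paths = list(path_conditions(BRANCHES))
inputs = [solve(p) for p in paths]       # None marks an infeasible path
next_input = concolic_step(12, BRANCHES)  # input that leaves the x < 100 branch
```

Three branches already give eight path conditions, two of which are infeasible (x cannot be both at most 10 and at least 100); the concolic step avoids enumerating all of them by starting from one concrete run and negating a single branch.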
Limitations of Static and Dynamic Analysis (Slide 39)
Analysis Type      Limitations
Static Analysis    Cannot capture runtime behavior, limited by obfuscation.
Dynamic Analysis   Limited to observed execution paths, performance overhead.
Hybrid Analysis
Hybrid analysis combines both methods to leverage the strengths of each, such as running static
analysis for structural insights and dynamic analysis to validate runtime behavior.
Example: Using static analysis to identify critical functions and dynamic analysis to monitor their
runtime execution.