Elfy is an ELF parsing library that uses mmap (multi-platform) to map files into memory for reading. It supports both 32-bit and 64-bit ELF formats and handles little/big-endian byte ordering.
First, you must fetch the elfy library for your project:
zig fetch --save git+https://github.com/rawC1nnamon/zig.elfy#masterThen, after b.addExecutable(...), add the following code to your build.zig:
const elfy = b.dependency("elfy", .{
.target = target,
.optimize = optimize,
});
// Where 'exe' is your project executable
exe.root_module.addImport("elfy", elfy.module("elfy"));Elfy offers a rich API that supports parsing the most relevant ELF data. The binary can be initialized in two modes: .ReadOnly or .ReadWrite. The latter allows basic section content modification and the creation of a new ELF with these changes.
const std = @import("std");
const elfy = @import("elfy");
pub fn main() !void {
const allocator = std.heap.page_allocator;
// The binary must be a variable
var binary = try elfy.Elf.init("/bin/cat", .ReadOnly, allocator);
defer binary.deinit();
const header = binary.getHeader();
std.debug.print("Entry {x}, Machine: {s}\n", .{ header.getEntryPoint(), @tagName(header.getMachine()) });
// Iterator return values must also be variables
var sections = try binary.getIterator(elfy.ElfSection);
while (try sections.next()) |section| {
const name = try binary.getSectionName(section);
const section_type = section.getType();
std.debug.print("Name: {s}, Type: {s}\n", .{ name, @tagName(section_type) });
}
}Elf is the struct that contains the API and can be accessed this way: elfy.Elf. You must initialize the binary using the init function: Elf.init(path: []const u8, mode: MapMode, allocator: Allocator) (make sure to deinit the binary using defer binary.deinit()).
Elfy uses a hybrid parsing method (lazy and eager). Lightweight data, such as sections, string tables, and symbol names, are processed during init. On the other hand, data like symbol content, dynamic symbols, and relocations are lazily parsed using an Iterator, which you can obtain with the following function: try binary.getIterator(...). This function accepts iterable data structures, which include:
ElfProgram: Program/segment headers.ElfSection: Section headers.ElfSymbol: Symbol sections.ElfDynamic: Dynamic symbol sections.ElfRelocation: Relocation sections.
// elfy.ElfProgram, elfy.ElfSection, elfy.ElfSymbol ...
var sections = try binary.getIterator(elfy.ElfSection);
while (try sections.next()) |section| {
// ...
}The iterator returned by getIterator(...) contains a next() function to retrieve the next data structure (you should use a while loop to iterate) and a reset() function to reset the iterator. Additionally, the iterator includes flow control fields like index, remaining, count, etc. Avoid modifying these fields.
Lastly, the binary instance returned by init(...) contains useful functions that facilitate parsing complex structures:
createElf(name: []const u8) ElfError!voidgetHeader() ElfHeadergetIterator(comptime T: type) ElfError!Iterator(T)getSectionName(section: ElfSection) ElfError![]const u8getSectionData(section: ElfSection) ElfError![]const u8getSectionByIndex(index: u64) ElfError!ElfSectiongetSectionByName(name: []const u8) ElfError!ElfSectiongetSectionByType(section_type: SectionType) ElfError!ElfSectiongetSectionDataByName(name: []const u8) ElfError![]const u8modifySectionData(section: ElfSection, data: []const u8) ElfError!voidgetSymbolName(symbol: ElfSymbol) ElfError![]const u8getDynName(dynamic: ElfDynamic) ElfError!?[]const u8getRelocationLinkedSymbol(relocation: ElfRelocation, reloc_index: u64) ElfError!ElfSymbol
Elfy contains six data structures: ElfHeader, ElfProgram, ElfSection, ElfSymbol, ElfDynamic, and ElfRelocation. These are tagged unions that contain both 32-bit and 64-bit fields, but you don't need to use switch (...) to access the content. Each data structure has methods to retrieve information (see types). For example:
var symbols = try binary.getIterator(elfy.ElfSymbol);
while (try symbols.next()) |symbol| {
_ = symbol.getInfo();
_ = symbol.getSize();
_ = symbol.getBind();
_ = symbol.getType();
_ = symbol.getVisibility();
// ...
}There is one special case: relocation.getType(...) receives a machine value as a parameter (for example, header.getMachine()) and returns a tagged union containing the relocation type for each architecture. You must use switch (reloc_type) to unpack it. For example:
const header = binary.getHeader();
var relocations = try binary.getIterator(elfy.ElfRelocation);
while (try relocations.next()) |relocation| {
// relocation.getType() receives the ELF machine as parameter
const linked_symbol = try binary.getRelocationLinkedSymbol(relocation, relocations.index);
const symbol_name = try binary.getSymbolName(linked_symbol);
const reloc_type = try relocation.getType(header.getMachine());
switch (reloc_type) {
.X86_64 => |t| std.debug.print("Type: {s}, Symbol Name: {s}\n", .{ @tagName(t), symbol_name }),
// You can add more architectures if needed (ARM, RISCV, SPARC, etc.)
else => std.debug.print("[!] Unsupported architecture", .{}),
}
}Supported relocation types are:
@"386"S390ARMPPCPPC64MIPSALPHAAARCH64LOONGARCHRISCVSPARCX86_64
With the binary opened in .ReadWrite mode, you can modify the content of a section as long as the new content has a length less than or equal to the original buffer. For example:
const buf: []const u8 = &[_]u8{
0x48, 0xc7, 0xc0, 0x01, 0x00, 0x00, 0x00,
0x48, 0xc7, 0xc7, 0x01, 0x00, 0x00, 0x00,
0x48, 0x8d, 0x35, 0x0a, 0x00, 0x00, 0x00,
0x48, 0xc7, 0xc2, 0x0c, 0x00, 0x00, 0x00,
0x0f, 0x05, 0x48, 0xc7, 0xc0, 0x3c, 0x00,
0x00, 0x00, 0x48, 0xc7, 0xc7, 0x00, 0x00,
0x00, 0x00, 0x0f, 0x05, 0x48, 0x65, 0x6c,
0x6c, 0x6f, 0x20, 0x77, 0x6f, 0x72, 0x6c,
0x64,
0x0a,
// ...
};
const text = try binary.getSectionByName(".text");
try binary.modifySectionData(text, buf);
try binary.createElf("new_binary");I created this library with the features I needed. If you're looking for a data structure that this parser doesn't contain (e.g., ElfNote, relr, etc.), you can create custom functions to parse it more easily. For example, if you want to parse notes:
fn parseNote(note: []const u8) !elf.Elf64_Nhdr { ... }
// ...
var sections = try binary.getIterator(elfy.ElfSection);
while (try sections.next()) |section| {
const raw_note = if (section.getType() == .SHT_NOTE) try binary.getSectionData(section);
const note = try parseNote(raw_note);
}