Nice visualization of the ELF headers. However, the article has a few inaccuracies:

- ELF files are used not only for executables, but also object files, shared libraries, and also coredumps. Different parts of the ELF format serve different purposes, although there is a lot of overlap.
- The program headers don't state the location of .text, but indicate the area of the file that should be mapped into memory.
- Dynamic linking doesn't require section headers. The dynamic loader (ld.so) parses the program headers for a PT_DYNAMIC entry, which refers to the .dynamic section (which in turn refers to .dynsym, .dynstr, .rela.dyn, .init_array, etc.).
- Relocation sections (what is a relocation symbol?) are required for static linking, where every section with relocations gets its own relocation section, so .text gets .rela.text. Also, in object files, sections must use relocations to refer to other sections. Executables don't need to have relocations.
- The alignment of PT_LOAD segments must be at least the page size: the kernel or loader will use mmap to map the file, so alignments smaller than the page size won't work.
- The first section table entry must be of type SHT_NULL. The magic value SHN_UNDEF (=0) is used to refer to undefined symbols, so referring to the first section in, e.g., the symbol table, is not possible.

Although not required for a minimal file, any "modern" ELF executable should have a PT_GNU_STACK program header with flags read+write, otherwise the stack will get mapped as executable memory region, thereby creating a large and often avoidable attack vector.
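To make the PT_DYNAMIC and PT_GNU_STACK points concrete, here is a rough Python sketch that walks the program headers without touching any section header. It assumes a little-endian ELF64 image already read into `data`; the helper names (`read_program_headers`, `find_dynamic`, `stack_marked_nonexecutable`) are made up for illustration, and what the kernel actually does when PT_GNU_STACK is missing varies by architecture and kernel version.

```python
import struct
from collections import namedtuple

PT_LOAD, PT_DYNAMIC, PT_GNU_STACK = 1, 2, 0x6474E551
PF_X, PF_W, PF_R = 1, 2, 4

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian
Phdr = namedtuple("Phdr", "type flags offset vaddr paddr filesz memsz align")


def read_program_headers(data):
    """Return the program headers of a little-endian ELF64 image (hypothetical helper)."""
    fields = EHDR.unpack_from(data, 0)
    ident, phoff, phentsize, phnum = fields[0], fields[5], fields[9], fields[10]
    # e_ident[4] == 2 means ELFCLASS64, e_ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    return [Phdr(*PHDR.unpack_from(data, phoff + i * phentsize))
            for i in range(phnum)]


def find_dynamic(phdrs):
    """Return the PT_DYNAMIC entry the dynamic loader would start from, or None."""
    return next((p for p in phdrs if p.type == PT_DYNAMIC), None)


def stack_marked_nonexecutable(phdrs):
    """True only if a PT_GNU_STACK entry exists and does not carry PF_X."""
    stack = next((p for p in phdrs if p.type == PT_GNU_STACK), None)
    return stack is not None and not (stack.flags & PF_X)
```

Something like `stack_marked_nonexecutable(read_program_headers(open(path, "rb").read()))` would then flag binaries that are missing the header or request an executable stack.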
> The program headers don't state the location of .text, but indicate the area of the file that should be mapped into memory.
Specifically, the PT_LOAD segment does that. Other segments are used for other purposes. Linkers generally don't generate ELFs with PT_LOAD segments covering the section header table but one could patch the ELF so that the last PT_LOAD segment covers the table or even the entire file. That way the location of the .text section becomes reachable to the running program via the section header table.
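A quick way to check whether a given binary already has its section header table inside a PT_LOAD segment is sketched below. It is only an illustration for little-endian ELF64 files, and `mapped_shdr_table_address` is a name I made up. For position-independent executables the returned value is relative to the load base, not an absolute address.

```python
import struct

PT_LOAD = 1
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian


def mapped_shdr_table_address(data):
    """Return the p_vaddr-based address of the section header table if some
    PT_LOAD segment covers all of it in the file, else None (hypothetical helper)."""
    eh = EHDR.unpack_from(data, 0)
    phoff, shoff = eh[5], eh[6]
    phentsize, phnum, shentsize, shnum = eh[9], eh[10], eh[11], eh[12]
    if shoff == 0 or shnum == 0:
        return None  # no section header table at all
    table_end = shoff + shentsize * shnum
    for i in range(phnum):
        p_type, _, p_offset, p_vaddr, _, p_filesz, _, _ = PHDR.unpack_from(
            data, phoff + i * phentsize)
        # The whole table must sit inside the file-backed part of the segment.
        if p_type == PT_LOAD and p_offset <= shoff and table_end <= p_offset + p_filesz:
            return p_vaddr + (shoff - p_offset)
    return None
```

On a typical linker-produced executable this should come back as `None`, which matches the observation that linkers leave the table outside every PT_LOAD.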
There's also the surprisingly useful PT_NULL segment type. PT_NULL entries are essentially just placeholders whose remaining program header fields are undefined, which makes them excellent targets for patching. Scripting the linker to output these segments proved to be quite difficult, so I just asked for a linker command line option instead. LLVM and GNU ld weren't interested, but mold quickly added this feature.
A PT_NULL segment allows patching in a PT_LOAD segment for any data or metadata the programmer needs. It's also possible to create custom segments just like GNU did since there's a truly massive numeric range reserved just for that. These two facts enable some really cool stuff:
https://www.matheusmoreira.com/articles/self-contained-lone-...
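The patching step itself is small. Below is a minimal sketch, not the code from the linked article: it assumes a little-endian ELF64 image held in a `bytearray`, and `patch_null_into_load` plus its parameters are names invented for the example. Appending the actual data to the file, and picking an offset and address that satisfy the loader, are left to the caller.

```python
import struct

PT_NULL, PT_LOAD = 0, 1
PF_R = 4
# Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, align
PHDR = struct.Struct("<IIQQQQQQ")


def patch_null_into_load(image, offset, vaddr, size, flags=PF_R, align=0x1000):
    """Overwrite the first PT_NULL program header in `image` (a bytearray)
    with a PT_LOAD describing `size` bytes at file `offset`, mapped at `vaddr`.
    Returns the index of the patched entry (hypothetical helper)."""
    phoff, = struct.unpack_from("<Q", image, 32)             # e_phoff
    phentsize, phnum = struct.unpack_from("<HH", image, 54)  # e_phentsize, e_phnum
    for i in range(phnum):
        pos = phoff + i * phentsize
        p_type, = struct.unpack_from("<I", image, pos)
        if p_type == PT_NULL:
            # Caller must choose vaddr/offset so the loader accepts the segment.
            PHDR.pack_into(image, pos, PT_LOAD, flags,
                           offset, vaddr, vaddr, size, size, align)
            return i
    raise LookupError("no PT_NULL placeholder available")
```

Because the entry is rewritten in place, nothing else in the file moves, which is the whole appeal of having the linker reserve the placeholder up front.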
> The alignment of PT_LOAD segments must be at least the page size: the kernel or loader will use mmap to map the file, so alignments smaller than the page size won't work.
In addition to that, they must also be sorted! For some weird reason, PT_LOAD segments cannot be in arbitrary order even if they don't overlap: the program header table has to list them in ascending p_vaddr order.
Violating these requirements causes some truly excruciating crashes. The executable would somehow segfault before a single instruction executed. This uber segfault brought the likes of GDB to its knees, and I was reduced to pasting readelf output on Stack Overflow.
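In hindsight, a sanity check run on the patched file before executing it would have saved some pain. Here is a rough sketch of the rules mentioned in this thread, plus the usual requirement that p_vaddr and p_offset agree modulo p_align. The `Phdr` tuple and `check_load_segments` are illustrative names, and the 4096-byte page size is an assumption that does not hold on every system.

```python
from collections import namedtuple

PT_LOAD = 1
# Only the fields this check needs; a real Elf64_Phdr has more.
Phdr = namedtuple("Phdr", "type offset vaddr align")


def check_load_segments(phdrs, page_size=4096):
    """Return a list of human-readable problems found in the PT_LOAD entries."""
    problems = []
    loads = [p for p in phdrs if p.type == PT_LOAD]
    # Rule 1: PT_LOAD entries must appear in ascending p_vaddr order.
    for prev, cur in zip(loads, loads[1:]):
        if cur.vaddr < prev.vaddr:
            problems.append(
                f"PT_LOAD at {cur.vaddr:#x} listed after {prev.vaddr:#x}")
    for p in loads:
        # Rule 2 (as stated in the thread): alignment at least the page size.
        if p.align < page_size:
            problems.append(
                f"PT_LOAD at {p.vaddr:#x} has p_align {p.align:#x} below page size")
        # Rule 3: virtual address and file offset must be congruent mod p_align.
        elif p.vaddr % p.align != p.offset % p.align:
            problems.append(
                f"PT_LOAD at {p.vaddr:#x}: p_vaddr and p_offset differ mod p_align")
    return problems
```

An empty list only means these particular rules pass; it says nothing about overlaps or whether the mapped contents make sense.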