by boricj
642 days ago
> You can't use individual functions without knowing the data structures they operate on, and info about data structures is usually not present in the final binary.

It turns out you can. Linkers do not care about types or data structures; all they do is lay out sections (a fancy name for arrays of bytes) inside a virtual address space and fix up relocations. I've written case studies on my blog where I've successfully carved whole pieces out of a program and reused them without reverse-engineering them. I've even made a native port of a ~100 KiB proprietary Linux a.out x86 program from 1995 to Windows, and all I had to do was write 50 lines of C to thunk between the glibc 1.xx-flavored object file and the MinGW runtime.

One user of my tooling managed to delink a 7 MiB executable for a 2009 commercial Windows video game in six weeks (including the time required to fix all the bugs in my alpha-quality COFF exporter and x86 analyzer), leaving out the C standard library. They then relinked it at a different base address, and the new executable works so well that it's functionally indistinguishable from the original. They didn't reverse-engineer the thousands of functions or the hundreds of kilobytes of data inside that program to do so.

This is complete heresy according to conventional computer science, which is why you can't apply it here. I'd be happy to talk at length about it, but without literature on the topic I'm forced to explain this from first principles every time, and Hacker News isn't the place to write a whole thesis.
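To make the "sections plus relocations" point concrete, here is a minimal toy sketch in Python. It is not the tooling mentioned above: the `SECTIONS` and `RELOCS` tables and the `link` function are hypothetical names invented for illustration, and the only relocation kind modeled is an absolute 32-bit little-endian address. The point it shows is that the linking step patches bytes at known offsets without ever knowing what the code or data mean.

```python
import struct

# Hypothetical toy "object file": two opaque byte arrays.
SECTIONS = {
    ".text": bytes.fromhex("a100000000c3"),  # x86: mov eax, [abs32]; ret
    ".data": struct.pack("<I", 42),          # four bytes of data
}

# Each entry: (section to patch, offset of the 32-bit field,
#              section the field must point at, addend).
RELOCS = [(".text", 1, ".data", 0)]


def link(base: int) -> bytes:
    """Lay the sections out back to back at `base` and apply relocations."""
    # Step 1: assign an address to every section.
    addr = {}
    cursor = base
    for name, data in SECTIONS.items():
        addr[name] = cursor
        cursor += len(data)

    # Step 2: patch every relocated field with the final address.
    image = {name: bytearray(data) for name, data in SECTIONS.items()}
    for section, offset, target, addend in RELOCS:
        value = (addr[target] + addend) & 0xFFFFFFFF
        struct.pack_into("<I", image[section], offset, value)

    return b"".join(bytes(image[name]) for name in SECTIONS)


if __name__ == "__main__":
    # Same bytes, two different base addresses: only the patched field differs.
    print(link(0x00400000).hex())
    print(link(0x10000000).hex())
```

Delinking, in these terms, amounts to recovering something like `RELOCS` from an already-linked image so the same bytes can be laid out again somewhere else.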
If I were to guess, you're saying that you reverse-engineer the API boundary without reverse-engineering the implementation. But figuring out what the API contract is without documentation seems intractable for most API boundaries.