Hacker News
by LazerBear 4407 days ago
This made me wonder if it's possible to automatically reverse-engineer a small binary file into human-readable (and understandable) source code. Assuming you know the language and compiler used (and all of its quirks and optimizations), and considering that human-written programs aren't that random and their patterns are most likely predictable, I think it should be possible, though not at all trivial. Are there any projects attempting this?
2 comments

Labels and comments go a long way to making an assembly project readable. I don't know how an automatic tool could interpret the human intention behind a label.
It's not that difficult, but I think the main obstacle is gathering and representing the collection of knowledge in a useful form.

There was a disassembler called Sourcer that would annotate code with a set of predetermined comments based on its knowledge of the PC hardware. For example a sequence of instructions that enabled the interrupt controller by setting a specific bit in its register would be identified and get the comment "enable interrupt controller". I seem to remember IDA can do the same thing, although it's been a while since I last used it.
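To make the idea concrete, here is a minimal Python sketch of that kind of rule-based annotation. It is not how Sourcer or IDA actually work internally: the `RULES` table, the `annotate` function, and the textual instruction format are all made up for illustration, and a real tool would match on decoded operands and register values rather than on strings.

```python
# Toy annotator. The rules below are illustrative assumptions, not the
# actual knowledge base of Sourcer or IDA.
RULES = [
    # (instruction sequence to match, comment attached to its last instruction)
    (("mov al, 0x20", "out 0x20, al"), "send end-of-interrupt to the master PIC"),
    (("cli",), "disable maskable interrupts"),
    (("sti",), "enable maskable interrupts"),
]


def annotate(instructions):
    """Return the listing with a comment appended wherever a rule matches."""
    # Normalize case and whitespace so "MOV  AL, 0x20" matches "mov al, 0x20".
    normalized = [" ".join(line.lower().split()) for line in instructions]
    comments = {}
    for pattern, comment in RULES:
        n = len(pattern)
        for i in range(len(normalized) - n + 1):
            if tuple(normalized[i:i + n]) == pattern:
                # Earlier rules win if two rules end on the same line.
                comments.setdefault(i + n - 1, comment)
    return [
        f"{line:<16}; {comments[i]}" if i in comments else line
        for i, line in enumerate(instructions)
    ]


if __name__ == "__main__":
    listing = ["cli", "MOV  AL, 0x20", "out 0x20, al", "sti", "ret"]
    print("\n".join(annotate(listing)))
```

The hard part is not the matching loop but the rule table, which is the "collection of knowledge" problem mentioned upthread.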

The Hex-Rays Decompiler attempts to do this for analysis purposes (that is, the resulting code is intended to be human-readable, but not necessarily compilable): https://www.hex-rays.com/products/decompiler

Unfortunately, I've not sprung for a license, so I can't comment on its efficacy.