| > Decompilation is often the least important (and least reliable) part of IDA/Ghidra This is something everyone who uses decompilers says, and it shows how low trust in decompilers is. Expectations have always been rather low. I've been there, but this does not have to be the case: the whole reason we started rev.ng is to prove that expectations can be raised. Apart from accuracy, which is difficult but is engineering work, why don't decompilers emit syntactically valid C? Have you ever tried to recompile code from any decompiler? It's a terrible experience. rev.ng only emits valid C code, and we test it with a bunch of warning flags (-Wall -Wextra): https://github.com/revng/revng-c/blob/develop/share/revng-c/... Another key topic: data structures. When reversing, I spend half of the time renaming things and half of the time detecting data structures. The help I get from decompilers with the latter is basically none. rev.ng, by default, detects data structures on the whole binary, interprocedurally, including arrays. See the linked list example in the blog post. We also have plans to detect enums and other stuff. Clearly we're not there yet, we still need to work on robustness, but our goal is to increase confidence in decompilers and actually offer features that save time. Certain tools have made progress in improving the UI and the scripting experience, but there are other things to do beyond that. I see this a bit like the transition from the phase in which C developers were using macros to ensure things were being inlined/unrolled to the phase where they stopped doing that because compilers got smart enough to do the right thing and to do it much more effectively. |
I don't want to look at assembly code. I'd rather see expression trees, expressed in C-like syntax, than try to piece together variables from two-address or three-address instructions. Looking at assembly tends to lead to brain farts like "wait, was the first or second operand the output operand?" (really, fuck AT&T syntax) or "wait, does ja implement ugt or sgt?"
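For the record, on x86 `ja` ("jump if above") is the unsigned test and `jg` ("jump if greater") is the signed one, so `ja` is ugt. A minimal C sketch of why the distinction matters: the same bit pattern compares differently depending on which interpretation the operation picks. The function names `ugt` and `sgt` are illustrative only, borrowed from LLVM's `icmp` predicate names, and the cast assumes a two's-complement target.

```c
#include <stdint.h>

/* Unsigned greater-than: the condition x86 `ja` checks after `cmp`
 * (CF = 0 and ZF = 0). */
int ugt(uint32_t a, uint32_t b) {
    return a > b;
}

/* Signed greater-than: the condition x86 `jg` checks after `cmp`
 * (ZF = 0 and SF = OF). The cast reinterprets the same 32 bits as a
 * signed value; this assumes two's complement, which holds on
 * mainstream compilers but is implementation-defined before C23. */
int sgt(uint32_t a, uint32_t b) {
    return (int32_t)a > (int32_t)b;
}
```

With `a = 0xFFFFFFFF` and `b = 1`, `ugt(a, b)` is 1 (4294967295 > 1) while `sgt(a, b)` is 0 (-1 > 1 is false).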
So that means I want to look at something vaguely C-like. But the problem is that the C type system is too powerful for decompilers to robustly lift to, and the resulting code is generally at best filled with distracting, wait-I-can-fix-this excessive casting and at worst just wrong. And when it's wrong, I have to resort to staring at the assembly, which (for Ghidra at least) means throwing away a lot of the notes I've accumulated because they don't correlate back to the underlying assembly.
So what I really want isn't something that can emit recompilable C code; that's optimizing for something that doesn't help me in the end. What I want is robust decompilation to something that lets me ignore the assembly entirely. I'm a compiler writer, I can handle a language where integers aren't signed but the operations are.
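To make "integers aren't signed but the operations are" concrete, here is a rough sketch in C of that model, in the style of LLVM IR's `udiv`/`sdiv` and `lshr`/`ashr`. The `i32` typedef and the four function names are hypothetical, chosen for illustration; this is not the output format of any particular decompiler.

```c
#include <stdint.h>

/* A "signless" 32-bit value: just bits, no signedness in the type. */
typedef uint32_t i32;

/* Unsigned division: operands are read as values in 0..2^32-1.
 * Caller must ensure b != 0. */
i32 udiv(i32 a, i32 b) {
    return a / b;
}

/* Signed division: the same bits are read as two's-complement values.
 * Caller must ensure b != 0 and avoid INT32_MIN / -1 (undefined in C).
 * The casts to int32_t assume a two's-complement target. */
i32 sdiv(i32 a, i32 b) {
    return (i32)((int32_t)a / (int32_t)b);
}

/* Logical shift right: vacated high bits are filled with zeros.
 * Valid for n in 0..31. */
i32 lshr(i32 a, unsigned n) {
    return a >> n;
}

/* Arithmetic shift right: vacated high bits are filled with copies of
 * the sign bit. Written with unsigned arithmetic only, so it does not
 * rely on how the compiler shifts negative values. Valid for n in 0..31. */
i32 ashr(i32 a, unsigned n) {
    i32 fill = (a & 0x80000000u) ? ~(UINT32_MAX >> n) : 0u;
    return (a >> n) | fill;
}
```

For the bit pattern `0xFFFFFFF8` (4294967288 unsigned, -8 signed), `udiv(x, 2)` and `lshr(x, 1)` both give `0x7FFFFFFC`, while `sdiv(x, 2)` and `ashr(x, 1)` both give `0xFFFFFFFC` (-4). Addition, subtraction, and multiplication need no such split because their low 32 bits are the same under either reading, which is why only some operations carry a signedness.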