by moyix
1241 days ago
I don't really see this as a problem. Once you have a first-cut deobfuscation from this, you can refine it with other methods, like comparing input/output examples between the original and the deobfuscated version, or even using something more sophisticated like symbolic execution [1] or differential fuzzing [2] to systematically look for divergence between the behavior of the two. You could even feed the results back into ChatGPT and ask it to redo the deobfuscation given a failing test case.
Such testing won't be able to prove that the two are equivalent (unless it's exhaustive), but with decent coverage of the original you can get good confidence. The goal of deobfuscation is usually understanding, so I'm not sure you need strong guarantees of perfect semantic equivalence with no human intervention or judgment. And of course, existing deobfuscators have bugs and aren't guaranteed to preserve semantics either.
[1] https://en.wikipedia.org/wiki/Symbolic_execution
[2] https://en.wikipedia.org/wiki/Differential_testing
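To make the differential-testing idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `original` stands in for the obfuscated routine, `deobfuscated` for the candidate rewrite, and `find_divergence` is an illustrative helper, not part of any real tool. It only shows the shape of the check: run both on the same random inputs and report the first disagreement.

```python
import random

def original(data: bytes) -> int:
    # Hypothetical stand-in for the obfuscated routine: a 16-bit additive
    # checksum padded with no-op arithmetic.
    acc = 0
    for i in range(len(data)):
        acc = (acc + (data[i] ^ 0) * 1) & 0xFFFF
    return acc

def deobfuscated(data: bytes) -> int:
    # Hypothetical candidate produced by the deobfuscation pass.
    return sum(data) & 0xFFFF

def find_divergence(f, g, trials=1000, max_len=64, seed=0):
    """Return the first random input on which f and g disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        if f(data) != g(data):
            return data
    return None

counterexample = find_divergence(original, deobfuscated)
if counterexample is None:
    print("no divergence found")
else:
    print(f"diverges on {counterexample!r}")
```

A `None` result only means no divergence showed up in the sampled inputs; it is not a proof of equivalence. A returned counterexample is the kind of failing test case that could be fed back into the deobfuscation step.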
|
I meant what I said. I expect ChatGPT would happily output a substring search algorithm for the accept loop of an HTTP server if you just put enough "haystack" and "needle" words in the obfuscated code. How are you supposed to "refine" that into the truth?
To the extent that there is an answer, it is to completely ignore the ChatGPT output and use existing tools. Which is to say, ChatGPT would be worse than useless at that point.
I'm not saying ChatGPT will be slightly off, and maybe the obfuscator can kick it to be another 5 or 10% wrong. I'm saying, it is likely trivial to update the obfuscator to make ChatGPT utterly wrong, in every detail, up to and including the entire fundamental nature of the code.
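As a toy illustration of how cheaply names can point the wrong way, here is a hedged Python sketch. Both functions are made up for this example: `find_needle` uses search-flavored identifiers but actually counts mismatches between its two arguments, while `honest_search` does what the names would suggest.

```python
def find_needle(haystack: bytes, needle: bytes) -> int:
    # The names suggest substring search, but the body computes a mismatch
    # count: the length difference plus the positions where the bytes differ.
    mismatches = abs(len(haystack) - len(needle))
    for h, n in zip(haystack, needle):
        if h != n:
            mismatches += 1
    return mismatches

def honest_search(haystack: bytes, needle: bytes) -> int:
    # What the identifiers imply: index of the first occurrence, or -1.
    return haystack.find(needle)

print(find_needle(b"hello world", b"world"))
print(honest_search(b"hello world", b"world"))
```

The two calls return different values for the same arguments, so a reading based on the identifiers alone describes a different function from the one that actually runs.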