Hacker News
by jerf 1241 days ago
You seem to have just blipped by the deobfuscation from ChatGPT being actively wrong.

I meant what I said. I expect ChatGPT would happily output a substring search algorithm for the accept loop of an HTTP server if you just put enough "haystack" and "needle" words in the obfuscated code. How are you supposed to "refine" that into the truth?

To the extent that there is an answer, it is: completely ignore the ChatGPT output and use existing tools. Which is to say, ChatGPT would be worse than useless at that point.

I'm not saying ChatGPT will be slightly off, and maybe the obfuscator can kick it to be another 5 or 10% wrong. I'm saying, it is likely trivial to update the obfuscator to make ChatGPT utterly wrong, in every detail, up to and including the entire fundamental nature of the code.

1 comment

When it gets it entirely wrong, that will be trivially detected by an I/O example, no? So I don't see that as dangerous, just inconvenient (it sometimes doesn't work, but you know when it doesn't work). You can also run an existing semantics-preserving deobfuscator first and feed its output, rather than the original, to an LLM deobfuscator.
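To make the I/O check concrete, here is a rough Python sketch. Every name in it is made up for illustration: `obfuscated` stands in for code whose identifiers were chosen to mislead, `llm_guess` for a deobfuscation that trusted those identifiers, and `faithful_rewrite` for one that got the behavior right. It only does random differential testing on short strings, so passing means "no counterexample found", not a proof of equivalence.

```python
import random


def obfuscated(haystack, needle):
    # Hypothetical obfuscated original: the parameter names suggest a
    # substring search, but the body only adds the two lengths.
    return len(haystack) + len(needle)


def llm_guess(haystack, needle):
    # Hypothetical wrong deobfuscation that believed the names.
    return haystack.find(needle)


def faithful_rewrite(text, suffix):
    # Hypothetical correct deobfuscation with honest names.
    return len(text) + len(suffix)


def random_string(rng, max_len=6):
    # Short strings over a tiny alphabet, so real substring hits also occur.
    return "".join(rng.choice("ab") for _ in range(rng.randint(0, max_len)))


def first_mismatch(reference, candidate, trials=200, seed=0):
    """Return the first input pair on which the two functions disagree,
    or None if no disagreement shows up within `trials` random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        args = (random_string(rng), random_string(rng))
        if reference(*args) != candidate(*args):
            return args
    return None


if __name__ == "__main__":
    print("wrong guess disagrees on:", first_mismatch(obfuscated, llm_guess))
    print("faithful rewrite disagrees on:", first_mismatch(obfuscated, faithful_rewrite))
```

A candidate that is wrong about the fundamental nature of the code fails this almost immediately. The harder case is a candidate that is only wrong on rare inputs, which random testing like this can easily miss.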

If you're saying that obfuscators can eventually adapt, then sure. So can deobfuscators. This particular problem is kind of inherently an arms race.