by poincaredisk 430 days ago
I'm surprised by this. As a professional reverse engineer I've actually found LLMs to be terrible at deobfuscation of JS (especially in the context of JS malware). But maybe my requirements are higher and it's actually OK for occasional use against weak packers?
2 comments

Have you seen this?

https://github.com/jehna/humanify

What they do is ground the LLM to the AST with Babel to ensure you still get the same shape of AST out of your deobfuscation pass. Probably this tool could be cleaned up, made to work with multiple LLM and parser backends, have its prompts improved, &c.
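To make the "same shape of AST" idea concrete, here is a rough sketch of that kind of check. It is not humanify's code and does not use Babel: it uses Python's standard `ast` module on Python source as a stand-in, and `normalized_dump` and `same_shape` are hypothetical helper names. The idea is to blank out every identifier and then compare what is left of the tree.

```python
import ast


def normalized_dump(source: str) -> str:
    """Dump the AST of `source` with every identifier replaced by "_"."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            node.id = "_"
        elif isinstance(node, ast.arg):
            node.arg = "_"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            node.name = "_"
    # ast.dump omits line/column attributes by default, so formatting is ignored.
    return ast.dump(tree)


def same_shape(before: str, after: str) -> bool:
    """True if the two sources differ only in identifier names and layout."""
    return normalized_dump(before) == normalized_dump(after)


obfuscated = "def a(b, c):\n    return b * c + 1\n"
renamed = "def scale_and_offset(value, factor):\n    return value * factor + 1\n"
tampered = "def scale_and_offset(value, factor):\n    return value * factor\n"

print(same_shape(obfuscated, renamed))   # a pure rename
print(same_shape(obfuscated, tampered))  # the "+ 1" was dropped
```

One caveat with this sketch: because every name collapses to "_", it would not notice a rename that merges two distinct variables into one. A real tool would also have to check that bindings stay distinct.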

This is a great idea! But it's more about having the LLM name functions and variables than having the LLM deobfuscate. The (traditional) deobfuscation steps (e.g. unpacking, de-flattening, de-virtualization etc.) are done by 100% precise, human-made Babel plugins and are totally unrelated to an LLM.
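A minimal sketch of that division of labour, with loud assumptions: the model only proposes a name mapping, and a deterministic AST pass applies it. This uses Python's standard `ast` module rather than Babel, the `suggested` dict is a hard-coded stand-in for whatever an LLM would return, and `Renamer` and `apply_names` are hypothetical names, not anything from humanify.

```python
import ast


class Renamer(ast.NodeTransformer):
    """Apply a fixed old-name -> new-name mapping; touches nothing else."""

    def __init__(self, mapping: dict[str, str]):
        self.mapping = mapping

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)  # continue into arguments and body
        return node


def apply_names(source: str, mapping: dict[str, str]) -> str:
    """Rename identifiers in `source` and return the regenerated code."""
    tree = Renamer(mapping).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


# Stand-in for LLM output: the model never edits the code itself.
suggested = {"a": "scale_and_offset", "b": "value", "c": "factor"}
result = apply_names("def a(b, c):\n    return b * c + 1\n", suggested)
print(result)
```

The mapping here is applied globally and ignores scoping, so two different variables that happen to share a name would be renamed together. A real pass would rename per binding.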
I've used it for small files and it did very well prettifying, naming the variables and adding comments for context. But I can imagine it doing a bad job with large files.