Hacker News
by dmvinson 2383 days ago
I respect your point and semi-agree, but as someone who ran a small business in high school that usually involved reverse engineering obfuscated Javascript, I think you're overstating how hard it is to follow the logic of Javascript blobs. Yes, whole program flows can be insanely difficult to follow, but narrowing in on the logic of key functions is often what one needs when trying to learn from others' code.

So are you saying that reverse engineering javascript is easy? Or that it's easier than reverse engineering wasm?

I don't know much about web assembly, but x86, which is much more complicated with thousands of instructions, has been successfully reverse engineered basically since forever. There are decompilers that can automatically reconstruct source code in C or C++ from a binary blob.

With javascript, by comparison, the best you can hope for is to format the code so it's in a more readable structure, but that isn't going to untangle purposefully obfuscated logic. Add to that the fact that even a regular javascript program is an untyped mess, and it becomes clear that anyone specifically trying to confuse readers will have a very easy time of doing so. There are a lot of messy things you can do in javascript, almost COBOL levels of messy.
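
To make the formatting point concrete, here is a minimal sketch. The obfuscated function below is a made-up example (the names `_0x3f` and `_0xa1` and the string-table trick are illustrative assumptions, not output from any particular obfuscator). It is already pretty-printed, yet the intent only becomes obvious once a human renames things and inlines the lookups, as in the hypothetical `xorString`.

```javascript
// Hypothetical obfuscator output, already run through a formatter.
// Property names are hidden behind a string table.
const _0x3f = ['length', 'charCodeAt', 'fromCharCode'];

function _0xa1(_0xb, _0xc) {
  let _0xd = '';
  for (let _0xe = 0; _0xe < _0xb[_0x3f[0]]; _0xe++) {
    _0xd += String[_0x3f[2]](_0xb[_0x3f[1]](_0xe) ^ _0xc);
  }
  return _0xd;
}

// The same logic after manual renaming and inlining of the string table:
// XOR every character code of `text` with `key`.
function xorString(text, key) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) ^ key);
  }
  return out;
}

// Both functions compute the same thing.
console.log(_0xa1('hello', 7) === xorString('hello', 7)); // true
```

The formatter gets you the first version; getting to the second is manual work, though for a single small function like this it is quite doable.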

Also, I'm curious about this

> but as someone who ran a small business in high school that usually involved reverse engineering obfuscated Javascript,

What type of clients paid you to reverse engineer obfuscated javascript? Malware research? Something else?

> I don't know much about web assembly, but x86, which is much more complicated with thousands of instructions, has been successfully reverse engineered basically since forever. There are decompilers that can automatically reconstruct source code in C or C++ from a binary blob.

That's a bit of an overstatement.

Disassembly of native executables is essentially a solved problem, and has been for decades. There is some variation in terms of how you define disassembly and how you deal with code that specifically tries to defeat disassembly, but it's solved enough that objdump -d is a decently effective tool.
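
As a rough illustration of why disassembly is the mechanical part, here is a sketch of a linear-sweep disassembler. To keep it short it uses a tiny subset of WebAssembly opcodes rather than x86, and the `OPCODES` table, `readU32Leb`, and `disassemble` are names made up for this example. Real tools also handle variable-length encodings, data mixed into code, and anti-disassembly tricks.

```javascript
// Opcode table for a tiny subset of WebAssembly instructions.
// `imm: true` marks opcodes followed by an unsigned LEB128 index.
const OPCODES = {
  0x0b: { name: 'end', imm: false },
  0x20: { name: 'local.get', imm: true },
  0x21: { name: 'local.set', imm: true },
  0x6a: { name: 'i32.add', imm: false },
  0x6b: { name: 'i32.sub', imm: false },
  0x6c: { name: 'i32.mul', imm: false },
};

// Decode one unsigned LEB128 value starting at `pos`.
function readU32Leb(bytes, pos) {
  let value = 0;
  let shift = 0;
  let byte;
  do {
    if (pos >= bytes.length) throw new Error('truncated immediate');
    byte = bytes[pos++];
    value |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return { value: value >>> 0, next: pos };
}

// Linear sweep: decode one instruction after another from the start.
function disassemble(bytes) {
  const lines = [];
  let pos = 0;
  while (pos < bytes.length) {
    const op = OPCODES[bytes[pos]];
    if (!op) throw new Error('unknown opcode 0x' + bytes[pos].toString(16));
    pos++;
    if (op.imm) {
      const { value, next } = readU32Leb(bytes, pos);
      lines.push(op.name + ' ' + value);
      pos = next;
    } else {
      lines.push(op.name);
    }
  }
  return lines;
}

// Body of a function that adds its two parameters.
console.log(disassemble([0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b]).join('\n'));
```

This is essentially a table lookup, which is why the hard problems sit further up the stack in decompilation.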

Decompilation is more difficult. There were academic-quality decompilers by around the 90s, but these weren't really usable and tended to break on anything more complicated than toy examples. The JVM breathed new life into decompilers, and it's not until this point that you get decompilers that can routinely output code that is recompilable (and only in the Java domain).

In the mid-noughts, decompilation efforts returned to targeting native binaries. This was helped by the developers of IDA Pro (the main tool used for reverse engineering) building a decompiler view into their application. There have also been more efforts on accurate static binary translation into IRs such as LLVM, which is often close enough to C to be effective, and I'm more familiar with these efforts than I am with full decompilers.

The creation of fully recompilable C source code from binaries is still a challenge, in part because machine semantics are better defined than C's, and you basically have a tradeoff between readable output and semantically correct output (free of undefined behavior). Control-flow recovery is still challenging; signatures are needed to deal with statically-linked pieces of the standard library; and structure and type recovery is routinely of extremely poor quality.

But if the WASM was compiled from javascript or some other language that isn't very C-like, then the decompiled C or C++ code is going to be very difficult to follow.

At the very least, with obfuscated javascript you are going from js => js => js, rather than from js => WASM => C++.

But going to C instead of js is an implementation technicality. If WASM becomes commonplace I'd be shocked if some group of kind souls doesn't open source a decompiler to minified js.
There already is a compiler from wasm to asm.js, so I assume it wouldn't be too difficult to go from wasm to more typical js.
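
For a sense of what that mapping looks like, here is a minimal sketch: a hand-assembled WebAssembly module exporting one `add` function, next to a hand-written asm.js-style equivalent. The `jsAdd` function is my own illustration of the style, not the output of any actual wasm-to-asm.js tool.

```javascript
// A complete, hand-assembled WebAssembly module exporting
// add(i32, i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code section: one body, no extra locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // local.get 0, local.get 1, i32.add, end
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes));
const wasmAdd = instance.exports.add;

// asm.js-style equivalent: `| 0` coerces to a 32-bit signed integer,
// which reproduces the wraparound of i32.add.
function jsAdd(a, b) {
  a = a | 0;
  b = b | 0;
  return (a + b) | 0;
}

console.log(wasmAdd(2, 3), jsAdd(2, 3)); // 5 5
console.log(wasmAdd(2147483647, 1) === jsAdd(2147483647, 1)); // true
```

The `| 0` noise is what "more typical js" would want to strip, and doing that safely means proving the values never overflow.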
Good to know!
You're not going from JS to WASM, you're going from C/C++/Rust to WASM.