| HN Mirror



	by dmvinson 2386 days ago
	This feels like another step away from the free and open web many people are clamoring for. Distributing opaque binaries with websites instead of Javascript is a step past even the obfuscated minified javascript files meant to be confusing. At least those can still be debugged, stepped through, and explored freely by the end user if they want to learn or reverse engineer. Is there any tool or standard being worked on to make .wasm files coherent for users who have to run the code to view websites? This feels like a step backwards in so many ways, even if it is a technological marvel.

14 comments

maeln 2386 days ago

Please, can we stop having this argument every time there is an article about WebAssembly? WASM is not any more obfuscated than minified JS. Being a bytecode doesn't make you "unfree". You have access to the same tools to debug JS and WASM. And the WASM specification is open. There is literally no difference between running JS or running WASM.

strictnein 2386 days ago

A binary format is no more obfuscated than minified JS? What now?

Going to need some clarification on how that's the case.

jandrese 2386 days ago

I guess he's saying that an unminifier tool is effectively no different than a decompiler, and both are basically unreadable without it.

That said, I don't know of any webassembly decompilers, although I guess they must exist by this point. But also historically decompilers have been imperfect as some of the structure of the code is lost in the compilation process and has to be inferred, sometimes incorrectly, by the decompiler. Compare to a minifier where all you lose is the variable names, comments, and possibly helpful whitespace. All of the structure of the code is still there and there are no heuristics necessary to recreate something that resembles the original source.

cjbprime 2386 days ago

There are certainly wasm decompilers -- wasm2c, wasm2js, etc. You also have access to the browser's JS debugger for breakpoints, line by line execution control, dumping wasm's linear memory.

I haven't written any productive WebAssembly but I play Capture The Flag competitions, and it's become frequent for a wasm reverse engineering challenge to be thrown in. The tools are good enough to make that tractable, even for non-experts in wasm like me.

It helps a little that it's a stack-based rather than register-based VM. Usually more of the intent of code is preserved that way. It's like reversing a JVM class, rather than like reversing a native binary.

throwaway40324 2386 days ago

What are these Capture The Flag competitions? Do you mind posting a link?

cjbprime 2386 days ago

Sure. My favorite explanation's a short video:

https://youtu.be/8ev9ZX9J45A

pumanoir 2386 days ago

That is correct, a minified JS will preserve most of the semantics of the original program. And since the original source can come from any language, how would one know which decompiler to use.

MikeHolman 2386 days ago

You could already compile programs from other languages to javascript.

https://github.com/jashkenas/coffeescript/wiki/List-of-langu...

Wasm was effectively an extension of asm.js. It makes the experience of compile-to-web better, but it isn't much more opaque than other projects.

maeln 2386 days ago

WASM is both a binary and a text format. You can turn any WASM binary into the text format and have a readable version of the blob. Firefox can automatically show you the text version of a WASM blob. So no, there is no difference from minified JS. Just because minified JS files are in a text format doesn't make them any easier to reverse engineer.

strictnein 2386 days ago

Unless I'm completely missing something the "text version" you're talking about is just WASM and there's quite a difference between that and minified JS.

ex:

   end $label121
   get_local $var7
   get_local $var9
   call $func3444
   get_local $var7
   call $func1500

coolreader18 2386 days ago

How is that any more readable than deobfuscated js?

    func1500(func3444(var7, var9), var7)

or more likely:

    gw(kl(s,i),s)
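
The correspondence between the two snippets above can be sketched mechanically. This is a minimal illustration, not a real decompiler: it handles only `get_local` and `call`, and the arities in `ARITY` are assumptions, since the quoted wasm text does not say how many arguments `$func3444` or `$func1500` take.

```python
# Hypothetical arities: the quoted snippet does not state them,
# so two arguments per function is an assumption for illustration.
ARITY = {"$func3444": 2, "$func1500": 2}


def fold(instructions):
    """Fold flat stack-machine instructions into nested call expressions."""
    stack = []
    for line in instructions:
        op, operand = line.split()
        if op == "get_local":
            # Reading a local pushes it; in expression form it is just a name.
            stack.append(operand.lstrip("$"))
        elif op == "call":
            # A call pops its arguments (last pushed is the last argument)
            # and pushes the call expression as its single result.
            n = ARITY[operand]
            split_at = len(stack) - n
            args = stack[split_at:]
            del stack[split_at:]
            stack.append(f"{operand.lstrip('$')}({', '.join(args)})")
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack


wat = [
    "get_local $var7",
    "get_local $var9",
    "call $func3444",
    "get_local $var7",
    "call $func1500",
]
print(fold(wat))
```

Under those assumed arities the five instructions fold into the single nested expression quoted above. Real code also contains loads, stores, and branches, which is where the difference in degree discussed in this thread comes from.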

maeln 2386 days ago

Yes, this is what I am talking about. Once you know the instructions, I fail to see how it is more difficult to understand than JavaScript.

CivBase 2386 days ago

That's like saying x86_64 assembly is as easy to understand as C. High level languages exist to make code easier to understand.

strictnein 2386 days ago

That certainly is a take.

azakai 2386 days ago

There is a difference in degree, though. Unminified JS is usually easier to read than wasm text, in general.

One practical factor: I often debug wasm files by compiling them to JS first.

At least wasm has structured control flow, which helps a lot. I wish wasm had even more readability features, personally.

Rusky 2386 days ago

It's a binary format originally based on minified JS, with a standard textual form, and which can be viewed and debugged with exactly the same tools (and ease) as minified JS.

strictnein 2386 days ago

Can you point me to an example of this?

strictnein 2386 days ago

Thanks for the downvote whoever. Honestly, I want an example of how it "can be viewed and debugged with exactly the same tools (and ease) as minified JS".

I see nothing that states that is the case.

strictnein 2386 days ago

I mean... come on:

https://medium.com/@pnfsoftware/reverse-engineering-webassem...

IshKebab 2386 days ago

WASM is essentially a more efficient version of Asm.js, which is just Javascript. WASM is a binary format. Asm.js is Javascript. They're equally obfuscated.

lukebitts 2386 days ago

Maybe I’m misunderstanding something but WASM is meant to be compiled from other languages and that source is lost, unlike minified javascript, isn’t it?

danShumway 2386 days ago

Yes, but you'd see similar issues (to a different degree) with languages like Typescript and JSX as well.

Cross-compilation has been a thing for a while now -- WASM is the followup to ASM.js, which was already being used as a compile target for languages like C.

Now, reverse engineering ASM.js is easier than reverse engineering WASM (although ASM.js is still a giant pain). And reverse engineering minified Javascript is even easier -- most competent JS engineers could debug a React project without source maps, even if it took them longer.

But it's not clear to me that WASM makes the process meaningfully harder. As in, you're still going to want to use source maps like you use today, and it'll still be totally possible to figure out what a program is doing without the original source. It'll just be a pain.

And the benefits to the web as an open, language-agnostic platform that can be used for memory-intensive tasks outweigh the downsides of needing to work harder to reverse engineer software.

remcob 2386 days ago

Would you consider the source lost if the (minified) javascript was compiled from TypeScript?

lukebitts 2386 days ago

I would. And fair enough, I think the web closed around me and I didn't even realize it. But we are all doing ok so far, so I suppose wasm won't be that bad either.

maeln 2386 days ago

Minifying JavaScript is as destructive as compiling from any language to WASM. The original source is also lost when minifying JS. The fact that the transpilation/compilation target is the same language doesn't mean it's any less destructive.

linuxftw 2386 days ago

> can we stop having this argument every time there is an article about webassembly

No, we can't. It's a valid criticism, and it's not going to go away. Minified JS is bad, WebAssembly is worse.

maeln 2386 days ago

What is the criticism exactly? Saying WASM is worse than minified JS is factually wrong.

We can debate minified JS if you so desire, but it's a different debate.

pumanoir 2386 days ago

This is a valid concern. Among many things it is much easier to audit higher level code than bytecode.

maeln 2386 days ago

What can't you do? How is it any easier to audit minified JS than a WASM blob? Every time this point is raised, there is no valid argument explaining why WASM would be much worse than the current state of JS.

pumanoir 2386 days ago

Minified JS (even with single-letter variables and all) is still a high-level language, which is much easier for humans to follow than bytecode. That was only an example of it being a valid concern. I actually love WASM (especially the s-exp representation) and have implemented a compiler that compiles to it, but it's important to listen to valid concerns even if we really like a technology.

booleandilemma 2386 days ago

Two wrongs don’t make a right.

maeln 2386 days ago

Then please enlighten us and share with us what is wrong.

kick 2386 days ago

"RISC-V is open, you can't compile proprietary binaries for RISC-V."

maeln 2386 days ago

That is not what I said, don't try to create a fake argument.

Your real sentence should be "The RISC-V instruction set is open, therefore I can see whatever a binary is doing via the instructions it is executing." That doesn't mean it's free, and it doesn't mean it's easy to reverse engineer whatever the binary is doing, but you have everything you need to do it.

dkersten 2386 days ago

In my opinion it’s been a long time since direct access to Javascript has been useful. Yes, you can unminify Javascript, but it’s still more work than most people will go through (especially if the code was generated by a compile-to-Javascript language), so for most people things aren’t really changing that much.

hombre_fatal 2386 days ago

I don't even think source code access is the most important part of the browser's dev console. Consider the network tab instead, being able to see exactly where your bandwidth is being spent and why.

This to me makes the browser vastly preferable to native apps. I didn't realize that the desktop app I use to easily translate languages[0] sends every keystroke to Google Analytics until I had to bother installing a proxy. Meanwhile this analysis is just an Opt-Cmd-I away in the browser.

[0]: https://apps.apple.com/us/app/translate-tab/id458887729

danShumway 2386 days ago

Totally agreed -- and I don't think WASM will change any of that.

The good part of the web in terms of debugging is the separation of concerns -- having separate interfaces for CSS, HTML, network requests, and the DOM, and having each of those interfaces be relatively inspectable.

I am a little worried about frameworks that target WASM spitting everything onto a Canvas, bypassing HTML and CSS (*cough* Qt *cough*). That would be a substantial loss for the Open web. But I don't lose any sleep over the idea of replacing Javascript.

strictnein 2386 days ago

Direct access to Javascript is useful on a daily basis for those concerned with a number of security threats.

dmvinson 2386 days ago

Yes, it is a ton of work to step through and understand minified and obfuscated code, but it is a skill that many people learn and do if there is motivation. On your second point, I think the key is that the people who do have a reason to detangle the logic of minified JS can be very impactful. I consider open viewing of Javascript as similar to noncompetes in California. It allows one to view competitors source code (if you have the motivation to work for it), which ultimately allows you to learn from and adapt their best practices. Yes, this can have negative effects, but it also may allow for a smaller company to leapfrog a larger incumbent who is too lazy to do some part of their processing server side. I could probably learn a lot about how to write (and block!) analytics tracking by reviewing the Google analytics javascript source code for example.

(Disclaimer: I've never looked at Google's analytics .js files and that may not be possible for some technical reason unknown to me)

dkersten 2386 days ago

> I think the key is that the people who do have a reason to detangle the logic of minified JS can be very impactful.

You don’t need javascript for this. People reverse engineer native binaries all the time. Reversing wasm isn’t much more difficult than minified javascript as my sibling commenter states.

vnorilo 2386 days ago

Reading webassembly is not that difficult.

The main challenge is that variable and function names are not available, but minified js is no better in that regard.

bogwog 2386 days ago

> Distributing opaque binaries with websites instead of Javascript is a step past even the obfuscated minified javascript files meant to be confusing. At least those can still be debugged, stepped through, and explored freely by the end user if they want to learn or reverse engineer.

That's not true. There's nothing "free and open" about the tracking code embedded in every modern site, or the javascript blobs you get when you visit Google or Facebook. Minified/obfuscated Javascript is no different from a binary blob, except that it's much less efficient. Your chances of reverse-engineering one of those are about the same as your chances of reverse-engineering a wasm blob. Just because one is technically "human-readable" plaintext and the other binary doesn't make a difference, since you can't actually read either of them.

dmvinson 2386 days ago

I respect your point and semi-agree, but as someone who ran a small business in high school that usually involved reverse engineering obfuscated Javascript, I think you're overstating how hard it is to follow the logic of Javascript blobs. Yes, whole program flows can be insanely difficult to follow, but narrowing in on the logic of key functions is often what one needs when trying to learn from others' code.

bogwog 2386 days ago

So are you saying that reverse engineering javascript is easy? Or that it's easier than reverse engineering wasm?

I don't know much about web assembly, but x86, which is much more complicated with thousands of instructions, has been successfully reverse engineered basically since forever. There are decompilers that can automatically reconstruct source code in C or C++ from a binary blob.

With javascript, by comparison, the best you can hope for is to format the code so it's in a more readable structure, but that isn't going to untangle purposefully obfuscated logic. Add to that the fact that even a regular javascript program is an untyped mess, and it becomes clear that anyone specifically trying to confuse readers will have a very easy time of doing so. There are a lot of messy things you can do in javascript, almost COBOL levels of messy.

Also, I'm curious about this

> but as someone who ran a small business in high school that usually involved reverse engineering obfuscated Javascript,

What type of clients paid you to reverse engineer obfuscated javascript? Malware research? Something else?

jcranmer 2386 days ago

> I don't know much about web assembly, but x86, which is much more complicated with thousands of instructions, has been successfully reverse engineered basically since forever. There are decompilers that can automatically reconstruct source code in C or C++ from a binary blob.

That's a bit of an overstatement.

Disassembly of native executables is essentially a solved problem, and has been for decades. There is some variation in terms of how you define disassembly and how you deal with code that specifically tries to defeat disassembly, but it's solved enough that objdump -d is a decently effective tool.

Decompilation is more difficult. There were academic-quality decompilers by around the 90s, but these weren't really usable and tended to break on anything more complicated than toy examples. The JVM breathed new life into decompilers, and it's not until this point that you get decompilers that can routinely output code that is recompilable (and only in the Java domain).

In the mid-noughts, decompilation efforts returned to targeting native binaries again. This is helped by the developers of IDA Pro (the main tool used for reverse engineering) building a decompiler view into their application. There's also been more efforts on accurate static binary translation into IRs such as LLVM, which is often close enough to C to be effective, and I'm more familiar with these efforts than I am with full decompilers.

The creation of fully recompilable C source code from binaries is still a challenge, in part because machine semantics are more well-defined than C, and you basically have a tradeoff between readable output and semantically-correct (free of undefined behavior). Control-flow recovery is still challenging; signatures are needed to deal with statically-linked pieces of the standard library; and structure and type recovery is routinely of extremely poor quality.

voxic11 2386 days ago

But if the WASM was compiled from javascript or some other language that isn't very C-like then the de-compiled C or C++ code is going to be very difficult to follow.

At the very least with obfuscated javascript you are going from js => js => js, rather than from js => WASM => C++.

iudqnolq 2386 days ago

But going to C instead of JS is an implementation technicality. If WASM becomes commonplace I'd be shocked if some group of kind souls doesn't open source a decompiler to minified JS.

coolreader18 2386 days ago

There already is a compiler from wasm to asm.js, so I assume it wouldn't be too difficult to go from wasm to more typical js.

jcranmer 2386 days ago

You're not going from JS to WASM, you're going from C/C++/Rust to WASM.

6gvONxR4sf7o 2386 days ago

Meanwhile here I am running around with js off by default. Most of the internet still works. More sites work better with js off than work worse. I hope webassembly doesn't change that into a world where the "js off" analogy is "running my OS without the ability to execute programs."

adev_ 2386 days ago

> Meanwhile here I am running around with js off by default. Most of the internet still works.

Just wait five more years until 80% of the web switches to React / Vue / TheNewHypeSPAFramework, and with or without WASM, you will be unable to browse with "js off".

The blame here is not on WASM but on the abuse of client-side rendering and "everything as an App" when most pages are just barely interactive documents.

The Web succeeded where Flash / ActiveX / Java applets / Silverlight failed because:

- it was open

- it was document oriented.

And we tend to forget that a bit too easily.

techntoke 2386 days ago

But now companies are paying more to push their WASM good message. Many of them actually believe it now, because they can't comprehend anything different.

toolz 2386 days ago

Who are these people clamoring for an open and free web? I've certainly never met them. The clamoring I hear is for a faster and more usable web. The vast majority of web users couldn't decipher JS even if it wasn't obfuscated.

ouid 2386 days ago

The people clamoring for a faster and more usable web are precisely the people who do not operate an adblocker.

jcranmer 2386 days ago

WebAssembly should be relatively easy to decompile. The specification guarantees structured control flow. You will lose struct information and names, but that's virtually the only information you're not going to get from decompilation.

zelly 2386 days ago

WASM just makes it easier to distribute obfuscated code, because it is obfuscated by default. Pre-WASM JS is capable of just as much obfuscation.

Minified Facebook or Google trackers were never libre or meant to be easily reversed. Web apps like Google Drive aren't free either just because you can run it in a browser on Linux. You aren't supposed to (legally?) be able to modify it and nor would you be able to in many cases where they try. It's just as proprietary as Microsoft Office. There are proprietary tools to do even more advanced obfuscation on top of minification (adds red herring code paths that do nothing), which some JavaScript malware vendors use to protect their implementations.

What we really want is libre JavaScript/WASM where vendors include permissive licenses and source maps or links to download the high level source. That's free software. The "free and open" web never really existed de jure; publishers' laziness to obfuscate created a de facto free and open web. Libreness depends on access to high level source, not reverseability, or else Photoshop is free too because you can attach a debugger to it.

WASM just exposes the truth that the web was an app store all along.

UncleEntity 2386 days ago

> Is there any tool or standard being worked on to make .wasm files coherent for users who have to run the code to view websites?

You can convert the binary files to/from the text (lisp like) format with readily available tools.

Also, the binary format is easily parsed -- I made a parser with Kaitai Struct in like an afternoon.
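
To give a sense of how little is needed, here is a minimal sketch in plain Python (not Kaitai Struct) that walks the top-level layout of a wasm binary: an 8-byte header followed by sections, each with a one-byte id and a LEB128-encoded size. The helper names and the hand-assembled sample module are illustrative assumptions, and section payloads are skipped rather than decoded.

```python
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "data count",
}


def read_leb128_u32(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry data
        if byte & 0x80 == 0:              # high bit clear: last byte
            return result, pos
        shift += 7


def list_sections(data):
    """Return [(section_name, payload_size)] for a wasm binary module."""
    if data[:4] != b"\x00asm":
        raise ValueError("not a wasm module: bad magic")
    version = int.from_bytes(data[4:8], "little")
    if version != 1:
        raise ValueError(f"unsupported version: {version}")
    pos = 8
    sections = []
    while pos < len(data):
        section_id = data[pos]
        size, pos = read_leb128_u32(data, pos + 1)
        sections.append((SECTION_NAMES.get(section_id, "unknown"), size))
        pos += size  # skip the payload without interpreting it
    return sections


# Hand-assembled module containing one empty function (illustrative bytes).
module = bytes([
    0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00,  # magic + version 1
    0x01, 0x04, 0x01, 0x60, 0x00, 0x00,              # type section: () -> ()
    0x03, 0x02, 0x01, 0x00,                          # function section
    0x0A, 0x04, 0x01, 0x02, 0x00, 0x0B,              # code section: empty body
])
print(list_sections(module))
```

Decoding the payloads (types, imports, function bodies) is more work, but it follows the same pattern of counts and length-prefixed vectors.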

mbrock 2386 days ago

Doesn’t that reasoning also imply that GNU and Linux are awfully opaque and not free nor open, since they are typically distributed as binaries?

re-actor 2386 days ago

If the binaries were the only way to access GNU or Linux then yes, obviously. That's not the case now is it.

mbrock 2386 days ago

Right, because when you distribute free software in binary form, you make sure to make the source code available with a copyright license disclaimer allowing redistribution. This applies exactly to WebAssembly software just as it does with Java software. Software freedom is compatible with binary distribution.

neckardt 2386 days ago

No, because GNU and Linux are bound by license restrictions which require the unobfuscated source code to be made available.

If all websites made their source available as well as distributing the binary, there wouldn't be a problem.

eikenberry 2386 days ago

Only if the sites using webassembly binaries also have the source available to download.

dmvinson 2386 days ago

The source for GNU and Linux is viewable by everyone, which negates the inability to view what is happening inside a binary. This is the problem Javascript source maps are meant to solve for the web, and I would welcome WASM more if part of the standard was a requirement for a source map when browser Dev Tools are open.
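
For context on what a source map actually contains: its `mappings` field is a string of base64 VLQ segments, each a short list of relative offsets (generated column, source file index, original line, original column, and optionally a name index). The decoder below is a minimal sketch of that encoding only; the function name is an assumption, and a real consumer would also split on `,` and `;` and accumulate the deltas.

```python
BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def decode_vlq(segment):
    """Decode one base64 VLQ segment of a source map 'mappings' string."""
    values = []
    value = 0
    shift = 0
    for char in segment:
        digit = BASE64.index(char)
        value |= (digit & 0x1F) << shift  # low 5 bits carry data
        if digit & 0x20:                  # continuation bit: more digits follow
            shift += 5
            continue
        # The lowest bit of the assembled value is the sign.
        magnitude = value >> 1
        values.append(-magnitude if value & 1 else magnitude)
        value = 0
        shift = 0
    return values


print(decode_vlq("AAgBC"))
```

The same mapping idea applies whether the generated artifact is minified JS or a wasm binary: the map ties positions in the shipped file back to positions in the original source.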

bogwog 2386 days ago

> This is the problem Javascript source maps are meant to solve for the web

That's not the problem source maps are meant to solve. They exist to debug transpiled code.

> The source for GNU and Linux is viewable by everyone, which negates the inability to view what is happening inside a binary.

That's not true. It is non-trivial to verify that the binary you received was built with the source code that's openly available. The point of FOSS is that you always have the option to build your own binaries so that you can be 100% certain of what is running on your machine. Most people aren't going to do that, so they need to place their trust on a third party (like whoever built their kernel). FOSS just makes that trust optional instead of mandatory (like it is with something like Windows)

LaGrange 2386 days ago

I think it's quite different, because it used to enforce (past tense) availability of the source code, which was quite neat from the consumer perspective. Linux distros had a very different dynamic, so that enforcement was effectively unnecessary.

It stopped working as enforcement ages ago, though.

mtrower 2386 days ago

Well, for one thing, mature debuggers exist (the equivalent to the tooling he inquired about for .wasm).

More importantly, however, anything GPL must make source available and reasonably accessible. There is no such guarantee or even expectation for random programs on the web.

mbrock 2386 days ago

Indeed, huge amounts of JavaScript that’s already out there does not come with a free software license. The FSF has been complaining about this for years.

As for debugging, this is not a particularly hard or fundamental problem. It’s basically solved already.

https://developers.google.com/web/updates/2019/12/webassembl...

andrewaylett 2386 days ago

My employer quite deliberately publishes source maps alongside our javascript, so people who want to tinker and learn don't lose the ability to see what we're doing, but end users (our press pack says we serve pages to around 1% of people in any given month) don't have to pay the cost of downloading unminified JS for every page load. I suspect that if we ever start using WASM, we'll do the same thing.

Just because you _can_ use the compilation step to (go some way to) hide your source doesn't mean you _have_ to. And relying on your secret sauce being private while you publish it in obfuscated form for all the world to decipher feels like a losing strategy.

download13 2386 days ago

This might actually be a great direction to go.

I don't think there's any particular reason that WASM has to be more obfuscated than JS. You can already throw a WASM file into a bytecode-to-text translator which is about as useful as deobfuscating a minified JS file, and I assume decompiling/debugging tools will only get better in the future.

For a long time now, I've been thinking of a future where your OS properly isolates all the programs that run on it and even gives us the ability to have direct control over how programs interact with the rest of the system. OS's seem too mired in backwards-compatibility requirements to make big changes like that any time soon, but that's basically the way our browsers already work. Download some code, and execute it (relatively) safely because it's sandboxed from the rest of the system. Our browsers are basically the new OS, and this time around we can do it right using what we learned from OS's (and hopefully backport these browser features into the next generation of OS's).

For example, an app asks for a filesystem handle. You can hand it one that refers to a real location on your OS fs, or you can hand it a completely virtual fs that won't affect anything else on the system.

Whenever an app asks for a resource, being able to hand it a virtual or sandboxed one instead is a huge gain for user-control.
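
The handle idea can be sketched in a few lines. This is an illustration under stated assumptions, not how any browser or OS implements it: `FileSystemHandle`, `RealDirectory`, `VirtualDirectory`, and `untrusted_app` are hypothetical names, and the real-directory variant does no path validation.

```python
from pathlib import Path
from typing import Dict, Protocol


class FileSystemHandle(Protocol):
    """The only filesystem surface the app is allowed to see."""

    def write_text(self, name: str, text: str) -> None: ...

    def read_text(self, name: str) -> str: ...


class RealDirectory:
    """Handle backed by a real directory on the host filesystem."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def write_text(self, name: str, text: str) -> None:
        # Sketch only: a real sandbox would also reject names that escape root.
        (self.root / name).write_text(text)

    def read_text(self, name: str) -> str:
        return (self.root / name).read_text()


class VirtualDirectory:
    """Handle backed by a dict; nothing the app writes reaches the disk."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def write_text(self, name: str, text: str) -> None:
        self.files[name] = text

    def read_text(self, name: str) -> str:
        return self.files[name]


def untrusted_app(fs: FileSystemHandle) -> str:
    """Stand-in for downloaded code: it cannot tell which handle it was given."""
    fs.write_text("notes.txt", "hello")
    return fs.read_text("notes.txt")


sandbox = VirtualDirectory()
print(untrusted_app(sandbox))
print(sandbox.files)
```

The app code is identical in both cases; the user (or the platform acting for the user) decides which handle to pass in.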

voldacar 2386 days ago

Disassembled WebAssembly is probably way more readable than minified/packed JS.

camgunz 2386 days ago

> Distributing opaque binaries with websites instead of Javascript is a step past even the obfuscated minified javascript files meant to be confusing. At least those can still be debugged, stepped through, and explored freely by the end user if they want to learn or reverse engineer.

If you really think this is true, look into Google's recaptcha blob.

throwGuardian 2386 days ago

Why should someone else's website code be forced to be "free"?

BlueTemplar 2386 days ago

Because otherwise it's not a website?

"By the end of 1990, the first web page was served on the open internet, and in 1991, people outside of CERN were invited to join this new web community.

As the web began to grow, Tim realised that its true potential would only be unleashed if anyone, anywhere could use it without paying a fee or having to ask for permission.

He explains: “Had the technology been proprietary, and in my total control, it would probably not have taken off. You can’t propose that something be a universal space and at the same time keep control of it.”

So, Tim and others advocated to ensure that CERN would agree to make the underlying code available on a royalty-free basis, forever."

Can you see how rude it is to not do the same ?

throwGuardian 2386 days ago

1. We're well past websites into web apps. You are not entitled to the source code of these apps, like Gmail/GSuite - that are client heavy web apps, with logic, state, custom-IPs & algorithms.

2. The underlying code of the web's infrastructure is available on a royalty-free basis, and shall remain as such!! There's immense benefit in maintaining this equal-opportunity status-quo.

BlueTemplar 2386 days ago

"Web apps" are an oxymoron. Transforming the browser into an OS inside the OS is just bad practice (but the reason that happened was because Microsoft sucked, and Google wanted more control over computing - see also Valve with Steam patching games).