HN Mirror



	by ezy 3839 days ago
	The issue with this is that if you care about speed, you need to be using SIMD -- and wasm doesn't seem to want to support it -- stating a "minimum viable" standard from the year 2000.

3 comments

azakai 3839 days ago

SIMD.js is in progress for JS, so JS VMs are already working on it. It is on the roadmap for being added to WebAssembly, likely with a similar API.


zurn 3839 days ago

Is there a technical reason why JS JITs don't transparently apply SIMD to existing loops, the way ordinary compilers do (autovectorization)? Or is the plan to build it on top of SIMD.js?


yoklov 3839 days ago

Autovectorization is hard and fragile even in the best case: tight, stable loops with few or no branches and little or no memory access that may be aliased.

For a dynamic-language JIT this is pretty much infeasible (as I understand it). Every loop might have branches for guards or for bailing out on deopts, and at least in JS, TypedArrays are allowed to alias each other.
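To illustrate the aliasing point, here is a minimal sketch. `shiftAdd` is a hypothetical helper made up for this example, not taken from any engine or library. It runs the same scalar loop twice: once on two independent arrays, and once on two views over one shared `ArrayBuffer`.

```typescript
// Hypothetical helper: each iteration reads src[i] and writes dst[i + 1].
function shiftAdd(dst: Float32Array, src: Float32Array): void {
  const n = Math.min(src.length, dst.length - 1);
  for (let i = 0; i < n; i++) {
    dst[i + 1] = src[i] + 1;
  }
}

// Case 1: independent buffers, so iterations do not affect each other.
const separateSrc = new Float32Array(4);
const separateDst = new Float32Array(4);
shiftAdd(separateDst, separateSrc);

// Case 2: both views cover the same memory, so the write in iteration i
// is read back in iteration i + 1 (a loop-carried dependence).
const shared = new ArrayBuffer(16);
const aliasSrc = new Float32Array(shared);
const aliasDst = new Float32Array(shared);
shiftAdd(aliasDst, aliasSrc);

console.log(Array.from(separateDst)); // [0, 1, 1, 1]
console.log(Array.from(aliasDst));    // [0, 1, 2, 3]
```

A vectorized version that loads several `src` elements before storing any `dst` element would produce the first result in both cases, which is wrong for the aliased one. A compiler therefore has to prove the two arrays do not overlap, or add a runtime check.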


jacobolus 3839 days ago

Existing JavaScript JITs also don’t output SIMD instructions, as far as I know.

I expect SIMD support can be added to wasm at some future date.


ezy 3839 days ago

I wasn't really referring to wasm vs JS (which is easy to outperform), but reaching equivalent speed to native, which is the ultimate goal, I'd hope. Otherwise you're just throwing performance away...

But I was a little quick on the trigger. It's not obvious, but after drilling through the design docs for half an hour, I found a tentative discussion of how SIMD might be added later (based on some kind of extension mechanism).


nwmcsween 3839 days ago

Why wouldn't LLVM be able to emit SIMD when it can vectorize? The idea that SIMD is the end-all of fast computing is pretty short-sighted IMO.


jacobolus 3839 days ago

SIMD isn’t the “end all”, but upcoming Intel chips have instructions for handling 8 doubles or 16 floats per instruction, and if you’re trying to implement a video codec or large-scale physics simulation, an order of magnitude of speed difference can make or break your app.

To take an example where timeliness is crucial, think of the difference between, say, 10 frames per second vs. 60 frames per second in a first-person shooter game.
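As a rough check of the numbers above, the sketch below assumes a 512-bit vector register. The comment does not state that width; it is only what "8 doubles or 16 floats" implies. It also works out the per-frame time budget at 10 and 60 frames per second. The function names are made up for illustration.

```typescript
// Assumption: 512-bit vector registers, inferred from "8 doubles or 16 floats".
const REGISTER_BITS = 512;

// Number of elements of a given width that fit in one vector register.
function lanes(registerBits: number, elementBits: number): number {
  return Math.floor(registerBits / elementBits);
}

// Time available to produce one frame, in milliseconds.
function frameBudgetMs(framesPerSecond: number): number {
  return 1000 / framesPerSecond;
}

const doubleLanes = lanes(REGISTER_BITS, 64); // 64-bit doubles
const floatLanes = lanes(REGISTER_BITS, 32);  // 32-bit floats

console.log(doubleLanes, floatLanes);              // 8 16
console.log(frameBudgetMs(10), frameBudgetMs(60)); // 100 and roughly 16.67
```

So going from 10 to 60 frames per second means fitting the same work into about a sixth of the time, which is the kind of gap that wide SIMD lanes can help close in the best case.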
