by first_amendment 3221 days ago
This isn't about "paternalistic caution," it's about purposeful and sensible engineering. JavaScript wasn't designed for threading, and it's a less natural fit for it compared to a language like C, so this is understandably a pretty serious undertaking.

Your point is that this may enable some currently unimplementable application in JavaScript. The assumption is that Firefox couldn't have existed without a free-for-all shared state thread model, which is very likely false.

I ask you, what applications does free-for-all shared state make possible that SharedArrayBuffer doesn't make possible?


Imagine having an app that wants to perform a concurrent search on its model while the view and controller keep chugging on the main thread. SharedArrayBuffer would mean that all of your state has to be in an array of primitives. I’d rather use objects, classes, strings, etc. without having to serialize them all the time.
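To make the serialization cost concrete, here is a minimal sketch of what "all of your state has to be in an array of primitives" can look like. The record layout (`id`, `name`, a 16-character cap) and the helper names `packRecords` and `unpackRecord` are assumptions made up for illustration, not anything from the proposal or from an existing library.

```javascript
// Hypothetical layout: each record occupies a fixed number of Uint32 slots.
// slot 0 = id, slot 1 = name length, slots 2..17 = UTF-16 code units of the name.
const MAX_NAME = 16;
const RECORD_SLOTS = 2 + MAX_NAME;

// Copy plain JS objects into a SharedArrayBuffer, field by field.
function packRecords(records) {
  const sab = new SharedArrayBuffer(
    records.length * RECORD_SLOTS * Uint32Array.BYTES_PER_ELEMENT
  );
  const view = new Uint32Array(sab);
  records.forEach((rec, i) => {
    const base = i * RECORD_SLOTS;
    const name = rec.name.slice(0, MAX_NAME); // longer names are truncated
    view[base] = rec.id;
    view[base + 1] = name.length;
    for (let j = 0; j < name.length; j++) {
      view[base + 2 + j] = name.charCodeAt(j);
    }
  });
  return sab;
}

// Rebuild one object from the shared buffer; every read pays this cost again.
function unpackRecord(sab, index) {
  const view = new Uint32Array(sab);
  const base = index * RECORD_SLOTS;
  const length = view[base + 1];
  let name = "";
  for (let j = 0; j < length; j++) {
    name += String.fromCharCode(view[base + 2 + j]);
  }
  return { id: view[base], name };
}
```

Every class, string, or nested object in the model would need a hand-written layout like this one, which is the overhead the comment above is objecting to.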

JS is actually a better fit for threading than C, and in many ways it has similar problems. Unlike C, JS has memory safety and concurrency wouldn’t break that. Concurrent programming is a lot easier if you can’t corrupt the heap. Like C, JS has some “global” variables like RegExp.lastMatch (C has errno) that need to be made into thread-locals. My proposal includes thread locals so it would be easy to make lastMatch into a getter that loads from a thread local.
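A small sketch of why `RegExp.lastMatch` behaves like a global: any successful match anywhere in the program overwrites it. The helper name `lastMatchAfter` is made up for illustration, and `RegExp.lastMatch` is a legacy static property that is widely implemented but not something new code should rely on.

```javascript
// Run a match, then read the legacy static property that the engine updates
// as a side effect of every successful match.
function lastMatchAfter(pattern, text) {
  pattern.exec(text);
  return RegExp.lastMatch;
}

const first = lastMatchAfter(/b+/, "abbbc");  // "bbb"
const second = lastMatchAfter(/c/, "abbbc");  // "c": the earlier value is gone
```

With several threads matching at once, a single shared slot like this would be racy, which is why the comment above suggests backing it with a thread local.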

For your parallel search example, the data set has to be extremely large for parallel searching to yield a significant improvement.

When does a client-side JS app have access to many GBs of local data that would justify a parallel algorithm? It seems exceedingly rare but maybe you can imagine an example.

If you're talking about a server side app, if your goal is speed, why would you choose JS over C++? It seems more sensible to write the parallel database search in C++ in that case.

As for the appropriateness of threading for C over JS: I think the fact that JS is garbage collected makes a threading implementation a nightmare. A naive GC implementation kills performance: imagine running a parallel computation and having to "stop the world." GC at a conceptual level is inherently "single-threaded" and it will always be a bottleneck in one way or another.

Not parallel searching. Concurrent searching.

The data set only has to be large enough that the search takes more than 1/60th of a second. Then it's profitable to do it concurrently.

GC is not single-threaded at all. In WebKit, it is concurrent and parallel, and already supports mutable threads accessing the heap (since our JIT threads access the heap). Most of the famous high-performance GC implementations are at least parallel, if not also concurrent. The classic GC algorithms like mark-sweep and semi-space are straightforward to make work with multiple threads, and they both have straightforward extensions that support parallelism in the GC and concurrency to the mutator.

JavaScript can already do concurrent searching. Concurrent is logical, parallel is physical.
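For reference, the single-threaded kind of concurrency being described here can be sketched as a search that does a bounded slice of work and then yields, so that frames (roughly 16.7 ms each at 60 Hz) can be rendered in between. The names `searchInSlices` and `runSearch` and the default slice size are assumptions for illustration only.

```javascript
// Generator that searches in bounded slices. It yields the running match
// count at each slice boundary and returns the full match list at the end.
function* searchInSlices(items, predicate, sliceSize = 1000) {
  const matches = [];
  for (let i = 0; i < items.length; i++) {
    if (predicate(items[i])) matches.push(items[i]);
    // Pause here so the caller can hand control back to the event loop.
    if ((i + 1) % sliceSize === 0) yield matches.length;
  }
  return matches;
}

// Hypothetical driver: runs one slice per macrotask so rendering and input
// handling can interleave with the search on the same thread.
function runSearch(items, predicate, onDone, sliceSize = 1000) {
  const iterator = searchInSlices(items, predicate, sliceSize);
  const step = () => {
    const { done, value } = iterator.next();
    if (done) onDone(value);
    else setTimeout(step, 0);
  };
  step();
}
```

This interleaves the search with other work but never uses a second core, and the search code has to be written around explicit yield points, which is the trade-off the two sides of this exchange are weighing.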

Efficient parallel GC is non-trivial to implement. In the most common implementation, you have to pause all threads before you can collect. That will often negate the performance benefits of having independent parallel threads running, especially if they are thrashing the heap with shared objects as you suggest.

Many factors and capabilities went into Firefox's success. While it's easy to enumerate the required primitives in hindsight, I doubt Firefox would have had an easy time if the OSes of the period had taken a restrictive stance based on contemporary ideas of what should be allowed.

This is not hypothetical: consider the present. While Firefox on iOS exists, it's just a branding skin over WebKit, due to a similar flavor of security paternalism around JITing code (only bad people write self-modifying code :-). If Firefox had needed to differentiate itself originally in such a market, it's doubtful it would have had much success.

A threading free-for-all may be the wrong abstraction to use for many applications, but it has the virtue of being a decent stand-in for the hardware's actual capabilities. It's also close enough to ground truth that most other abstractions can be built on top of it. Imagine how unpleasant building a browser on top of Workers + ArrayBuffer transfer would be (especially given the lousy multi-millisecond latency of postMessage in most browsers). Also, consider that while there is often loud agreement that raw threads are dangerous, after decades of research, there's little consensus on the "right" option amongst many alternatives.
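To illustrate what "ArrayBuffer transfer" means in the comment above: transferring moves ownership of the memory, so the sender's buffer becomes detached rather than shared. This sketch uses `structuredClone` with a `transfer` list, which applies the same transfer semantics as `postMessage` but synchronously; it assumes an environment that provides `structuredClone` (recent browsers, Node 17 or later).

```javascript
// A regular (non-shared) buffer holding one byte of data.
const original = new ArrayBuffer(1024);
new Uint8Array(original)[0] = 42;

// Transfer ownership instead of copying. postMessage(buf, [buf]) to a Worker
// follows the same rules, plus the latency of a cross-thread message.
const moved = structuredClone(original, { transfer: [original] });

const senderBytes = original.byteLength;         // 0: the sender's buffer is detached
const receiverFirst = new Uint8Array(moved)[0];  // 42: the data lives in the new buffer
```

Only one side can hold the data at a time, so any structure that two threads both need must be passed back and forth like this, or kept in a SharedArrayBuffer.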

SharedArrayBuffer is nearly as powerful as the proposal, but not quite. For example, while it allows writing a multi-threaded tree processing library, it would have trouble exposing an idiomatic JS API if the trees in the library live in a single SAB (as JS has no finalizers etc. to enable release of sub-allocations of the SAB). The options are either one SAB per tree (which likely performs badly), an API where JS users need to explicitly release trees when done with them, or leaking memory. With the proposal, each tree node could be represented directly as a JS object. The proposal may not be the best way to fix this problem, but we definitely still have gaps in JS concurrency.
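A minimal sketch of the "explicit release" option described above, assuming a made-up `NodePool` class with fixed-size tree nodes stored in a single SAB. For brevity the free list is an ordinary per-thread JS array; a version that is actually shared between threads would have to keep the free list inside the SAB and update it with `Atomics`.

```javascript
// Hypothetical node layout: 3 Int32 slots per node (value, left, right).
// Child links are node indices; -1 means "no child".
const SLOTS_PER_NODE = 3;

class NodePool {
  constructor(capacity) {
    const bytes = capacity * SLOTS_PER_NODE * Int32Array.BYTES_PER_ELEMENT;
    this.mem = new Int32Array(new SharedArrayBuffer(bytes));
    // Free list of node indices (not shared in this sketch).
    this.free = [];
    for (let i = capacity - 1; i >= 0; i--) this.free.push(i);
  }

  // Hand out a node index; the caller now owns that slice of the SAB.
  alloc(value) {
    if (this.free.length === 0) throw new Error("pool exhausted");
    const index = this.free.pop();
    const base = index * SLOTS_PER_NODE;
    this.mem[base] = value;
    this.mem[base + 1] = -1;
    this.mem[base + 2] = -1;
    return index;
  }

  // Nothing calls this automatically: forget it and the node leaks.
  release(index) {
    this.free.push(index);
  }
}
```

The garbage collector knows nothing about indices into the buffer, so a user of such a library has to call `release` by hand, which is the non-idiomatic API the comment is pointing at.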

Agreed this would be a serious undertaking, however, and not to be lightly considered.

The proposal goes a long way toward making the case that this can be implemented performantly, but some deep thought should go into how it would alter or constrain future optimizations in JS JITs.

As it stands now, adding threading to JS has a negative expected value. There is more potential downside than potential upside. It's illogical and irrational to undertake the effort under those conditions.

This should be an industry driven decision. Wait for the users of SAB to say it's not meeting their needs, and for them to provide clear reasons why (not hypothetical limitations, not vague falsely-equivalent comparisons to Firefox). Then we can tangibly weigh the pros against the cons.

Right now this is a solution looking for a problem. Your analogy comparing the JS runtime to the iOS runtime isn't appropriate: no single company controls the web platform. Mozilla or Google or Apple or Microsoft can push for JS threads if the arguments for it make sense. Compare to WebAssembly.

In fact, the evolution of WebAssembly is a good example of how this ought to happen. Imagine if the creator of Emscripten had instead first proposed a new VM/IL for the web. It would never have happened, because JS was already good enough. It was more natural to use JS first, then create the VM with the goal of addressing the limitations encountered with the JS approach.

Let the tangible shortcomings of SAB bubble to the surface. Then we can sensibly design something that effectively addresses those shortcomings. Not a pattern-matched solution looking for a problem.