by gpderetta 441 days ago
It is 2025, multithread scalability is a well understood, if not easy, problem.

The reality is that hardware-provided cache coherence is an extremely powerful paradigm. Building your application on top of message passing not only gives away some performance, but it means that if you have any sort of cross thread logical shared state that needs to be kept in sync, you have to implement cache coherence yourself, which is an extremely hard problem.

With my apologies to Greenspun, any sufficiently complicated distributed system contains an ad-hoc, informally-specified, bug-ridden, slow implementation of MESI.

But of course, if you have a trivially parallel problem, rejoice! You do not need much communication and shared memory is not as useful. But not all, or even most, problems are trivially parallel.


"It is 2025, multithread scalability is a well understood, if not easy, problem."

In the 1990s it became "well known" that threading is virtually impossible for mere mortals. But this is a classic case of misdiagnosis. The problem wasn't threading. The problem was a lock-based threading model, where threading is achieved by identifying "critical sections" and trying to craft a system of locks that lets many threads run around the entire program's memory space and operate simultaneously.

This becomes exponentially complex and essentially infeasible fairly quickly. Even the programs of the time that "work" contain numerous bombs in their state space; they've just been ground out by effort.

But that's not the only way to write threaded code. You can go full immutable like Haskell. You can go full actor like Erlang, where absolutely every variable is tied to an actor. You can write lock-based code in a way that you never have to take multiple simultaneous locks (which is where the real murder begins) by using other techniques like actors to avoid that. There's a variety of other safe techniques.

I like to say that these take writing multithreaded code from exponential to polynomial, and a rather small polynomial at that. No, it isn't free, but it doesn't have to be insane, doesn't take a wizard, and is something that can be taught and learned with only a reasonable level of difficulty.

Indeed, when done correctly, it can be easier to understand than Node-style concurrency, which in the limit can start getting crazy with the requisite scheduling you may need to do. Sending a message to another actor is not that difficult to wrap your head around.
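To make "sending a message to another actor" concrete, here is a minimal sketch in plain Java. It is not how Erlang or any actor library does it; the class name `CounterActor` and the message types `Add`, `Get` and `Stop` are made up for this example. The point is only that the counter lives in one thread and everyone else talks to it through a mailbox, so no lock is ever taken on the state itself.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

public class CounterActor {
    // Illustrative message types; an actor only reacts to messages.
    interface Msg {}
    static final class Add implements Msg {
        final long n;
        Add(long n) { this.n = n; }
    }
    static final class Get implements Msg {
        final CompletableFuture<Long> reply = new CompletableFuture<>();
    }
    static final class Stop implements Msg {}

    private final BlockingQueue<Msg> mailbox = new LinkedBlockingQueue<>();

    static CounterActor start() {
        CounterActor actor = new CounterActor();
        new Thread(actor::run, "counter-actor").start();
        return actor;
    }

    private void run() {
        long count = 0; // touched only by this thread, so no lock is needed
        try {
            while (true) {
                Msg m = mailbox.take();
                if (m instanceof Add) {
                    count += ((Add) m).n;
                } else if (m instanceof Get) {
                    ((Get) m).reply.complete(count);
                } else if (m instanceof Stop) {
                    return;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void add(long n) { mailbox.add(new Add(n)); }

    long get() {
        Get g = new Get();
        mailbox.add(g);
        return g.reply.join(); // wait for the actor to answer
    }

    void stop() { mailbox.add(new Stop()); }

    // Several sender threads each send `perSender` Add(1) messages.
    static long demo(int senders, int perSender) {
        CounterActor actor = start();
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perSender; j++) actor.add(1);
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long total = actor.get(); // queued after every Add, so it sees them all
        actor.stop();
        return total;
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000));
    }
}
```

The senders never see `count` directly; the only shared object is the queue, and its synchronization is the library's problem rather than yours.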

So the author is arguably correct, if you approach concurrency like it's 1999, but concurrency has moved on since then. Done properly, with time-tested techniques and safe practices, I find threaded concurrency much easier to deal with than async code, and generally higher performance too.

I see it the other way. I’ll admit that I do a lot of “embarrassingly parallel” problems where the answer is “Executor and chill” in Java. I have dealt with quite a few Scala systems that (1) didn’t get the same answer every time and (2) got a 250% speed up with 8 cores and such, and the common problems were “error handling with monads theater”, “we are careful about initialization but couldn’t care less about teardown (monads again!)” [1] and actors.

The choice is between a few days of messing around with actors and it still doesn’t work and 20 minutes rewriting with Executors and done. The trick with threads is having a good set of primitives to work with and Java gives you that. In some areas of software the idea of composing a minimal set of operations really gets you somewhere; when it comes to threads it gets you to the painhouse.
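For anyone who hasn't seen the "Executor and chill" pattern, a rough sketch follows. The workload (a sum of squares) and the names `ExecutorSum` and `sumOfSquares` are placeholders invented for this example; the only real API used is `java.util.concurrent`. The work is cut into independent chunks, handed to a fixed thread pool, and the partial results are added up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorSum {
    // Hypothetical workload: sum of i*i for i in [0, n), split into `chunks` tasks.
    static long sumOfSquares(int n, int chunks) {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int step = (n + chunks - 1) / chunks; // ceiling division
            for (int start = 0; start < n; start += step) {
                final int lo = start;
                final int hi = Math.min(n, start + step);
                // Each task reads only its own range; nothing is shared.
                tasks.add(() -> {
                    long acc = 0;
                    for (int i = lo; i < hi; i++) acc += (long) i * i;
                    return acc;
                });
            }
            long total = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Long> f : pool.invokeAll(tasks)) total += f.get();
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // teardown is explicit and always runs
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, 8));
    }
}
```

Because the tasks share nothing, the result is the same on every run regardless of scheduling.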

I went through a phase of having a huge amount of fun writing little server/clients with async Python but switched to sync when the CPU demands increased. The idea that “parallelism” and “concurrency” aren’t closely related is a bad idea, like the alleged clean split between “authentication” and “authorization”. Java is great because it gives you 100% adequate tools that handle parallelism and concurrency with the same paradigm.

[1] You could do error handling and teardown with monads but, drunk on the alleged superiority of a new programming paradigm, many people don’t, so you meet the coders who travel from job to job like itinerant martial artists looking for functional programming enlightenment. TAOCP (Turing) stands the test of time whereas SICP (lambda calculus) is a fad.

I'm a huge fan of the actor model and message passing, so you do not have to sell it to me; I also strongly dislike the current async fad.

But message passing is not a panacea. Sometimes shared mutable state is the solution that is simplest to implement and reason about. If you think about it, what are databases if not shared mutable state? And they have been widely successful. The key is of course proper concurrency control abstractions.
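As a small illustration of shared mutable state behind a concurrency control abstraction, here is a sketch using Java's `ConcurrentHashMap`. The class `SharedCounts` and the word-count task are invented for the example. Many threads update the same map, and the atomicity of each update comes from `merge` rather than from hand-written locking.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedCounts {
    // Counts occurrences of each word, with all threads writing to one shared map.
    static Map<String, Long> count(List<String> words, int threads) {
        ConcurrentHashMap<String, Long> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String w : words) {
            // merge is an atomic read-modify-write for a single key.
            pool.execute(() -> counts.merge(w, 1L, Long::sum));
        }
        pool.shutdown(); // no new tasks; already submitted ones still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("timed out waiting for tasks");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("a", "b", "a", "c", "a"), 3));
    }
}
```

Doing the same with message passing would need an owner thread for the map plus a request/reply protocol, which is more machinery for the same result.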

"There's a variety of other safe techniques."
> multithread scalability is a well understood, if not easy, problem

As someone fairly well versed in MESI and cache optimization: it really isn't. It's a minority of people that understand it (and really, that need to).

> Building your application on top of message passing not only gives away some performance

This really isn't universally true either. If you're optimizing for throughput, pipelining with pinned threads + message passing is usually the way to go if the data model allows for it.
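A rough sketch of the pipelining shape, with one dedicated thread per stage and bounded queues in between. The names `Pipeline`, `stage` and `run`, the two toy stages, and the queue capacity of 64 are all assumptions made for the example. Actual CPU pinning is not shown: the JDK has no standard thread-affinity API, so that part needs OS tools or a third-party library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.LongUnaryOperator;

public class Pipeline {
    // End-of-stream marker; assumes Long.MIN_VALUE is never real data.
    static final long DONE = Long.MIN_VALUE;

    // One long-lived thread per stage: take from `in`, apply `f`, put to `out`.
    static void stage(BlockingQueue<Long> in, BlockingQueue<Long> out,
                      LongUnaryOperator f) {
        new Thread(() -> {
            try {
                while (true) {
                    long v = in.take();
                    if (v == DONE) { out.put(DONE); return; }
                    out.put(f.applyAsLong(v));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }).start();
    }

    static List<Long> run(List<Long> input) {
        // Bounded queues give backpressure between stages.
        BlockingQueue<Long> q0 = new ArrayBlockingQueue<>(64);
        BlockingQueue<Long> q1 = new ArrayBlockingQueue<>(64);
        BlockingQueue<Long> q2 = new ArrayBlockingQueue<>(64);
        stage(q0, q1, x -> x * x); // toy stage 1: square
        stage(q1, q2, x -> x + 1); // toy stage 2: increment

        // Feed from a separate thread so the caller can drain q2 meanwhile.
        new Thread(() -> {
            try {
                for (long v : input) q0.put(v);
                q0.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }).start();

        List<Long> result = new ArrayList<>();
        try {
            while (true) {
                long v = q2.take();
                if (v == DONE) return result;
                result.add(v);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<Long> in = new ArrayList<>();
        for (long i = 1; i <= 3; i++) in.add(i);
        System.out.println(run(in));
    }
}
```

Each item is touched by one thread at a time and each stage keeps its own working set, which is where the throughput argument comes from when the data model fits this shape.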

To be clear, I'm not claiming any universality. Quite the contrary, I'm saying that there is no silver bullet.