Hacker News
by rapsey 104 days ago
> Backpressure is built in. If a process receives messages faster than it can handle them, the mailbox grows. This is visible and monitorable. You can inspect any process’s mailbox length, set up alerts, and make architectural decisions about it. Contrast this with thread-based systems where overload manifests as increasing latency, deadlocks, or OOM crashes — symptoms that are harder to diagnose and attribute.

Sorry, but this is wrong. This is no kind of backpressure, as any experienced Erlang developer will tell you: doing backpressure properly is a massive pain in Erlang. By default, your system is almost guaranteed to break under pressure in random places that will surprise you.

4 comments

Yes, this is missing the "pressure" part of "backpressure", where the recipient is able to signal to the producer that they should slow down or stop producing messages. Observability is useful, sure, but it's not the same as backpressure.
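To make the distinction concrete, here is a minimal Python sketch (not Erlang, and not how BEAM mailboxes work) of the "pressure" part: a bounded queue whose `put` blocks the producer whenever the consumer falls behind. The names `run_pipeline`, `capacity` and `work_seconds` are made up for illustration.

```python
import queue
import threading
import time


def run_pipeline(n_items=20, capacity=4, work_seconds=0.001):
    """Push n_items through a bounded queue.

    Returns (largest backlog the producer observed, items processed in order).
    """
    # Assumption for illustration: a bounded queue standing in for a mailbox.
    # A BEAM mailbox is unbounded, which is exactly the difference being shown.
    mailbox = queue.Queue(maxsize=capacity)
    processed = []

    def consumer():
        while True:
            item = mailbox.get()
            if item is None:  # sentinel: producer is done
                return
            time.sleep(work_seconds)  # simulate a slow consumer
            processed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()

    max_depth = 0
    for i in range(n_items):
        mailbox.put(i)  # blocks while the queue is full: this is the backpressure
        max_depth = max(max_depth, mailbox.qsize())

    mailbox.put(None)
    worker.join()
    return max_depth, processed
```

The producer can never get more than `capacity` items ahead of the consumer. With an unbounded mailbox the same loop would finish immediately and leave the backlog to grow, which is observability without pressure.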
Sending a message to a process has a cost (for purposes of preemption) relative to the current size of the receiver's mailbox, so the sender will get preempted earlier. This isn't perfect, but it is something.
Occam (1982-ish) shared most of BEAM's ideas, but strongly enforced synchronous message passing on both channel output and input … so backpressure was just there in all code. The advantage was that most deadlock conditions fell into the category of “if it can lock, then it will lock”, which meant that debugging done at small scale would preemptively resolve issues before scaling up the process / processor count.
Once you were familiar with occam you could see deadlocks in code very quickly. It was a productive way to build scaled concurrent systems. At the time we laughed at the idea of using C for the same task.
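For readers who haven't used occam, a rough Python sketch of a synchronous (rendezvous) channel may help. It is only an approximation of the idea, assuming a single sender and a single receiver; `RendezvousChannel` is a hypothetical name, and occam's real channels are a language construct, not a library class.

```python
import queue
import threading


class RendezvousChannel:
    """Single-sender, single-receiver channel where send() waits for recv()."""

    def __init__(self):
        # One slot only: at most one value can be in flight.
        self._slot = queue.Queue(maxsize=1)

    def send(self, value):
        self._slot.put(value)  # hand the value over
        self._slot.join()      # block until the receiver has taken it

    def recv(self):
        value = self._slot.get()  # block until a sender offers a value
        self._slot.task_done()    # release the sender waiting in send()
        return value
```

Because `send` cannot return until `recv` has run, a producer can never run ahead of its consumer, and a missing receiver shows up as a hang straight away, even in a two-process test.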
I spreadsheeted out how many T424 dies fit per Apple M2 (TSMC 3nm process) - that's 400,000 CPUs (about a 600x600 grid) at, say, 1 GIPS each - so 400 TIPS per M2 die size. That's for 32-bit integer math - Inmos also had a 16-bit datapath, but these days you would probably up the RAM per CPU (8k, 16k?) and stick with a 32-bit datapath, but add 8- and 16-bit FP support. Happy to help with any VC pitches!
David May and his various PhD students over the years have retried this pitch repeatedly. And Graphcore had a related architecture. Unfortunately, while it’s great in theory, in practice the performance overall is miles off existing systems running existing code. There is no commercially feasible way that we’ve yet found to build a software ecosystem where all-new code has to be written just for this special theoretically-better processor. As a result, the business proposal dies before it even gets off the ground.

(I was one of David’s students; I founded and ran a processor design startup that raised £4m in 2023 and went bust last year, based on a different idea with a much stronger software story.)

Yes, David is the man and, afaict, has made a decent fist of Xmos (from afar). My current wild-assed hope for this to come to some kind of fruition would be Nvidia realising this opportunity (threat?), making a set of CUDA libraries, and the CUDA boys going to town with Occam-like abstractions at the system level and just their regular AI workloads as the application. No doubt he has tried to pitch this to Jensen and Keller.
It took me a while to realise that you were responding to the article, not a comment here.

You're right in correcting the article, but I'd like to add that for probably around a decade, Erlang had 'sender punishment', which is what 'IsTom' who replied to you is probably talking about.

Ulf Wiger referred to sender_punishment as "a form of backpressure" (Erlang-questions mailing list, January 2011). 'sender punishment' was removed around 2018, in ad72a944c/OTP14667. I haven't read the whole discussion carefully, but it seems to be roughly "it wasn't clear that sender punishment solved more problems than it caused, and now that most machines are multi-core, that balance is tipped even more in favour of not having 'sender punishment'".

Sender punishment on the same node may be dead, but AFAIK, if the dist connection to a remote node is beyond the backlog threshold, sends will block, which offers some backpressure.

Is that sufficient and/or desirable backpressure, and does it provide everything your app needs? Maybe close enough for some applications?

You can also do some brute-force backpressure stuff now: you can set a max heap size for a process, and if it uses an on-heap message queue, it should be killed if the queue gets too large. Not very graceful, but it creates some back pressure.

I'm a fan of letting back pressure accrue by having clients time out, and having servers drop requests that arrive too late to be serviced within the timeout, but you've got to couple that with effective monitoring and operations. Sometimes you do have to switch to a quick response to tell the client to try again later, or to other approaches.
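As a rough illustration of the "drop requests that arrive too late" idea, here is a small Python sketch. It is not taken from any Erlang/OTP library; `DeadlineQueue`, `submit` and `next_request` are made-up names, and the injectable `clock` is there only to make the behaviour easy to test.

```python
import time
from collections import deque


class DeadlineQueue:
    """FIFO of requests, each tagged with the time its client gives up."""

    def __init__(self, clock=time.monotonic):
        self._items = deque()
        self._clock = clock  # injectable so tests can control time
        self.dropped = 0     # count of shed requests, for monitoring

    def submit(self, request, timeout):
        # Record when the client will stop waiting for this request.
        self._items.append((self._clock() + timeout, request))

    def next_request(self):
        """Return the next request that can still be answered in time, or None."""
        while self._items:
            deadline, request = self._items.popleft()
            if self._clock() < deadline:
                return request
            # The client has already timed out; working on this would be wasted.
            self.dropped += 1
        return None
```

Under overload the server spends its time only on requests whose clients are still listening, and the `dropped` counter is the kind of number you would want on a dashboard to go with it.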

I wonder how much Erlang's roots are showing now? Telephone calls had a very specific "natural" profile: high but bounded concurrency (number of persons alive), long process lifetime (1 min to hours), few state changes/messages per process (I know nothing of the actual protocol). I could imagine that the agentic scenario matches this somewhat, whereas other scenarios, e.g. HFT, would have a totally different profile, making BEAM a bad choice. But then again, that's just the typical right-tool-for-the-job challenge.