Hacker News new | ask | show | jobs
by zbentley 8 days ago
Not the case; good abstractions are valuable, but the performance differences between runtimes are very real.

Take the example of some simple HTTP<->blob store service gets slammed with millions of requests when someone using the API does a backfill via some framework on their end that aggressively scales request volume up and out.

Something like, say, async Python/starlette with a coroutine per request is gonna perform slightly worse than Erlang, which in turn is gonna perform much worse than Go.

You're right that those differences are sometimes marginal when the latency of whatever IO the backend's doing dominates the equation. However, in my experience huge volume surges show issues with the runtime (the thing managing/launching multiplexed request handler routines) or the ecosystem (the backend IO libraries' ability to work with the runtime's IO multiplexing and make things like request coalescing easy or automatic) more often than you'd think.

It really takes surprisingly little volume to cripple a return-hello-world Phoenix app that indirects the "hello world" behind way too much middleware and message passing; it takes even less to kick over, say, a Gunicorn instance returning "hello world" at the bottom of the Django middleware stack. Golang with Gin, on the other hand, is surprisingly hard to cripple in the same way. And I say that as someone who likes Elixir and Python a lot more than I like Go!

2 comments

Thank you. As a guy who made a career out of Elixir (and begins to regret it recently but oh well) I agree that Elixir's throughput is not amazing. However, it can get very far and we should always optimize for the most common usages.

I've personally rewritten one hobby and one professional projects from Elixir to Golang and loved the result; as you said, extremely difficult to bring down a Golang service to its knees.

One clarification: Phoenix server behind Caddy/nginx fairs better btw. But, details. Your point stands.

I am yet to see a Rust web/API service I wrote to _ever_ buckle under pressure and just crash. It was either an application bug (like the famous Cloudflare's `.unwrap()` error from the last weeks/months) or the Linux OOM killer. Literally never crashed. But I did witness it brutally murder a MySQL cluster because it couldn't serve it fast enough. That was both fun and terrifying to watch on the dashboards.

> I did witness it brutally murder a MySQL cluster because it couldn't serve it fast enough. That was both fun and terrifying to watch on the dashboards.

Haha yep. In my experience, everyone running CGI/process-per-request application servers is bullish on switching to a concurrent or cooperative runtime...until they realize they just removed the primary ratelimiter on downstream DB/service accesses.

The converse war stories are also amusing: people rewrite their whole app in a concurrent/asynchronous framework and nothing changes, because the DB driver is still farming out all queries to a tiny fixed-size threadpool of connections that was the bottleneck all along.

Oh yeah, definitely. If your DB server (or any storage backend) cannot have like 200+ connections alive at all times then it's absolutely pointless rewriting your app in Elixir or Golang. You'll just serve DB timeouts in your responses.
> You're right that those differences are sometimes marginal when the latency of whatever IO the backend's doing dominates the equation. However, in my experience huge volume surges show issues with the runtime (the thing managing/launching multiplexed request handler routines) or the ecosystem (the backend IO libraries' ability to work with the runtime's IO multiplexing and make things like request coalescing easy or automatic) more often than you'd think.

fair enough, although at this point we start talking about LB in front of the thing, consumption mechanics, autoscaling signals

i will still maintain that my simple advice for a dev worrying about scale, is that they should focus their efforts on ensuring downstream IO doesn't get overwhelmed (db read replicas, caching, etc) before optimizing runtime performance or autoscaling out unnecessarily.

> focus their efforts on ensuring downstream IO doesn't get overwhelmed (db read replicas, caching, etc) before optimizing runtime performance or autoscaling out unnecessarily.

All good advice, but the choice of runtime can affect the point at which autoscaling and load balancing even need to enter the conversation at all. Optimizing, say, a mostly in-memory cache service and writing it in Golang may yield results like "we can run a single instance of this and serve three orders of magnitude of business growth; slap it behind a DNSRR or a k8s NodePort for update/replacement/fast failover if it crashes, no complex load balancer needed", where writing the same thing in, say, PHP might require discussing orchestration/load balancing/memory/worker process recycling/autoscaling early on in the service's lifetime. Being able to skip those conversations (entirely or for a long time) is a very significant business benefit.