Hacker News
by philwelch 4841 days ago
Random load balancing still wouldn't be optimal because, in aggregate, the dynos that happen to receive more expensive requests still get overutilized and the dynos that receive less expensive requests still get underutilized. This is assuming--probably validly--that request cost follows a power law distribution.
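To make the imbalance concrete, here is a rough sketch that routes requests to dynos uniformly at random and adds up the work each one receives. The Pareto shape parameter, the dyno count, and the request count are assumptions chosen for illustration, not measurements from any real app.

```python
import random

def random_routing_load(num_dynos=10, num_requests=10_000, alpha=1.5, seed=0):
    """Total work (arbitrary cost units) landing on each dyno under random routing."""
    rng = random.Random(seed)
    load = [0.0] * num_dynos
    for _ in range(num_requests):
        cost = rng.paretovariate(alpha)          # heavy-tailed request cost (assumed)
        load[rng.randrange(num_dynos)] += cost   # router picks a dyno uniformly at random
    return load

load = random_routing_load()
mean = sum(load) / len(load)
print(f"busiest dyno: {max(load) / mean:.2f}x the mean load")
print(f"idlest dyno:  {min(load) / mean:.2f}x the mean load")
```

With a heavy tail, a handful of very expensive requests can dominate a dyno's total, so the spread between the busiest and idlest dyno tends to be wider than it would be with uniform request costs.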

Maybe you need entirely different load balancing strategies for different designs of web application, which means Heroku's promise of a single infrastructure stack for everything is bogus. But I'm skeptical that they deliberately chose to favor evented frameworks rather than choosing what was easiest for them to implement.


Your front end should not be doing anything "expensive," ever. If you're doing something like transcoding video in your frontend, your performance will go down the drain, for obvious reasons.

If your web app can only process one thing at a time, everything is expensive. Someone made a database query in the admin interface that runs for 20 seconds? Oops, hope you have other servers. Ten 100ms requests queued? The last guy has to wait 1000ms for his reply.

If it can process things concurrently, it doesn't matter, as long as you're not doing something that uses all the CPU (which shouldn't happen, assuming a sane design). Someone made a database query in the admin interface that runs for 20 seconds? Doesn't matter, you still have n-1 database connections to process the other queries. Ten 100ms requests hit your server? That's fine, the last guy will have to wait maybe 120ms for his reply.
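The two cases above can be sketched with asyncio, assuming each request spends its 100ms waiting on I/O (a database or web service call) rather than burning CPU. The sleep is a stand-in for that wait, and the handler names are made up for this example.

```python
import asyncio
import time

REQUEST_SECONDS = 0.1  # each simulated request waits 100 ms on I/O
NUM_REQUESTS = 10

async def handle_request(start, finished_at):
    await asyncio.sleep(REQUEST_SECONDS)  # stand-in for a DB or web-service call
    finished_at.append(time.perf_counter() - start)

async def serve_serially():
    # One request at a time: each one waits for everything queued ahead of it.
    start, finished_at = time.perf_counter(), []
    for _ in range(NUM_REQUESTS):
        await handle_request(start, finished_at)
    return finished_at

async def serve_concurrently():
    # All requests in flight at once: their I/O waits overlap.
    start, finished_at = time.perf_counter(), []
    await asyncio.gather(
        *(handle_request(start, finished_at) for _ in range(NUM_REQUESTS))
    )
    return finished_at

serial = asyncio.run(serve_serially())
concurrent = asyncio.run(serve_concurrently())
print(f"serial:     last reply after {serial[-1] * 1000:.0f} ms")
print(f"concurrent: last reply after {max(concurrent) * 1000:.0f} ms")
```

The serial run should land near 1000ms for the last reply and the concurrent run near 100ms plus scheduling overhead. None of this helps if the requests are CPU-bound, which is the caveat already made above.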

If your app cannot process things concurrently, it should not accept connections for further work if it is working on something, period. This shifts the load balancing back onto the load balancer, because it has to find a worker that can process a request. And guess what, random assignment works fine in that case.
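A toy queueing simulation of that difference, under assumed parameters (4 single-request workers, Poisson arrivals, exponential service times, 80% utilization). The "random" policy queues each request behind whichever worker was picked. The "idle_first" policy models workers that refuse connections while busy, so a request waits at the balancer and goes to the first worker to free up. Both policy names are invented for this sketch.

```python
import random

def simulate(policy, num_workers=4, num_requests=20_000, utilization=0.8, seed=1):
    """Mean queueing delay, in units of the mean service time, for one routing policy."""
    rng = random.Random(seed)
    arrival_rate = utilization * num_workers  # mean service time is 1.0
    free_at = [0.0] * num_workers             # when each worker next becomes idle
    now = 0.0
    total_wait = 0.0
    for _ in range(num_requests):
        now += rng.expovariate(arrival_rate)
        service = rng.expovariate(1.0)
        # Always drawn, so both policies see the same arrivals and service times.
        pick = rng.randrange(num_workers)
        if policy == "idle_first":
            # Busy workers refuse work; the request goes to the first one to free up.
            pick = min(range(num_workers), key=free_at.__getitem__)
        start = max(now, free_at[pick])
        total_wait += start - now
        free_at[pick] = start + service
    return total_wait / num_requests

random_wait = simulate("random")
idle_first_wait = simulate("idle_first")
print(f"queue behind a random worker: mean wait {random_wait:.2f}")
print(f"wait for a free worker:       mean wait {idle_first_wait:.2f}")
```

Under these assumptions the second number should come out several times smaller, because no request ever sits behind a busy worker while another worker is idle.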

If only Heroku allowed dynos to stop accepting further requests rather than queuing at the dyno level!

I don't see how forking is insufficient concurrency for the cases you mention, either. Even the Ruby GIL will allow threads to switch while blocking on I/O, so your ten second DB query is covered.

But there's no free lunch--if 2000ms of CPU ends up on one of your servers, and your median request is closer to 100ms, the unlucky server that gets the 2000ms request is still a little fucked without any mechanism to counterbalance. Even letting the server stop accepting new requests from the LB, as you suggested, would be sufficient.

> If only Heroku allowed dynos to stop accepting further requests rather than queuing at the dyno level!

It does and it's in the configuration for your app, not on the Heroku side of things. For example, if you're using Unicorn, check the documentation for :backlog, which controls the connection queue depth of the web server.[1]
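For anyone who hasn't met the knob before: Unicorn's :backlog is passed through as the backlog argument of the listen() syscall. Below is a language-neutral illustration of that argument using Python's socket module, not Unicorn configuration. The value 1 is a hypothetical choice, and the kernel is free to round or clamp it, so read it as a hint rather than a hard limit.

```python
import socket

BACKLOG = 1  # hypothetical value: ask the kernel to queue very few pending connections

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(BACKLOG)         # the backlog argument of listen(2)
host, port = server.getsockname()
print(f"listening on {host}:{port} with backlog {BACKLOG}")
server.close()
```

What a client sees once the queue is full (a refused connection, a timeout, a retry) depends on the operating system and on whatever sits in front of the server.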

> I don't see how forking is insufficient concurrency for the cases you mention, either. Even the Ruby GIL will allow threads to switch while blocking on I/O, so your ten second DB query is covered.

Can it handle 10, 20 or 100 long-running database connections or web service queries? I admit I don't know much about Ruby threading and per-thread overheads; for that particular case, green threads are perfectly fine.

> But there's no free lunch--if 2000ms of CPU ends up on one of your servers, and your median request is closer to 100ms, the unlucky server that gets the 2000ms request is still a little fucked without any mechanism to counterbalance. Even letting the server stop accepting new requests from the LB, as you suggested, would be sufficient.

There shouldn't be any requests that take that much CPU time, unless you're doing something ridiculous like sorting millions of integers or whatnot. If your application does, for some reason, need to sort millions of integers or serve other computationally expensive requests, those should be handled in a backend queue and the results sent to the client using long polling/comet/websockets/meta refresh. That way your front end can worry about delivering pages quickly and shoveling data between the backend and the client instead of crunching numbers.
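A minimal sketch of that split, assuming an in-process queue and a single worker thread so the example stays self-contained. In a real deployment the backend would be a separate process or machine, since a thread in the same interpreter still competes with the front end for CPU. The submit and poll names are made up for this example.

```python
import queue
import threading
import uuid

jobs = queue.Queue()   # work handed from the front end to the backend
results = {}           # job id -> result, filled in by the backend worker
results_lock = threading.Lock()

def backend_worker():
    while True:
        job_id, numbers = jobs.get()
        outcome = sorted(numbers)  # the "expensive" part
        with results_lock:
            results[job_id] = outcome
        jobs.task_done()

def submit(numbers):
    """Front-end handler: enqueue the work and return a job id immediately."""
    job_id = uuid.uuid4().hex
    jobs.put((job_id, numbers))
    return job_id

def poll(job_id):
    """Front-end handler for the client's follow-up request; None means not ready."""
    with results_lock:
        return results.get(job_id)

threading.Thread(target=backend_worker, daemon=True).start()

job_id = submit([5, 3, 1, 4, 2])
jobs.join()  # a real client would poll or hold a websocket instead of blocking here
print(poll(job_id))
```

The front-end handlers only enqueue and look things up, so they return quickly no matter how long the sort takes.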

[1] http://unicorn.bogomips.org/Unicorn/Configurator.html#method...