by jholman 4867 days ago
Re the distribution, absolutely. That "FIFTY TIMES" is totally due to the width of the distribution. Although, you know, even if their app were written such that every single request took exactly 100ms of dyno time, this random routing would create the problem all over again, to some degree: random assignment can still stack requests on a busy dyno while other dynos sit idle.
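To put a rough number on the "exactly 100ms" case, here's a minimal simulation sketch. The 10 dynos, the Poisson arrivals, and the 80 requests/second rate (about 80% utilization) are all my own assumptions, not figures from TFA or from Heroku. It compares random assignment against a single shared queue where the next free dyno takes the next request.

```python
import heapq
import random

SERVICE_TIME = 0.100  # every request takes exactly 100ms of dyno time


def poisson_arrivals(rate, count, rng):
    """Return `count` arrival times (seconds) with exponential gaps."""
    t = 0.0
    times = []
    for _ in range(count):
        t += rng.expovariate(rate)
        times.append(t)
    return times


def mean_wait_random(arrivals, num_dynos, rng):
    """Mean queueing delay when each request goes to a random dyno."""
    free_at = [0.0] * num_dynos  # when each dyno finishes its own backlog
    total = 0.0
    for t in arrivals:
        dyno = rng.randrange(num_dynos)
        start = max(t, free_at[dyno])
        total += start - t
        free_at[dyno] = start + SERVICE_TIME
    return total / len(arrivals)


def mean_wait_shared_queue(arrivals, num_dynos):
    """Mean queueing delay with one FIFO queue feeding the next free dyno."""
    free_at = [0.0] * num_dynos  # min-heap of times at which dynos free up
    total = 0.0
    for t in arrivals:
        start = max(t, heapq.heappop(free_at))
        total += start - t
        heapq.heappush(free_at, start + SERVICE_TIME)
    return total / len(arrivals)


if __name__ == "__main__":
    rng = random.Random(1)
    arrivals = poisson_arrivals(rate=80.0, count=20000, rng=rng)
    print("random routing:", mean_wait_random(arrivals, 10, rng))
    print("shared queue:  ", mean_wait_shared_queue(arrivals, 10))
```

Under these assumptions the random-routing wait should come out well above the shared-queue wait, even though no request is ever "slow".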

As for the intelligent routing, could you explain the problem? The goal isn't to predict which request will take a long time, the goal is to not give more work to dynos that already have work. Remember that in the "intelligent" model it's okay to have requests spend a little time in the global queue, a few ms mean across all requests, even when there are free dynos.

Isn't it as simple as just having the dynos pull jobs from the queue? The dynos waste a little time idle-spinning until the central queue hands them their next job, but that tax would be pretty small, right? Factor of two, tops? (Supposing that the time for the dyno-initiated give-me-work request is equal to the mean handling time of a request.) And if your central queue can only handle distributing to, say, 100 dynos, I can think of relatively simple workarounds that add another 10ms of lag for every factor-of-100 growth, which would be a hell of a lot better than this naive routing.
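Here's roughly what I mean by pull, as a toy sketch with threads standing in for dynos and an in-process `queue.Queue` standing in for the central queue. The function name and the job format are made up for illustration, and this says nothing about how Heroku's router actually works.

```python
import queue
import threading
import time


def run_pull_workers(jobs, num_workers):
    """Run (job_id, seconds) jobs on workers that pull work when idle.

    Returns a dict mapping job_id to the index of the worker that ran it.
    """
    work = queue.Queue()
    for job in jobs:
        work.put(job)

    handled = {}
    lock = threading.Lock()

    def worker(index):
        while True:
            try:
                # A worker only asks for a job once it has nothing to do.
                job_id, seconds = work.get_nowait()
            except queue.Empty:
                return  # queue drained, so this worker stops
            time.sleep(seconds)  # stand-in for handling the request
            with lock:
                handled[job_id] = index

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


if __name__ == "__main__":
    # One slow job plus several fast ones.
    jobs = [(0, 0.2)] + [(i, 0.01) for i in range(1, 9)]
    print(run_pull_workers(jobs, num_workers=2))
```

The point is that the worker stuck on the slow job never gets handed anything else while it is busy; the fast jobs drain through whichever worker is free.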

What am I missing?

1 comments

I think the problem is that any servers which can handle concurrent requests now need to decide how many requests they can handle. Since most application servers seem to have concurrency values of "1, ever" or "I dunno, lots", this is a hard problem.

Your solution would likely work if you had some higher level (application level? not real up on Heroku) at which you could specify a push vs. pull mechanism for request routing.

Yeah, I dunno squit about Heroku either.

Given that, according to TFA (and it's consistent with some other things I've read), Heroku's bread and butter is Rails apps, and given that, according to TFA, Rails is single-threaded, that (valid) point about concurrency in a single dyno is perhaps not that relevant? You'd think that Heroku would continue to support the routing model that almost all of their marketing and documentation advertises, right? Even if it's a configurable option, and it only works usefully with single-threaded servers?

And if you did do it pull-based, it wouldn't be Heroku's problem to decide how many concurrent requests to send. Leave it to the application (or whatever you call the thing you run on a dyno).

And it doesn't need to be pull-based, if the router can detect HTTP connections closing in dynos, or whatever.
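A sketch of that push-based variant, assuming the router can observe when a dyno's connection closes. `LeastBusyRouter`, `submit`, `connection_closed`, and `max_in_flight` are hypothetical names of mine, not anything from Heroku. The router keeps a per-dyno in-flight count, holds requests in its own queue when every dyno is at its limit, and dispatches again when a connection closes.

```python
import collections


class LeastBusyRouter:
    """Toy router: never hands a dyno more than max_in_flight requests."""

    def __init__(self, dyno_ids, max_in_flight=1):
        self.in_flight = {dyno: 0 for dyno in dyno_ids}
        self.max_in_flight = max_in_flight
        self.pending = collections.deque()  # requests waiting at the router

    def submit(self, request):
        """Dispatch to the least busy dyno; return its id, or None if queued."""
        dyno = min(self.in_flight, key=self.in_flight.get)
        if self.in_flight[dyno] >= self.max_in_flight:
            self.pending.append(request)
            return None
        self.in_flight[dyno] += 1
        return dyno

    def connection_closed(self, dyno):
        """Record a finished request on `dyno`.

        Returns the queued request now dispatched to that dyno, or None.
        """
        self.in_flight[dyno] -= 1
        if self.pending:
            self.in_flight[dyno] += 1
            return self.pending.popleft()
        return None


if __name__ == "__main__":
    router = LeastBusyRouter(["a", "b"])
    print(router.submit("r1"), router.submit("r2"), router.submit("r3"))
    print(router.connection_closed("b"))
```

With `max_in_flight=1` this is the single-threaded case; a concurrent app server would set its own limit, which leaves the "how many" decision with the application rather than the router.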

But the idea of pull-based work distribution is pretty straightforward. It's called a message queue.