> If only Heroku allowed dynos to stop accepting further requests rather than queuing at the dyno level!

It does, and it's in the configuration for your app, not on the Heroku side of things. For example, if you're using Unicorn, check the documentation for :backlog, which controls the connection queue depth of the web server.[1]

> I don't see how forking is insufficient concurrency for the cases you mention, either. Even the Ruby GIL will allow threads to switch while blocking on I/O, so your ten second DB query is covered.

Can it handle 10, 20, or 100 long-running database connections or web service queries at once? I admit I don't know much about Ruby threading or per-thread overhead, and in that particular case green threads are perfectly fine.

> But there's no free lunch--if 2000ms of CPU ends up on one of your servers, and your median request is closer to 100ms, the unlucky server that gets the 2000ms request is still a little fucked without any mechanism to counterbalance. Even letting the server stop accepting new requests from the LB, as you suggested, would be sufficient.

There shouldn't be any requests that take that much CPU time, unless you're doing something ridiculous like sorting millions of integers. If your application really does need computationally expensive requests like that, they should be handled in a backend queue, with the result sent to the client via long polling, Comet, WebSockets, or a meta refresh. That way your front end only has to deliver pages quickly and shovel data between the backend and the client instead of crunching numbers.

[1] http://unicorn.bogomips.org/Unicorn/Configurator.html#method...
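
To make the :backlog point concrete, here is a rough sketch of what that could look like in a Unicorn config file. The file path, worker count, fallback port, and backlog value are all assumptions picked for illustration, not recommendations. As far as I know, Unicorn's default backlog is 1024, which is why a busy dyno will happily keep queuing connections unless you lower it.

```ruby
# config/unicorn.rb (hypothetical path; all values below are illustrative)

# Number of forked worker processes; each one serves a single request at a time.
worker_processes 3

# :backlog caps how many connections the kernel will queue on the listen
# socket while every worker is busy. A small value means an overloaded
# process turns connections away quickly instead of letting them pile up
# behind a slow request.
listen ENV.fetch("PORT", "3000").to_i, backlog: 16
```

Whether the router in front of the process then retries the rejected connection on another dyno depends on the load balancer, so treat this as one half of the fix rather than a guarantee.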