|
|
|
|
|
by jhspaybar
4867 days ago
|
|
I can chime in here that I have had similar experiences at another large scale place :). Some requests would take a second or more to complete with the vast majority finishing in under 100MS. A solution was put in place that added about 5 MS to the average request, but also crushed the long tail(it just doesn't even exist anymore) and everything is hugely more stable and responsive. |
|
Proposal:
1) Once per minute (or less often if you have a zillion dynos), each dyno tells the router the maximum number of requests it had queued at any time over the past minute.
2) Using that information, the router recalculates a threshold once a minute that defines how many queued requests is "too many" (e.g. maybe if you have n dynos, you take the log(n)th-busiest-dyno's load as the threshold -- you want the threshold to only catch the tail).
3) When each request is sent to a dyno, a header fields is added that tells the dyno the current 'too many' threshold.
4) If the receiving dyno has too many, it passes the request back to the router, telling the router that it's busy ( http://news.ycombinator.com/item?id=5217157 ). The 'busy' dyno remembers that the router thinks it is 'busy'. The next time its queue is empty, it tells the router "i'm not busy anymore" (and repeats this message once per minute until it receives another request, at which point it assumes the router 'heard').
5) When a receiving dyno tells the router that it is busy, the router remembers this and stops giving requests to that dyno until the dyno tells it that it is not busy anymore.
I haven't worked on stuff like this myself, do you think that would work?