| HN Mirror

Let's assume that it is unacceptable to have each dyno tell the router each time it finishes a request ( http://news.ycombinator.com/item?id=5217157 ). And also that the goal is to reduce worst-case latency. And also that we don't know a priori how many requests each dyno should queue before it is 'full' and rejects further requests ( http://news.ycombinator.com/item?id=5216771 ).

Proposal:

1) Once per minute (or less often if you have a zillion dynos), each dyno tells the router the maximum number of requests it had queued at any time over the past minute.

2) Using that information, the router recalculates a threshold once a minute that defines how many queued requests is "too many" (e.g. maybe if you have n dynos, you take the log(n)th-busiest-dyno's load as the threshold -- you want the threshold to only catch the tail).

3) When each request is sent to a dyno, a header fields is added that tells the dyno the current 'too many' threshold.

4) If the receiving dyno has too many, it passes the request back to the router, telling the router that it's busy ( http://news.ycombinator.com/item?id=5217157 ). The 'busy' dyno remembers that the router thinks it is 'busy'. The next time its queue is empty, it tells the router "i'm not busy anymore" (and repeats this message once per minute until it receives another request, at which point it assumes the router 'heard').

5) When a receiving dyno tells the router that it is busy, the router remembers this and stops giving requests to that dyno until the dyno tells it that it is not busy anymore.

I haven't worked on stuff like this myself, do you think that would work?