> And 5 is a non-solution given that simply adding more lines of execution does not address the root problem.

Actually, more threads of execution do solve the problem. The difference from just doubling the number of dynos is that on a single dyno, requests can be routed intelligently. Random routing sucks because request processing times have a fat-tailed distribution: there is a small but still significant chance that a request takes a really long time. If that request is routed to a random single-threaded dyno, all further requests routed to that dyno have to wait a very long time before they can be processed. If, however, the dyno had multiple threads of execution, the other requests would simply go to another thread. Blocking then only occurs if a single dyno gets N really long requests at roughly the same time, where N is the number of concurrent threads the dyno is running. The probability of N expensive requests hitting the same dyno at approximately the same time decreases very fast with increasing N.
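To make the queueing argument concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration, not a measurement of any real platform: requests arrive as a Poisson stream, 2% of them are "expensive" (5 s instead of 0.05 s), the router picks a dyno uniformly at random, and within a dyno a request goes to whichever thread frees up first. The function name `mean_wait` and all parameter values are made up.

```python
import random

def mean_wait(num_dynos, threads_per_dyno, num_requests=100_000,
              arrival_rate=60.0, seed=0):
    """Average time a request waits before processing starts.

    Assumed model: random routing to a dyno, then the earliest-free
    thread on that dyno serves the request.
    """
    rng = random.Random(seed)
    # free_at[d][t] is the time at which thread t of dyno d becomes idle.
    free_at = [[0.0] * threads_per_dyno for _ in range(num_dynos)]
    now = 0.0
    total_wait = 0.0
    for _ in range(num_requests):
        now += rng.expovariate(arrival_rate)            # next arrival
        # Two-point stand-in for a fat tail: 2% of requests are 100x slower.
        service = 5.0 if rng.random() < 0.02 else 0.05
        threads = free_at[rng.randrange(num_dynos)]     # random routing
        i = min(range(len(threads)), key=threads.__getitem__)
        start = max(now, threads[i])
        total_wait += start - now
        threads[i] = start + service
    return total_wait / num_requests

if __name__ == "__main__":
    for dynos, threads in [(10, 1), (20, 1), (10, 2)]:
        print(f"{dynos} dynos x {threads} thread(s): "
              f"mean wait {mean_wait(dynos, threads):.3f}s")
```

Under these assumptions, 10 dynos with 2 threads each should show a noticeably lower mean wait than 20 single-threaded dynos, even though both setups have the same total number of workers.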

Hand waving ahead! Let's say the probability of an expensive request blocking a dyno is p = 2%. If you double the number of dynos, the probability of blocking a given dyno becomes p/2 = 1%. If instead you have two execution threads on each dyno, the probability of blocking a dyno becomes p^2 = 0.04%. With 10 execution threads it is p^10, which is very small indeed.
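The arithmetic is easy to check. The sketch below uses the same hand-wavy model: each thread is assumed to be tied up by an expensive request independently with probability p, and a dyno counts as blocked only when all of its threads are tied up. The function names are made up for illustration.

```python
def doubled_dynos(p: float) -> float:
    """Blocking probability per dyno after doubling the dyno count."""
    return p / 2

def threaded_dyno(p: float, threads: int) -> float:
    """Blocking probability for one dyno with `threads` independent threads."""
    return p ** threads

p = 0.02  # assumed chance that an expensive request blocks a single thread
print(f"double the dynos: {doubled_dynos(p):.2%}")
for n in (2, 10):
    # 0.02 ** 2 is 0.0004, i.e. 0.04%
    print(f"{n} threads per dyno: {threaded_dyno(p, n):.3g}")
```

Independence between threads is a strong assumption, so treat these numbers as an order-of-magnitude intuition rather than a prediction.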

Here is a paper about it which makes that intuition precise and shows that even N=2 is a massive improvement over N=1: http://www.eecs.harvard.edu/~michaelm/postscripts/handbook20...

The problem is that this only works if each concurrent process of your application doesn't use too much memory, since the available memory on one dyno is quite low. For many applications you can't easily have multiple threads of execution on one dyno. The real solution is to have some form of intelligent routing. As the hand waving and the paper above show, you can make groups of dynos, have the main router route to a random group, and route requests intelligently within each group. The size of a group can be a small constant, say 10 dynos, so there shouldn't be any scalability problems with this routing approach. If you make the group size small enough, you could even run each group of dynos on a single physical machine, which would make intelligent routing among them even simpler.
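A minimal sketch of that two-level scheme, to show how little state it needs. "Intelligent" is interpreted here as "send the request to the dyno in the group with the fewest in-flight requests", which is my assumption; the class `GroupedRouter` and its methods are hypothetical and do not correspond to any real router API.

```python
import random

class GroupedRouter:
    """Random choice of group, then least-busy dyno within that group."""

    def __init__(self, num_dynos, group_size=10, seed=None):
        # Partition dyno ids into fixed groups of at most `group_size`.
        self.groups = [
            list(range(start, min(start + group_size, num_dynos)))
            for start in range(0, num_dynos, group_size)
        ]
        self.in_flight = [0] * num_dynos  # requests currently on each dyno
        self.rng = random.Random(seed)

    def route(self):
        """Pick a dyno for a new request and mark it as busy with it."""
        group = self.rng.choice(self.groups)                 # cheap, global
        dyno = min(group, key=lambda d: self.in_flight[d])   # smart, local
        self.in_flight[dyno] += 1
        return dyno

    def finish(self, dyno):
        """Call when the dyno has finished a request."""
        self.in_flight[dyno] -= 1

router = GroupedRouter(num_dynos=30, group_size=10, seed=1)
dyno = router.route()
router.finish(dyno)
```

Only the per-group counters need to be consistent, so each group could keep its own counts without any coordination with the other groups.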