by RyanZAG 4842 days ago
The issue is that while the average time is going to converge - users aren't interested in the average time!

Consider the following scenario:

  - Two types of requests, A and B
  - Request A takes 200ms to process
  - Request B takes 10 seconds to process
  - Each dyno can take 4 concurrent requests

If you are receiving 1000s of requests per minute, it is very likely that you will eventually allocate 4 or more B requests to a single dyno. Once this has happened, that dyno is locked for 10 seconds. Every request A that we route to that dyno in the meantime may take 10 seconds or more to return.

If request A is a credit card transaction, and request B is just some data lookup with a loading UI spinner, then every time this occurs our poor app has lost money as the user has navigated away from our app before the transaction (request A) can complete! Ouch!

Intelligent routing solves this by ensuring that request A will only go to a dyno that has a free slot, so no user has to wait 10 seconds for their quick credit card transaction.
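
To make the scenario above concrete, here is a rough simulation sketch. Everything in it is my own assumption rather than how Heroku's router actually works: the function names, the arrival rate, the 5% share of B requests, and the "least busy" rule (send each request to the dyno whose next slot frees up soonest) are all made up for illustration.

```python
import heapq
import random

SLOTS_PER_DYNO = 4
DURATION = {"A": 0.2, "B": 10.0}  # seconds, taken from the scenario above


def random_router(dynos, rng):
    """Pick any dyno, ignoring how busy it is."""
    return rng.randrange(len(dynos))


def least_busy_router(dynos, rng):
    """Hypothetical 'intelligent' rule: pick the dyno whose next slot frees soonest."""
    return min(range(len(dynos)), key=lambda i: dynos[i][0])


def simulate(requests, num_dynos, router, seed=0):
    """requests: list of (arrival_time, kind), sorted by arrival time.

    Each dyno is a min-heap holding the time at which each of its slots
    becomes free. Returns a list of (kind, latency) in arrival order.
    """
    rng = random.Random(seed)
    dynos = [[0.0] * SLOTS_PER_DYNO for _ in range(num_dynos)]
    results = []
    for arrival, kind in requests:
        dyno = dynos[router(dynos, rng)]
        slot_free = heapq.heappop(dyno)   # earliest slot to free up on this dyno
        start = max(arrival, slot_free)   # wait in the dyno's queue if it is full
        finish = start + DURATION[kind]
        heapq.heappush(dyno, finish)
        results.append((kind, finish - arrival))
    return results


def worst_a_latency(results):
    """Slowest request A, i.e. the number the user actually feels."""
    return max(latency for kind, latency in results if kind == "A")


if __name__ == "__main__":
    rng = random.Random(42)
    t, workload = 0.0, []
    for _ in range(5000):
        t += rng.expovariate(20.0)  # assumed: roughly 20 requests per second
        workload.append((t, "B" if rng.random() < 0.05 else "A"))  # assumed: 5% are B
    for name, router in [("random", random_router), ("least busy", least_busy_router)]:
        print(name, "worst A latency:", round(worst_a_latency(simulate(workload, 10, router)), 2))
```

The point of tracking `worst_a_latency` rather than a mean is the same as the argument above: a handful of A requests stuck behind B requests barely moves the average but is exactly what users notice.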

The takeaway here is that the important measure is the slowest user-facing request, not the average request time across all requests.

2 comments

How about factoring your A-type requests and B-type requests into two separate apps?

Oh, wait, Heroku already tells you to do this: the B-type requests are why they introduced "workers." Heroku's recommendation has always been that you're supposed to write your "app servers" to serve A-type requests directly, while asynchronously queuing your B-type requests to be consumed by workers. You then either poll the app server for the completion state of the long requests, or have the worker queue a return-value paired to the request-ID, that the app server can dequeue and return along with the next request. (Erlang's process-inbox-based message-passing semantics, basically.)
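
A minimal sketch of the enqueue-then-poll shape described above, in Python. This is only an illustration under loud assumptions: an in-process `queue.Queue` and a thread stand in for a real job queue and a separate worker process, and the names `handle_slow_request`, `poll`, and `worker` are hypothetical, not any Heroku API.

```python
import queue
import threading
import uuid

jobs = queue.Queue()             # stands in for a real job queue (assumption)
results = {}                     # job_id -> result; stands in for a shared store
results_lock = threading.Lock()


def handle_slow_request(payload):
    """App-server side: enqueue the B-type work and return a job id right away."""
    job_id = str(uuid.uuid4())
    jobs.put((job_id, payload))
    return job_id


def poll(job_id):
    """App-server side: cheap status check the client can call repeatedly."""
    with results_lock:
        if job_id in results:
            return {"status": "done", "result": results[job_id]}
    return {"status": "pending"}


def worker(process):
    """Worker side: consume jobs until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:
            break
        job_id, payload = item
        value = process(payload)  # the long-running part happens here
        with results_lock:
            results[job_id] = value
```

The app server only ever does the fast parts (enqueue, look up a result), so the slow work never occupies one of its request slots.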

To put it another way, it's the old adage of "don't do long calculations on the UI thread." In this case we have a Service-Oriented Architecture, so we've got a UI service--but we still don't want it to block. By default, Heroku basically exposes Unix platform semantics; unless you wrap those with Node or Erlang, you have to deal with how Unix does SOA: multiple daemons, and passing messages over sockets. Heroku could "intelligent-route" all they want, but there's no level of magic that can overcome applications designed in ignorance of how concurrent SOA architecture works on the platform you're designing for.

However, the more concurrency your stack can do, the less this matters. Assuming the probability of a node getting assigned a long request is 1/5, the probability of a node getting assigned 4 of them is (1/5)^4 = 1/625. If your stack can do 20 concurrent requests, it's (1/5)^20 = 1/95367431640625.[1]
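
For anyone who wants to check the arithmetic without Wolfram Alpha, here is a small sketch using exact fractions. The helper name `p_all_slots_long` is made up, and it bakes in the same simplifying assumption as the numbers above: each slot independently holds a long request with probability 1/5.

```python
from fractions import Fraction


def p_all_slots_long(p_long, slots):
    """Probability that every one of `slots` independent slots holds a long request."""
    return Fraction(p_long) ** slots


for slots in (4, 20):
    print(slots, "slots:", p_all_slots_long(Fraction(1, 5), slots))
```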

Note that this overlooks the fact that a node that has requests queued only drains them at a certain rate, as long requests tend to pile up. However, if your framework is truly concurrent and your nodes are not CPU bound, but rather waiting for other services, it's possible to get much higher concurrency than just 20.

[1] http://www.wolframalpha.com/input/?i=%281%2F5%29%5E20

Yes, the "just buy more dynos even when better routing would let you get by with less" strategy. I can see the appeal in this, for Heroku.

You misunderstood my reply. I meant that each dyno does more work by being concurrent, not that you buy more dynos.

It's possible to do several hundred requests per second on a single dyno; just because Rails doesn't allow you to do that due to not being concurrent doesn't mean that other stacks can't.

Rails isn't "not concurrent", it simply uses multiprocess concurrency rather than multithreaded concurrency. Unicorn, for instance, uses a forking model, which is why Heroku recommends it. Ruby 2.0 also improves the runtime's copy-on-write performance, which makes it cheaper to fork processes.

Still, there's a resource limit to how much concurrency you can achieve within a dyno as opposed to by buying more dynos. You shouldn't have to scale horizontally just because the routing scheme is inefficient with the width that it has.

If Heroku really wanted to abandon Rails because other web stacks are easier for them to scale with, they should have let us know rather than turning around and shitting on the platform that made them as a business.

Rails is not concurrent. Forking the process does not make it concurrent, either. Parallel, yes, but not concurrent.

Concurrency is the ability to do work on multiple things at once, while parallelism is the ability to execute multiple things at once. For example, an operating system running on a uniprocessor is concurrent, but not parallel. Another example is HAProxy, which is highly concurrent but not parallel (it can handle several thousand connections, but is single-threaded).

The distinction is important when you're trying to scale. Adding more threads/processes does not help, as you quickly reach OS limits (e.g., the C10k problem). Having a concurrent web stack (Node.js, EventMachine, Play, Xitrum, Twisted, Yaws, etc.) does, as it allows extremely large concurrency with a limited resource impact, whereas adding more processes quickly hits memory limits (at least on Heroku).
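
As a rough illustration of "concurrent but not parallel", here is a sketch using Python's asyncio (not one of the stacks listed above, just a convenient stand-in). One thread handles many in-flight requests because each one spends its time waiting, not computing. The names `fake_request` and `serve_many` are hypothetical, and `asyncio.sleep` stands in for waiting on a database or another service.

```python
import asyncio
import time


async def fake_request(delay):
    """One request that is I/O bound: it only waits, it does not burn CPU."""
    await asyncio.sleep(delay)  # stands in for a DB call or upstream service
    return delay


async def serve_many(n, delay):
    """Run n requests concurrently on a single thread; return (count, elapsed seconds)."""
    start = time.monotonic()
    done = await asyncio.gather(*(fake_request(delay) for _ in range(n)))
    return len(done), time.monotonic() - start


if __name__ == "__main__":
    count, elapsed = asyncio.run(serve_many(1000, 0.1))
    print(f"{count} requests in {elapsed:.2f}s on one thread")
```

Handled one at a time, those 1000 requests would take about 100 seconds; overlapped on one event loop they finish in roughly the time of a single wait. This only holds while the requests are waiting on something else; CPU-bound work would still need parallelism.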

By that definition, it's more typical for databases and load balancers to be highly concurrent. If you have that, then all you need from your app servers is parallelism. Unless you have an especially bad load balancer that does something stupid like distribute requests randomly, at which point your app servers have to do some of their own load balancing.

And even at that point, you're still using more dynos than you would with intelligent routing.