Hacker News new | ask | show | jobs
by zubspace 3282 days ago
I'm not an expert in this field, but an engineer of vimeo went into detail, why this approach did not work for them. [1]

Problem with consistent hashing:

  However, consistent hashing comes with its own problem: uneven distribution of requests.
  Because of its mathematical properties, consistent hashing only balances loads about as
  well as choosing a random server for each request, when the distribution of requests is
  equal. But if some content is much more popular than others (as usual for the internet),
  it can be worse than that.
Problem with Power of 2 Load Balancing:

  Why wasn’t there a way to say “use consistent hashing, but please don’t overload any
  servers”? As early as August 2015, I had tried to come up with an algorithm based on
  the power of two random choices that would do just that, but a bit of  simulation said
  that it didn’t work. Too many requests were sent to non-ideal servers to be worthwhile.
Instead, he used something called Consistent Hashing with Bounded Loads.

[1] https://medium.com/vimeo-engineering-blog/improving-load-bal...

3 comments

Bounded load consistent hashing is interesting. It makes total sense that 2 random wouldn't work for vimeo, it actually doesn't work for us between visitors and our edge because we care a lot about cache data — we only use it between our load balancers and our customer's origin app instances.

For what we're doing, we actually need to consider more than just load. Since our LBs are distributed globally, we also want to make sure we're sending requests to backends that are geographically near them.

We can do this by tracking latency between the load balancer and origin servers, then using it to restrict the candidate pool we're going to choose two from at random.

They are different algorithm for different purpose.

Consistent hashing is used to always attach a request to the same host. It's the opposite of load balancing.

Load balancing algorithms (least connection, business, etc...) are used to distribute requests across servers as well as possible to maximize performances.

Usually both properties are desirable .. up to a point.

You want to minimize load on all servers, but you also want to pack things up efficiently (so minimize operational costs), but of course you want the benefits of caching, so you want requests from a sessions to land on the same node/server/box.

Basically a multi-dimensional optimization problem. Completely solvable with constraints. Let the business people decide what's more important, latency or throughput or low cost of operations.

It looks as though the approach proposed in the article is random, rather than attempting to use consistent hashing (as Vimeo investigated) - that may be why the results the Vimeo engineers found are worse than those the artcile suggests?