HN Mirror



	by drchickensalad 3357 days ago
	You have to send all the traffic from one client to one server then, right? That doesn't seem to come without heavy drawbacks. Otherwise, with bad luck, a client can easily hit its limit super early.


joneholland 3357 days ago

No, you evenly round robin all traffic to all servers.

Each server contains a map of tokens per client, filling at a fixed rate.

That rate is calculated by taking the total global token refresh rate and dividing it by the number of servers.

The end result is exactly the same, but now you have no shared state between servers and have eliminated the bottleneck of a central token bucket.
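A minimal sketch of what one server's side of this might look like, in Python. The names (`ServerBucket`, `global_rate`, `global_burst`, `allow`) are made up for illustration and are not from any particular library. It also refills continuously based on elapsed time instead of on a timer tick, which is a simplification of the fixed-interval fill described above.

```python
import time


class ServerBucket:
    """Per-client token buckets held locally by ONE server (hypothetical sketch)."""

    def __init__(self, global_rate, global_burst, num_servers, clock=time.monotonic):
        # Each server gets an equal share of the global refill rate and burst size.
        self.rate = global_rate / num_servers        # tokens per second on this server
        self.capacity = global_burst / num_servers   # max tokens a client can hold here
        self.clock = clock                           # injectable so it can be tested
        self.buckets = {}                            # client_id -> (tokens, last_seen)

    def allow(self, client_id):
        """Return True and spend one token if the client has one, else False."""
        now = self.clock()
        # A client seen for the first time starts with a full local bucket.
        tokens, last = self.buckets.get(client_id, (self.capacity, now))
        # Refill for the time elapsed since the last request, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        allowed = tokens >= 1
        if allowed:
            tokens -= 1
        self.buckets[client_id] = (tokens, now)
        return allowed
```

With a global limit of 30 requests per second and 3 servers, each server would be built as `ServerBucket(global_rate=30, global_burst=30, num_servers=3)` and would let a given client through at roughly 10 requests per second, with no coordination between servers.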


hyperpape 3357 days ago

Wait, each client does its own round robin (if you have three servers, I will hit 1 then 2 then 3)? Is that common?


joneholland 3357 days ago

The client doesn't do it. You put your front ends behind a load balancer like an ELB, or use a reverse proxy like Nginx.

Edit: And yes, round robin is the most commonly used load distribution technique, and works very well assuming each request has a roughly equivalent unit of work cost.


hyperpape 3356 days ago

I'm surprised you wouldn't run into cases where the requests being rate-limited end up unevenly distributed between servers. There are assumptions you could add that would make that not a problem, but I'm surprised they'd hold.
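One way to construct such a case, as a toy sketch in Python: assume the load balancer does strict round robin over all incoming requests, not per client, and two clients happen to alternate. The function name `hits_per_server` and the arrival pattern are invented for the example.

```python
from itertools import cycle


def hits_per_server(arrivals, num_servers, client):
    """Count how many of `client`'s requests land on each server when the
    balancer round-robins over ALL requests in arrival order."""
    counts = [0] * num_servers
    servers = cycle(range(num_servers))
    for requester in arrivals:
        server = next(servers)  # the balancer picks the next server in turn
        if requester == client:
            counts[server] += 1
    return counts


# Clients A and B strictly alternate: A, B, A, B, ...
arrivals = ["A", "B"] * 10

print(hits_per_server(arrivals, 2, "A"))  # [10, 0]
print(hits_per_server(arrivals, 3, "A"))  # [4, 3, 3]
```

With two servers, every request from A lands on the same server, so A is cut off at that server's share (half of the global limit). With three servers the same traffic spreads almost evenly, so whether this bites depends on how arrivals interleave.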
