by jively 3109 days ago

> A better approach is to use a “set-then-get” mindset, relying on atomic operators that implement locks in a very performant fashion, allowing you to quickly increment and check counter values without letting the atomic operations get in the way.

In a highly distributed system you’d probably want to avoid a centralised data store altogether for fast-moving data like rate limits. CRDTs and bucket weighting might be a more effective strategy.

The article states that tracking per node could cause race conditions, but that assumes the counter is the problem. If each node is aware of the other nodes and of the relative load of the cluster, it can use that information to weight its own isolated rate limiter. The only data that needs to be shared can then be broadcast between the nodes using a pub/sub mechanism.

If some variance is permitted (+/- a margin either side of the limit), then having the nodes synchronise their “token weight” based on the size of the cluster means each node can manage its share of the rate limit in memory without ever needing to track counters in a data store.
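To make the idea concrete, here is a minimal sketch in Python of a node-local token bucket whose capacity is the global limit divided by the cluster size. Everything in it is an assumption for illustration: the class name `WeightedLocalLimiter`, the `on_cluster_size` hook (which a pub/sub subscriber would call when membership changes), and the simple equal split between nodes. Weighting by relative load rather than node count would need a different `capacity` formula.

```python
import time


class WeightedLocalLimiter:
    """Token bucket holding this node's share of a cluster-wide limit.

    Hypothetical sketch: the share is simply global_limit / cluster_size.
    """

    def __init__(self, global_limit, window_seconds, cluster_size=1,
                 clock=time.monotonic):
        self.global_limit = global_limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.cluster_size = max(1, cluster_size)
        self.tokens = self.capacity  # start with a full local bucket
        self.last_refill = clock()

    @property
    def capacity(self):
        # This node's "token weight": an equal slice of the global limit.
        return self.global_limit / self.cluster_size

    def on_cluster_size(self, cluster_size):
        # Assumed to be called by a pub/sub handler when nodes join or leave.
        self._refill()
        self.cluster_size = max(1, cluster_size)
        # Never hold more tokens than the new, possibly smaller, share.
        self.tokens = min(self.tokens, self.capacity)

    def _refill(self):
        now = self.clock()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Refill at capacity tokens per window, capped at capacity.
        refill = elapsed * self.capacity / self.window_seconds
        self.tokens = min(self.capacity, self.tokens + refill)

    def allow(self, cost=1):
        # Purely in-memory decision: no round trip to a data store.
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With a global limit of 100 per minute and four nodes, each node admits up to 25 requests per minute on its own. If traffic is unevenly balanced, the cluster as a whole can under-admit, and during membership changes it can briefly over-admit, which is the variance mentioned above.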

It does trade off accuracy, but where accuracy matters you can revert to the set-then-get centralised counter, the trade-off being performance because of the increased round-trip time to the data store.
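For comparison, a rough sketch of the centralised set-then-get approach: increment first, then check the value the increment returned. `InMemoryCounterStore` is a hypothetical stand-in for a shared store with an atomic increment, and `allow_request` and the fixed-window key format are assumptions for illustration, not the article's implementation.

```python
import threading
import time


class InMemoryCounterStore:
    """Hypothetical stand-in for a central store with an atomic increment."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def incr(self, key):
        # Increment and return the new value as one atomic step.
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1
            return self._counts[key]


def allow_request(store, client_id, limit, window_seconds, now=None):
    """Set-then-get: bump the counter, then compare the returned value."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)   # fixed-window bucket index
    key = f"{client_id}:{window}"
    count = store.incr(key)               # one round trip in a real deployment
    return count <= limit
```

Every call to `allow_request` costs one trip to the store, which is where the extra latency comes from. A real store would also need to expire old window keys; the stand-in here keeps them forever.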

In most rate limit scenarios, at least from what we’ve seen, extreme accuracy usually matters less than being able to scale and rate limit without also having to scale a data layer to handle the counters.