by timothycrosley 3355 days ago
I would think that if you have a consumer application that can't handle double the configured rate limit during a very small corner case (the start and end of the minute boundary), you have bigger problems, since you're still effectively enforcing your rate limit over time with that approach. This just sounds like micro-optimization at its worst.
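To make the corner case concrete, here is a minimal sketch of a fixed-window counter. The class name `FixedWindowLimiter`, the limit of 5, and the timestamps are made up for illustration and are not taken from the article. It shows a client getting twice the limit through in about one second around the minute boundary, while still being held to the limit within each individual minute.

```python
class FixedWindowLimiter:
    """Hypothetical fixed-window limiter: counts requests per clock-aligned window."""

    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        # (key, window index) -> requests admitted; never pruned in this sketch
        self.counts = {}

    def allow(self, key, now):
        window = int(now // self.window_seconds)
        used = self.counts.get((key, window), 0)
        if used >= self.limit:
            return False
        self.counts[(key, window)] = used + 1
        return True


limiter = FixedWindowLimiter(limit=5)

# Five requests at the very end of minute 0, then ten attempts at the start of minute 1.
late = [limiter.allow("user-1", 59.5) for _ in range(5)]
early = [limiter.allow("user-1", 60.5) for _ in range(10)]

burst = sum(late) + sum(early)              # admitted within about one second
rest_of_minute = limiter.allow("user-1", 90.0)  # minute 1 is already used up

print(burst, rest_of_minute)  # 10 False
```

The burst is bounded at 2x the limit, and the client pays for it by being locked out for the remainder of the second window.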
3 comments

Agreed. Especially given that these rate limits seem to be aimed at stopping something catastrophic, like spammers using 100x their allotted capacity, a 2x inaccuracy shouldn't really matter. The solution was interesting, however, and I could see it being useful in a situation where users are expected to run very close to their rate limits. For example, I could see AWS being fairly careful about not letting anyone use more than their allotted compute/network bandwidth, because getting double the bandwidth without paying for it is a pretty big deal.
Yeah, I was also wondering how meaningful ~20MB of memory use really would be in this context, or how badly a racy token bucket would perform in the real world. Still, enjoyed the read.
I think this is an important point. Trying to store all of these in RAM means you can only have so many, which is why I really like something that can use a more cost-efficient DB as a backing store. Once you start thinking about what you could do if you could have 1000s of rate limits per user, you end up thinking of lots of interesting ways to use them, like limiting how often you log or track usage to once per hour per event per user. That's saved me a ton of money.
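A rough sketch of the "once per hour per event per user" idea, under some assumptions: the class `OncePerInterval` and its method names are hypothetical, and a plain dict stands in for the backing store (in practice this would be the cheaper DB mentioned above).

```python
class OncePerInterval:
    """Hypothetical limiter: at most one record per (user, event) per interval."""

    def __init__(self, interval_seconds=3600):
        self.interval_seconds = interval_seconds
        # Stand-in for a backing store: (user_id, event) -> time of last record
        self.last_seen = {}

    def should_record(self, user_id, event, now):
        key = (user_id, event)
        last = self.last_seen.get(key)
        if last is not None and now - last < self.interval_seconds:
            return False  # already recorded within the interval, skip it
        self.last_seen[key] = now
        return True


tracker = OncePerInterval()

first = tracker.should_record("u1", "page_view", now=0)         # first sighting
repeat = tracker.should_record("u1", "page_view", now=1800)     # 30 minutes later
other_event = tracker.should_record("u1", "export", now=1800)   # separate limit
next_hour = tracker.should_record("u1", "page_view", now=3600)  # interval elapsed

print(first, repeat, other_event, next_hour)  # True False True True
```

Each (user, event) pair is its own tiny rate limit, which is why the number of limits grows quickly and RAM-only storage becomes the constraint.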

Second thought: token buckets have a nice property of being really cacheable once they expire. You can push down a "won't refill until timestamp" and then clients can skip checking altogether.
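One way the "won't refill until timestamp" idea could look, as a sketch only. `TokenBucket`, `CachingClient`, and the `retry_at` return value are hypothetical names, and time is passed in explicitly so the example is deterministic. When the bucket is empty it reports when the next token will exist, and the client refuses locally until then instead of asking again.

```python
class TokenBucket:
    """Hypothetical token bucket that reports when the next token arrives."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.updated = now

    def take(self, now):
        """Return (allowed, retry_at); retry_at is None when allowed."""
        if now > self.updated:
            gained = (now - self.updated) * self.refill_per_second
            self.tokens = min(self.capacity, self.tokens + gained)
            self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, None
        # Time at which the bucket will hold one whole token again.
        retry_at = now + (1.0 - self.tokens) / self.refill_per_second
        return False, retry_at


class CachingClient:
    """Caches the retry timestamp and skips the limiter until it passes."""

    def __init__(self, bucket):
        self.bucket = bucket
        self.blocked_until = 0.0
        self.server_checks = 0

    def request(self, now):
        if now < self.blocked_until:
            return False  # answered from the cached timestamp, no check made
        self.server_checks += 1
        allowed, retry_at = self.bucket.take(now)
        if not allowed:
            self.blocked_until = retry_at
        return allowed


client = CachingClient(TokenBucket(capacity=2, refill_per_second=0.5))
results = [client.request(t) for t in (0.0, 0.0, 0.0, 1.0, 2.0)]

print(results, client.server_checks)
```

In this run the request at `t=1.0` never reaches the bucket, because the rejection at `t=0.0` already told the client that nothing would be available before `t=2.0`.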

The racy code can behave very poorly, based on some tests I did of my very first attempts at this!
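For anyone wondering what "racy" means here, this is a small illustration, not the article's code. `racy_interleaving` replays one bad schedule by hand (two requests both read the counter before either writes) rather than relying on real thread timing, and `AtomicCounter` is a hypothetical fix that does the check and the increment under a single lock.

```python
import threading

LIMIT = 5


def racy_interleaving(count):
    """Replay one bad schedule: requests A and B both read before either writes."""
    seen_a = count
    seen_b = count
    admitted = 0
    if seen_a < LIMIT:
        count = seen_a + 1
        admitted += 1
    if seen_b < LIMIT:
        count = seen_b + 1  # overwrites A's update, so one admission is lost
        admitted += 1
    return admitted, count


class AtomicCounter:
    """Check and increment happen together while holding the lock."""

    def __init__(self, count=0):
        self.count = count
        self._lock = threading.Lock()

    def try_admit(self):
        with self._lock:
            if self.count >= LIMIT:
                return False
            self.count += 1
            return True


# One slot left (4 of 5 used), two requests arrive together.
racy_admitted, racy_count = racy_interleaving(4)
counter = AtomicCounter(4)
atomic_results = [counter.try_admit(), counter.try_admit()]

print(racy_admitted, racy_count, atomic_results)  # 2 5 [True, False]
```

In the racy schedule both requests are admitted while the stored count only reaches 5, so the limiter under-counts and the overshoot grows with concurrency.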
Had we not discovered the attack, we could have faced a huge surge in our delivery costs and a decline in our email sender reputation.

This isn't just about the application being able to handle double the load; it's about keeping costs down and preventing someone from drastically increasing your bill. I work at a fintech company, and many of the 3rd party providers we use have a non-negligible cost per API call. You wouldn't want to wake up one morning and find that someone triggered one of those calls thousands of times while you were asleep.

But again, allowing double the calls within a minute at minute boundaries doesn't actually make any difference to that. Someone would have to be abusing it by more than just a little bit for you to know to take action, and by the same degree in either setup. I.e., you can't investigate every single time one customer goes over their limits by mistake if you have any sizable number of customers.