Long ago I was responsible for implementing a “rate limiting algorithm”, but not for HTTP requests. It was for an ML pipeline, with human technicians in a lab preparing reports for doctors and in dire cases calling their phone direct. Well my algorithm worked great, it reduced a lot of redundant work while preserving sensitivity to critical events. Except, some of the most common and benign events had a rate limit of 1 per day.
So every midnight UTC, the rate limit quotas for all patients would “reset” as the time stamp rolled over. Suddenly the humans in the lab would be overwhelmed with a large amount of work in a very short time. But by the end of the shift, there would be hardly anything left to do.
Fortunately it was trivial to add a random but deterministic per patient offset (I hashed the patient id into a numeric offset).
That smoothly distributed the work throughout the day, to the relief of quite a few folks.
Long ago I was responsible for implementing a “rate limiting algorithm”, but not for HTTP requests. It was for an ML pipeline, with human technicians in a lab preparing reports for doctors and in dire cases calling their phone direct. Well my algorithm worked great, it reduced a lot of redundant work while preserving sensitivity to critical events. Except, some of the most common and benign events had a rate limit of 1 per day.
So every midnight UTC, the rate limit quotas for all patients would “reset” as the time stamp rolled over. Suddenly the humans in the lab would be overwhelmed with a large amount of work in a very short time. But by the end of the shift, there would be hardly anything left to do.
Fortunately it was trivial to add a random but deterministic per patient offset (I hashed the patient id into a numeric offset).
That smoothly distributed the work throughout the day, to the relief of quite a few folks.