by perryprog 1109 days ago
What’s the benefit of waiting a random amount of time between requests?

Some zealous systems will infer a very regular request rate as coming from automated services and block them, no matter how gentle the rate.
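
For illustration, randomizing the wait between requests could look something like the sketch below. The function name `jittered_delays` and its parameters are made up for this example, and nothing here is known to match what any particular scraper or detection system does.

```python
import random


def jittered_delays(base_seconds, jitter_fraction, count, rng=None):
    """Return `count` delays spread uniformly around `base_seconds`.

    Hypothetical helper: with jitter_fraction=0.5 each delay falls
    between 50% and 150% of the base, so the average rate stays the
    same but the spacing is no longer perfectly regular.
    """
    rng = rng or random.Random()
    low = base_seconds * (1 - jitter_fraction)
    high = base_seconds * (1 + jitter_fraction)
    return [rng.uniform(low, high) for _ in range(count)]


if __name__ == "__main__":
    # A scraper would time.sleep(delay) before each request;
    # here we only print the schedule.
    for delay in jittered_delays(2.0, 0.5, 5):
        print(f"wait {delay:.2f}s")
```

The average request rate is unchanged, so this does nothing against plain rate limiting; it only avoids a fixed, clock-like interval.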
Is this speculation, or something that has actually been seen?
I can't say how the website I'm scraping would respond if I just went full throttle. But it's also just a matter of courtesy anyway not to make ten thousand requests per second.

Funny story - at work we once had a huge spike in requests from a single IP. We all crowded around, thinking it was some malicious hacker from France. How exciting - we're now interesting enough to warrant a DoS! Turns out another team in the company was just pulling all our data into Algolia to improve search. They were clearly not very courteous!

So on the other end (building APIs) I certainly do pay attention to traffic and have Grafana alerts set up around it.

I was asking about randomizing the interval between requests.
tends to be around not repeating same patterns.
Boy, this subthread has been exasperating to read.
I would block it if I were the maintainer. As the linked post mentions, automated requests can "ramp up" at any moment, risking server stability. By preemptively blocking automated parsing (on a resource whose primary use is individual requests, not mass ones) I would avoid future problems for myself. Let them contact us via support if they really need an exception.

In general, I would rate-limit anything connected to the internet by IP.
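
As a rough sketch of what per-IP rate limiting can mean, here is a sliding-window counter. The class name `PerIpRateLimiter` and its parameters are hypothetical, it keeps everything in memory, and it is not taken from any particular server or library.

```python
import time
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Allow at most `max_requests` per IP in any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip):
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Note that this only looks at how many requests arrive, not at how evenly they are spaced, so a randomized interval makes no difference to it.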

So the answer is it's just speculation and has never been seen in the wild :)
Kind of a snarky response. Obviously this has been seen in the wild. If you created an intrusion detection system to look for suspicious requests, I think one occurring over and over and at a regular interval would clearly be seen as malicious and not a genuine user.
> Obviously this has been seen in the wild.

Can you provide an actual example? I see it come up a lot in these conversations, but I'm really skeptical that anyone actually does this analysis. It seems like rate analysis (either requests or bandwidth) would achieve the same result in a far simpler manner, so I suspect that is what actually happens.
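
To make the distinction concrete: a check for "suspiciously regular" traffic, as opposed to a plain rate check, might measure how much the gaps between requests vary. The sketch below is purely hypothetical (the name `interval_regularity` and the use of the coefficient of variation are assumptions for illustration) and is not evidence that any real system works this way.

```python
import statistics


def interval_regularity(timestamps):
    """Coefficient of variation of the gaps between sorted timestamps.

    Values near 0 mean almost perfectly even spacing; larger values
    mean more irregular spacing. Returns None when there are fewer
    than two gaps to compare.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        return 0.0
    return statistics.pstdev(gaps) / mean_gap
```

A fixed-interval client scores at or near zero here regardless of how slow it is, which is the case a simple requests-per-minute threshold would not flag.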

Sure, you can interpret my answer like that, if that makes you happy.