Hacker News
by joelby37 1253 days ago
I've been seeing exactly the same thing across a number of sites for the last few days, with exactly the same user agent as in this post. The annoying part is that it isn't really 'spidering' the sites; it continuously hammers a list of non-existent pages that appear to come from a years-old version of each site.

Generating 404 responses puts a considerable load on WordPress sites and generates a lot of network traffic, but these have been relatively easy to block because of the predictable user agent and URI path prefixes. I'm thinking about blocking the Azure ASN completely, or developing something akin to Cloudflare's "are you a human?" interstitial page when requests come from cloud provider ASNs.
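
For anyone wanting to do the same, here is a rough sketch of the user agent plus path prefix check. The substring and the prefixes are placeholders I made up for illustration; the real ones would come from your own logs. The function name `should_block` is hypothetical too.

```python
# Assumed values: the thread does not list the exact user agent string or
# the URI prefixes, so these are illustrative placeholders only.
BLOCKED_UA_SUBSTRING = "Version/15.1 Safari"
BLOCKED_PATH_PREFIXES = ("/old-section/", "/archive-2015/")


def should_block(user_agent: str, path: str) -> bool:
    """Return True only when both the user agent and the path prefix match."""
    ua_match = BLOCKED_UA_SUBSTRING in user_agent
    # str.startswith accepts a tuple and matches any of its prefixes.
    path_match = path.startswith(BLOCKED_PATH_PREFIXES)
    return ua_match and path_match
```

Requiring both conditions is a deliberate choice: the user agent alone would also catch real Safari visitors, so the stale path prefix is what narrows the rule to the bot traffic.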

1 comment

Yes - same here!

Started around Jan 12. Large pool of IP addresses, hard to block. Occasional brief DoS impacts, but mostly just annoying errors in my logs (if too many crop up we get automatic alerts). What was really puzzling is that many of the URLs are old (e.g. requests for details on hosted sites that no longer exist). I loaded up a 6-month-old backup database and confirmed those accounts weren't present, so the source list of URLs must be older than that. Really bizarre.

After reading this article I looked and confirmed via spot checks that the requests come from Microsoft IPs and report Safari 15.1 as the user agent.
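
A spot check like that can be scripted with Python's standard `ipaddress` module. This is only a sketch: the networks below are RFC 5737 documentation ranges standing in for the provider's published ranges, and `from_provider` and `spot_check` are names I picked for the example.

```python
import ipaddress

# Placeholder networks (RFC 5737 documentation ranges), not real provider
# ranges. Swap in the CIDR blocks the cloud provider publishes.
PROVIDER_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def from_provider(ip: str) -> bool:
    """Return True if the address falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in PROVIDER_NETWORKS)


def spot_check(ips):
    """Map each sampled log address to whether it matched a listed network."""
    return {ip: from_provider(ip) for ip in ips}
```

Feeding it a handful of addresses pulled from the access log gives a quick yes/no per address without having to look each one up by hand.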