by LinuxBender 1863 days ago
I can't speak for Cloudflare, but crawling speed should be dictated by the site owner via the robots.txt Crawl-delay directive. [1] A site owner could also rate-limit unauthenticated requests by IP (taken from the Cloudflare header) and respond with a 429 Too Many Requests error page.

[1] - https://en.wikipedia.org/wiki/Robots_exclusion_standard#Craw...
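To make both halves of that concrete, here is a rough sketch in Python, not anyone's production setup. The crawler side reads Crawl-delay with the standard library's urllib.robotparser. The site side counts requests per client IP in a fixed window and answers 429 once the limit is passed. Assumptions on my part: the robots.txt content is made up, "the Cloudflare header" is taken to be CF-Connecting-IP, and FixedWindowLimiter, its limits, and the "examplebot" user agent are hypothetical names for illustration.

```python
import time
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def crawl_delay_for(robots_txt: str, user_agent: str) -> Optional[float]:
    """Return the Crawl-delay in seconds that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


class FixedWindowLimiter:
    """Allow at most `limit` requests per client IP in each `window`-second window."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.counts = {}  # ip -> (window_start, request_count)

    def status_for(self, headers: dict, now: Optional[float] = None) -> int:
        """Return the HTTP status to send: 200 if allowed, 429 if over the limit."""
        now = time.monotonic() if now is None else now
        # Assumption: the request came through Cloudflare, which sets this header.
        ip = headers.get("CF-Connecting-IP", "unknown")
        start, count = self.counts.get(ip, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        count += 1
        self.counts[ip] = (start, count)
        return 429 if count > self.limit else 200


if __name__ == "__main__":
    # A polite crawler would sleep this long between requests.
    print(crawl_delay_for(ROBOTS_TXT, "examplebot"))

    limiter = FixedWindowLimiter(limit=2, window=60)
    request_headers = {"CF-Connecting-IP": "203.0.113.7"}
    for _ in range(3):
        print(limiter.status_for(request_headers))
```

A fixed window is the simplest thing that works for a sketch; a real deployment would more likely do this at the proxy or edge, and would want to send a Retry-After header along with the 429.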

1 comment

This right here is the problem. These days no one wants to be RFC compliant; they just put the site behind a service and consider the problem solved.

So no problem, time to move on. Web search is no longer exciting.