by buo
1466 days ago
I think this paragraph on the difficulty of building good independent indexes should not be overlooked. What's going on with Cloudflare?
> When talking to search engine founders, I found that the biggest obstacle to growing an index is getting blocked by sites. Cloudflare is one of the worst offenders. Too many sites block perfectly well-behaved crawlers, only allowing major players like Googlebot, BingBot, and TwitterBot; this cements the current duopoly over English search and is harmful to the health of the Web as a whole.
It does depend on the sites' settings though. Some are set to block all bots, and then you're kinda out of luck.
In general, I've found that like 99% of the problems you might encounter running a bot can be solved by just finding the right person and sending them an email explaining your situation. In almost all cases, they'll let you through.
[1] https://blog.cloudflare.com/friendly-bots/