Hacker News
by mrweasel 4057 days ago
My experience is that the worst bots don't respect robots.txt anyway.

Getting crawled by the major search engines typically isn't that bad; they tend to know what they're doing. Getting hammered by some crappy local search engine is what's annoying.

We don't limit any bots, except once when we completely blocked Eniro in our firewall. Google, Bing and a ton of others could index at the same time with no issue. Eniro, for some reason, decided to index way too much at once, with no reaction to robots.txt and no reply from the email address they so kindly included in the headers.

But I see your point, it's just a bit sad when Google has become "The Internet".

1 comment

I thought FB was the internet. Googlebot is just the Kleenex of indexers.