by dillondoyle 1912 days ago
Isn't that the website owner's right though? I'm not sure I understand the problem here.

If Google is taking traffic and reducing revenue, a company can disallow it in robots.txt. Google will actually follow those rules, unlike most of the other crawlers that are supposedly in this second class.
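For anyone who wants to see what that looks like, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `can_crawl` helper, the `ExampleBot` user agent, and the example.com URL are all made up for illustration; this only shows how a well-behaved crawler could read the rules, not how Google or anyone else actually implements it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Googlebot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page"  # placeholder URL
    print("Googlebot allowed:", can_crawl(ROBOTS_TXT, "Googlebot", page))
    print("ExampleBot allowed:", can_crawl(ROBOTS_TXT, "ExampleBot", page))
```

Of course this only works if the crawler chooses to run a check like this before fetching, which is the whole point: robots.txt is a request, not an enforcement mechanism.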

2 comments

Yup, no problem here, was just making an observation about how common such blocking was (and about the fact that some people were upset at being crawled by someone other than Google, despite not blocking them).

The company did respect robots.txt, though it was initially a bit of a struggle to convince certain project managers to do so.

> Isn't that the website owners right though?

No. The internet is public. Publishers shouldn't get any say in who accesses their content or how they do it. As far as I'm concerned, the fact that they do is a bug.

No, it's not. I can set up a login page and keep you out if I want. And I can do it however I want.
But your login page will be public and subject to being crawled.
My server, my rules.