Hacker News
by JohnFen 1025 days ago
> as clear intent in robots.txt would show it is not.

Would it?

The primary purpose of robots.txt is not actually to lock out bots (that's why respecting it is not mandatory). It's to give the bots guidance as to which parts of your site are appropriate for them and which parts are not.
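
As a rough sketch of what that guidance looks like from the bot's side: the robots.txt content, the `ExampleBot` name, and the example.com URLs below are all made up for illustration. The check is something a well-behaved crawler chooses to run before fetching; nothing in the protocol enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler would consult this before each request and skip
# anything disallowed; an impolite one simply never calls it.
print(allowed("ExampleBot", "https://example.com/public/page.html"))
print(allowed("ExampleBot", "https://example.com/private/page.html"))
```

The server never sees the result of this check. It is purely advisory on the client side, which is the point about it being guidance rather than a lock.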

This may make the "clear intent" argument weak in court.

1 comment

> This may make the "clear intent" argument weak in court.

The standard has the keyword "Disallow", not "Avoid". I can't speak for anyone else of course, but that seems a pretty clear indicator of intent to me. By that I mean a site's stakeholders want to indicate that certain bots are disallowed from crawling a portion of their website.

But you and I aren't judges in a court. They go by different rules, such as the official intent and meaning of the robots.txt system itself.

I'm not saying a court wouldn't find intent signaled (I don't know), only that it's not clear-cut that it would.

Aside from whether intent is signalled or not, I would imagine courts may want to determine whether it is reasonable for a particular intent to go unhonored in a particular set of circumstances. As you say, honoring robots.txt isn't mandatory in itself. Perhaps other things might make it so? I don't know.