by drdaeman 4654 days ago
Sort of, but in machine-readable form and at a well-known location (like /robots.txt), so you could read and comply with them before you access the site.

As for those exact terms, I suspect (IANAL) they prohibit almost any access to the site: for example, they forbid any programmatic access to obtain information, and I haven't heard of any non-software user-agent implementations.
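
For what it's worth, /robots.txt already works the way I describe for crawl rules, and a client can check it before touching anything else. A minimal sketch using Python's standard urllib.robotparser; the policy text, the example.com URLs and the "scrape.py" user-agent string are made up for illustration, and a real client would download the file from the site first:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy text (an assumption for this example). A real client
# would fetch it from https://<host>/robots.txt before any other request.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a policy object that can be queried."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(policy: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the policy lets this user agent request the URL."""
    return policy.can_fetch(user_agent, url)


policy = build_policy(EXAMPLE_ROBOTS_TXT)
```

With that policy, may_fetch(policy, "scrape.py", "https://www.example.com/private/data") is refused while the front page is allowed. Note this only covers crawl rules, not full legal terms, which is exactly the gap.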


You can translate "programmatic" as "automated" as in "someone coded a program/tool to, in a programmatic way, access the website and retrieve the data"

As opposed to a human being in a non-programmatic way, opening his browser and accessing the website.

What's so hard about it?

> someone coded a program/tool to, in a programmatic way, access the website and retrieve the data

Doesn't Firefox, for example, fit this description perfectly? Yes, I do enter the base URL manually, but if that's the distinguishing feature...

> As opposed to a human being in a non-programmatic way, opening his browser and accessing the website.

... then manually typing in ./scrape.py www.att.com is non-programmatic, too. :)

Or, maybe, I'm not getting the correct meaning of "automated" due to bad English comprehension and false analogies from other languages. But I always thought every request on the Internet is automated and done by some kind of hardware+software combo, so forbidding "programmatic" access is complete nonsense (access control and rate-limiting are the proper solutions).

(And, if that matters, the author of scrape.py does not need to comply with AT&T's TOS if they don't actually run the script themselves.)
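
To make the rate-limiting point concrete: the server can simply cap how fast any client gets served, whether it is a browser or a script. A rough token-bucket sketch, not anything AT&T is known to run; the class name, parameters and injectable clock are my own choices for illustration:

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum number of stored tokens
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        """Spend one token if available; return False if the client must wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server would keep one bucket per client (per IP or API key, say) and answer with HTTP 429 whenever allow() returns False. No need to guess whether a human is on the other end.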