by waithuh 1056 days ago
Heavily disagree. I own the server, and thus the website. I should be able to allow or disallow any type of web crawler/scraper I want. Just as you can't easily regulate what's on a website without lawsuits and takedowns, you can't regulate how discoverable a website is.
1 comments

> I should be able to allow or disallow any type of web crawler/scraper I want.

You're certainly allowed to try, but I don't see why indexers should be mandated to collaborate with you. They serve their users, not you.

Will their users appreciate that they disregard the intent of the authors of what they index?

I mean, "allow" or "regulate" don't _really_ apply here - there was never any enforcement regime around robots.txt, just a convention based on the general expectation that you don't claim ownership of whatever passes your line of sight.
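
To make the "convention, not enforcement" point concrete, here's a rough sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the bot names, and the `is_allowed` helper are all made up for illustration. The thing to notice is that the check runs entirely on the crawler's side: the server only publishes the file, and nothing stops a client from never calling this function.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for an illustrative site (not from any real server).
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the (hypothetical) robots.txt permits user_agent to fetch url.

    This is a voluntary, client-side check. A crawler that skips it
    can still request the URL; robots.txt itself blocks nothing.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleBot", "OtherBot"):
        for path in ("/page", "/private/notes"):
            url = "https://example.com" + path
            print(agent, path, is_allowed(agent, url))
```

If you actually want to keep a crawler out, that has to happen server-side (auth, rate limiting, blocking), which is a separate mechanism from robots.txt.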