| HN Mirror



	by mrighele 2559 days ago
	It's a pity that robots.txt doesn't let you specify what a crawler may do with the resources it's allowed to fetch. I think that if such a feature (or something similar, like a "License" header) had been standardized early enough, a few issues regarding crawling and search engines would be moot, or at least easier to solve automatically.
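
	A rough sketch of the idea, under loud assumptions: the `License` header and its `index` / `no-archive` tokens are made up for illustration, and nothing like them is standardized. Only the robots.txt check uses a real mechanism (Python's standard `urllib.robotparser`); `may_fetch`, `allowed_uses`, and `ExampleBot` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt: it can only say WHETHER a path may be fetched.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Real mechanism: ask the standard-library parser if `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def allowed_uses(headers: dict[str, str]) -> set[str]:
    """Hypothetical mechanism: read usage terms from an imagined 'License' header.

    Assumes a comma-separated token list, e.g. 'License: index, no-archive'.
    Header-name lookup is case-sensitive here to keep the sketch short.
    """
    raw = headers.get("License", "")
    return {token.strip().lower() for token in raw.split(",") if token.strip()}


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    if may_fetch(ROBOTS_TXT, "ExampleBot", url):
        # Pretend these headers came back with the response.
        uses = allowed_uses({"License": "index, no-archive"})
        print("may index:", "index" in uses)
        print("may archive:", "no-archive" not in uses)
```

	The point is the split: robots.txt answers "may I fetch this?", while the imagined header would answer "what may I do with it afterwards?", which a crawler could then act on automatically.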

1 comment

True, but then all the commercial websites would use it to ban scraping.