Hacker News
by bitpush 356 days ago
> Since a user requested the fetch, this fetcher generally ignores robots.txt rules.

Normally the expectation is that the user-agent faithfully presents the content it fetched.

If I make a browser that fetches bbc.com, strips away the ads, and presents the result to users, I would expect BBC to not like it and to block that user-agent from accessing the site. It isn't a robots.txt thing. It is a user-agent thing.