Hacker News
by jacobn 37 days ago
I just complained to them the other day! They were scraping our weather website nonstop, very much including the disallowed path prefixes.

Did end up just adding them to our WAF blocklist, which is weirdly ironic - hosting on their infra & using their services to block their AI scraper...

2 comments

I hope you leave it on the WAF. If they're only just deciding to respect robots.txt, which has been internet infrastructure forever, then it's probably still incredibly amateur software with 'Amazon-priorities' rather than 'responsible internet traffic' priorities.
The responsible internet is dead. Every big actor on the internet is selfish now that there's money involved, and has been for 20 years.

Google only respected it because blocking Google from crawling your site used to hurt you more than it hurt Google.

Time to switch to allow lists instead of block lists...
step 1: create the problem, step 2: sell the solution, step 3: profit