Hacker News
by cma 1452 days ago
Holy shit, that's bad. Do unlisted YouTube and Google Drive share links get indexed through this?

You’d assume those have proper robots.txt configuration?
I have a disallow-all robots.txt on a production system, and have had it from the beginning.

Bing indexes it. This is my first major security incident and I have no idea how to fix this without making everything totally shitty for the users.

Some services ignore a global disallow, but will respect rules explicitly targeted at them.
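
For what it's worth, here is a rough sketch of what "explicitly targeted" could look like: a robots.txt with the global disallow plus a group naming the crawler, checked with Python's standard `urllib.robotparser`. The file contents, the `is_allowed` helper, and the example.com URL are made up for illustration, and this only shows how a standards-following parser reads the rules, not what Bing actually does with them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the global disallow, plus a group that
# names bingbot explicitly (the crawler name is an assumption here).
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: bingbot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a rule-following crawler may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/private/report"
    for agent in ("bingbot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

Both agents come back as disallowed here: the named group covers bingbot and the `*` group covers everyone else. None of this helps against a crawler that skips robots.txt entirely.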
I've put in a hard block for all crawlers on all pages. Works for my scenario, I think. Hopefully they don't lie in their user agent; if they do, it's going to be really bad.
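
A minimal sketch of that kind of hard block, assuming a Python WSGI app; this is not the actual setup described above. The `CRAWLER_TOKENS` list and the `looks_like_crawler` and `block_crawlers` names are hypothetical, and matching on the User-Agent header only works as long as the crawler identifies itself honestly.

```python
# Substrings that commonly appear in crawler User-Agent strings.
# This list is an illustrative assumption, not an exhaustive registry.
CRAWLER_TOKENS = ("bot", "crawler", "spider", "slurp")


def looks_like_crawler(user_agent: str) -> bool:
    """Case-insensitive substring match against the token list."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def block_crawlers(app):
    """Wrap a WSGI app so self-identified crawlers get a 403 on every page."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if looks_like_crawler(user_agent):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else reaches the real application unchanged.
        return app(environ, start_response)
    return middleware
```

Wrapping the application as `app = block_crawlers(app)` applies the check to all pages. A crawler that sends a browser-like User-Agent passes straight through, which is exactly the failure mode worried about above.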