Hacker News
by 9dev 1459 days ago
You’d assume those have proper robots.txt configuration?
1 comment

I have a disallow-all robots.txt for a production system, and have had one from the beginning.
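For anyone following along, here is a rough sketch of what a disallow-all file looks like and how to check what it says with Python's standard `urllib.robotparser`. The `is_allowed` helper and the example.com URL are made up for illustration. Note this only tells you what a well-behaved crawler should do; it does not stop one that ignores the file.

```python
from urllib.robotparser import RobotFileParser

# A disallow-all robots.txt: every crawler, every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: parse robots.txt text and ask whether
    user_agent may fetch url according to those rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The URL is a placeholder, not a real endpoint.
    print(is_allowed(ROBOTS_TXT, "bingbot", "https://example.com/private"))
```

The file is purely advisory, so a crawler can still fetch and index the page even though the rules say no.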

Bing indexes it. This is my first major security incident and I have no idea how to fix this without making everything totally shitty for the users.

Some services ignore a global disallow but will respect rules explicitly targeted at them.
I've put in a hard block for all crawlers on all pages. I think that works for my scenario. Hopefully they don't lie in their user agent; if they do, it's going to be really bad.
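A hard block like that could look roughly like the WSGI middleware below. This is only a sketch, not what I actually run: the `CRAWLER_TOKENS` list, `is_crawler`, and `block_crawlers` are all assumptions for illustration, and the substring match is deliberately crude.

```python
# Assumed list of substrings that commonly appear in crawler user agents.
CRAWLER_TOKENS = ("bot", "crawl", "spider", "slurp")


def is_crawler(user_agent: str) -> bool:
    """Return True if the user agent string contains a known crawler token."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def block_crawlers(app):
    """Wrap a WSGI app so requests from self-identified crawlers get a 403."""

    def middleware(environ, start_response):
        if is_crawler(environ.get("HTTP_USER_AGENT", "")):
            body = b"Forbidden"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Everything else is passed through to the real application.
        return app(environ, start_response)

    return middleware
```

This only catches crawlers that identify themselves honestly. Anything that sends a browser-like user agent goes straight through, so it is not a substitute for putting the pages behind authentication.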