Hacker News
by TuringNYC 530 days ago
Serious question: if robots.txt is not being honored, is there a risk of a class action from tens of thousands of small sites against both the companies doing the crawling and the individual directors/officers of those companies? It seems there would be some recourse if this is done at a large enough scale.

No. robots.txt is not in any way a legally binding contract; no one is obligated to care about it.
If I have a "no junk mail" sign on my mailbox and you dump 500 lbs of flyers and magazines by my door every week for a month, and I lose money dealing with all the trash, I think I'd have reasonable grounds to sue even if there's no contract saying you need to respect my wishes.

At the end of the day, the claim is that someone's action caused someone else an undue financial burden in a way that could not easily be prevented beforehand. I wouldn't say it's a 100% clear case, but I'm also not sure a judge wouldn't entertain it.

I don't think you can sue over what amounts to an implied gentleman's agreement that one side never even agreed to, and win. But if you do, let us know.
You can sue whenever anyone harms you.
You can sue whenever.

The suit itself is the mechanism for determining whether the harm existed.

And yes, of course, this presents much opportunity for abuse.

I didn't say no one could sue; anyone can sue anyone for anything if they have the time and the money. I said I didn't think someone could sue over non-compliance with robots.txt and win.

If it were possible, someone would have done it by now. It hasn't happened because robots.txt has absolutely no legal weight whatsoever. It's entirely voluntary, which means it's perfectly legal not to volunteer.

But if you or anyone else wants to waste their time tilting at legal windmills, have fun ¯\_(ツ)_/¯.

You don't even need to mention robots.txt. Plenty of people have been sued for crawling and had to stop and pay damages; just look up "crawling lawsuits".
Your verbs, “sue” and “win”, are separated by ~16 words of flowery language. It’s not surprising that people gave up partway through and reacted to just the first verb.
You can sue over literally anything; the parent commenter could sue you if they could demonstrate that your reply damaged them in some way.
We need a way to apply a click-through "user agreement" to crawlers.