by Sol-
348 days ago
Do the major AI companies actually honor robots.txt? Even if some of their publicly known crawlers do, surely they also run surreptitious campaigns of hidden crawling, just as they illegally pirate books, images and user data to train on.

Maybe sites could add "you must honor policies set in robots.txt" to something like a terms of service, but I have no idea whether that would have enough teeth to make a crawler give up.