by koolba
616 days ago
> The Bytespider bot, much like those of OpenAI and Anthropic, does not respect robots.txt, the research shows. Robots.txt is a line of code that publishers can put into a website that, while not legally binding in any way, is supposed to signal to scraper bots that they cannot take that website’s data.

Do any of these scrapers uniquely and unambiguously identify themselves as bots? Or are those days long over?

Whether those days are over will depend greatly on the outcome of the ongoing New York Times vs OpenAI lawsuit. If OpenAI wins, it pretty much greenlights all the other scrapers to feast upon the web.