by eric-burel
321 days ago
They are normal scrapers, nothing specific to LLMs, as they are not yet used for training an LLM, unless I'm missing something about their architecture. So I don't get why they would be called LLM crawlers when they are search engine crawlers. At least they could be called RAG crawlers for better nuance. The article linked in the post's first sentence is more precise, as it deals with scrapers: https://techcrunch.com/2025/08/04/perplexity-accused-of-scra...
Some people may be OK with search engines but not with LLM training, so it's not the same deal.
> RAG crawlers
Very few people know what "RAG" is, so it makes little sense to mention it to anyone other than a technical audience.
> not LLM training
There's an issue of trust, because once content is scraped, it can also be used to train future models. That's really what ought to be emphasized IMO.