|
|
|
|
|
by demetris
119 days ago
|
|
I don’t believe resips will be with us for long, at least not to the extent they are now. There is pressure and there are strong commercial interests against the whole thing. I think the problem will solve itself in some part. Also, I always wonder about Common Crawl: Is there is something wrong with it? Is it badly designed? What is it that all the trainers cannot find there so they need to crawl our sites over and over again for the exact same stuff, each on its own? |
|
The folks who crawl more appear to mostly be folks who are doing grounding or RAG, and also AI companies who think that they can build a better foundational model by going big. We recommend that all of these folks respect robots.txt and rate limits.