by demetris
129 days ago
Thank you!

> The folks who crawl more appear to mostly be folks who are doing grounding or RAG, and also AI companies who think that they can build a better foundational model by going big.

But how can they aspire to do any of that if they cannot build a basic bot?

My case, which I know is the same for many people:

My content is updated infrequently. Common Crawl must have all of it. I do not block Common Crawl, and I see it (the genuine one from the published ranges; not the fakes) visiting frequently. Yet the LLM bots hit the same URLs all the time, multiple times a day.

I plan to start blocking more of them, even the User and Search variants. The situation is becoming absurd.