by morusrubra
448 days ago
Although I think it's likely that these are "AI" bots, the real problem is the proliferation of rich and crappy crawlers. Whether or not legacy crawlers respect robots.txt, etc., they do seem to be sophisticated enough to determine when they're stuck in a loop. The home organizations of these new crawlers seem to have more money than sense, and their crawlers often get stuck in large dynamic sites for months without retrieving any new information. Amid all of the articles about building "bot traps," libraries have realized that they have unwittingly been in the bot-trapping business for years.