by lovehashbrowns
67 days ago
On the platform at my work, they scrape the same pages over and over and don’t bother to cache anything. It’s ridiculous to plan for. Our properties are all news-based, so warming the cache used to be as simple as loading the first X articles. With AI scrapers that isn’t viable, because they crawl as much as possible, including articles from 2018 and 2017. Management doesn’t want to block them, though, so we just suffer through the endless barrage. I was able to do a lot to mitigate it, like heavier caching, even with pgpool, but it’s crazy that this small subset of bots effectively accounts for 60%+ of our spend.