by jsheard 638 days ago
The sitemap.xml spec already has fields for indicating the last time a page was changed and how often it's expected to change in the future, so that search engines can optimize their updates accordingly, but AI scrapers tend to disregard that and just download the same unchanged page 10,000 times for the hell of it.
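For anyone who hasn't seen those fields, here is a rough sketch of generating a sitemap that carries them. The element names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`) and the namespace come from the sitemaps.org protocol; the `build_sitemap` helper and the example URL are made up for illustration, and it assumes you pass timezone-aware datetimes.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap.xml string.

    pages: iterable of (url, last_modified, changefreq) tuples, where
    last_modified is a timezone-aware datetime and changefreq is a string
    such as "weekly" or None to omit the element.
    (Hypothetical helper, not part of any library.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified, changefreq in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # W3C datetime format, normalized to UTC.
        stamp = last_modified.astimezone(timezone.utc).isoformat(timespec="seconds")
        ET.SubElement(entry, "lastmod").text = stamp
        if changefreq:
            ET.SubElement(entry, "changefreq").text = changefreq
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        ("https://example.com/", datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc), "weekly"),
    ]))
```

The important part is that `lastmod` only helps if it reflects the real modification time of the page, not the time the sitemap was regenerated.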

> sitemap.xml spec already has fields for indicating the last time a page was changed

I did not know that bit! I'm considering adding this to my site now, because it sounds like it would save a lot of resources for everyone. Do (m)any crawlers use this information in your experience?

https://developers.google.com/search/docs/crawling-indexing/...

Google ignores the <priority> and <changefreq> fields, but it does use <lastmod> to skip re-fetching pages that haven't changed since its crawler last visited. I'm not sure exactly which signals Bing uses, but it definitely uses <lastmod> as well.
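To make the crawler side concrete, here is a minimal sketch of that "skip if unchanged" check. It is not how Googlebot or Bing actually work internally; `urls_to_fetch`, `parse_lastmod`, and the `last_visited` mapping are hypothetical names, and it assumes a date-only or timezone-less `lastmod` should be read as UTC.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(text):
    """Parse a W3C datetime (date-only or full timestamp) into an aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    dt = datetime.fromisoformat(text.strip().replace("Z", "+00:00"))
    if dt.tzinfo is None:
        # Assumption: treat values without an offset as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


def urls_to_fetch(sitemap_xml, last_visited):
    """Return the URLs worth re-downloading.

    last_visited maps URL -> aware datetime of the crawler's previous fetch.
    A URL is fetched if it was never visited, has no lastmod, or was
    modified after the previous visit.
    """
    root = ET.fromstring(sitemap_xml)
    stale = []
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", namespaces=NS)
        if not loc:
            continue
        loc = loc.strip()
        lastmod = entry.findtext("sm:lastmod", namespaces=NS)
        seen = last_visited.get(loc)
        if seen is None or lastmod is None or parse_lastmod(lastmod) > seen:
            stale.append(loc)
    return stale
```

A scraper that kept even this much state would skip every page whose `lastmod` predates its previous visit, instead of downloading the same unchanged page over and over.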