by ccgreg 293 days ago
That's not how search engines work. They have a good idea of which pages might be frequently updated. That's how "news search" works, and even small startup search engines like blekko had news search.

Indeed. My understanding is that crawling is a real expense at scale, so they optimize for "just enough" crawling to catch most sites' update rhythms, and then use other signals (blog pings, someone searching for a URL that hasn't been crawled yet, etc.) to selectively chase fresher content.
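To make the "just enough" idea concrete, here is a minimal sketch of one way a crawler could adapt its revisit interval to a site's update rhythm. It is an illustration, not how any particular search engine does it: the function names, the halve/double rule, and the 1-hour and 720-hour bounds are all assumptions picked for the example.

```python
def next_interval(current_hours, changed, min_hours=1.0, max_hours=720.0):
    """Return the next revisit interval in hours (hypothetical policy).

    If the page changed since the last fetch, come back twice as soon;
    if it did not, wait twice as long. The result is clamped to
    [min_hours, max_hours] so no page is hammered or forgotten.
    """
    factor = 0.5 if changed else 2.0
    return max(min_hours, min(max_hours, current_hours * factor))


def simulate(changes, start_hours=24.0):
    """Apply next_interval over a sequence of changed/unchanged fetches."""
    interval = start_hours
    history = []
    for changed in changes:
        interval = next_interval(interval, changed)
        history.append(interval)
    return history


if __name__ == "__main__":
    # A page that changed on two fetches in a row, then went quiet once.
    print(simulate([True, True, False]))
```

An external signal, such as a ping or a user searching for an uncrawled URL, would simply bypass the schedule and queue that URL for an immediate fetch.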
My experience is that a news crawl is not a big expense at scale, but so far I've only built one and inherited one. BTW, no one uses blog pings; the latest hotness is IndexNow.
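For anyone who hasn't seen IndexNow, here is a rough sketch of what a bulk submission looks like, based on my reading of the public IndexNow documentation. The endpoint and the JSON field names (host, key, keyLocation, urlList) are taken from that documentation as I understand it; the helper name and the example host and key are made up. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Shared endpoint as described in the IndexNow docs; participating
# search engines also expose their own (assumption: this one is current).
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls, key_location=None):
    """Build (but do not send) a POST announcing changed URLs.

    `key` is the site's own IndexNow key, which the site also hosts as a
    text file so the receiver can verify ownership. `key_location` is
    optional and points at that file when it is not at the site root.
    """
    payload = {"host": host, "key": key, "urlList": list(urls)}
    if key_location:
        payload["keyLocation"] = key_location
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host, key, and URL for illustration only.
    req = build_indexnow_request(
        "example.com", "abc123", ["https://example.com/new-post"]
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually submit it; check the current docs for response codes before relying on that.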