Hacker News
by metronius 4106 days ago
I don't think you need so many pages: most of the 150B pages will never be shown in a SERP.
1 comments

True, you just need a subset. But how do you identify that subset without indexing the pages to find out whether each page belongs in it?

IIRC Google used to scan different pages at very different frequencies, quite possibly because it assigns pages to subsets every time it indexes them.
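
To make that guess concrete, here is a rough sketch of what "assign a page to a subset each time it is indexed" could look like. This is not Google's actual method; the tier names, the intervals, and the `Page` / `index_page` names are all made up for illustration, and the only signal used is whether the content changed since the last crawl.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

# Hypothetical tiers, ordered from most to least frequently crawled.
# The intervals (in days) are invented for illustration only.
TIERS = ["hot", "warm", "cold"]
TIER_INTERVAL_DAYS = {"hot": 1, "warm": 7, "cold": 30}


@dataclass
class Page:
    url: str
    tier: str = "warm"                  # assumed starting tier for unseen pages
    next_crawl_day: int = 0
    content_hash: Optional[str] = None  # None until the first crawl


def index_page(page: Page, content: str, today: int) -> Page:
    """Index a page and reassign its tier based on whether it changed."""
    new_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    position = TIERS.index(page.tier)

    if page.content_hash is not None:
        if new_hash != page.content_hash:
            # Content changed since the last crawl: crawl more often.
            position = max(position - 1, 0)
        else:
            # Content unchanged: crawl less often.
            position = min(position + 1, len(TIERS) - 1)

    page.tier = TIERS[position]
    page.content_hash = new_hash
    page.next_crawl_day = today + TIER_INTERVAL_DAYS[page.tier]
    return page
```

With this scheme a page that keeps changing drifts toward the daily tier, while a static page drifts toward the monthly one, so the subset membership falls out of the indexing pass itself instead of needing a separate classification step.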