by allyourhorses 1841 days ago
Easily solved by a user-configurable staleness value with some reasonable default. Google doesn't crawl the entire web in response to every query, because for the vast majority of queries it's unnecessary. For those where it might be necessary (like news), they instead crawl at a higher frequency or use some special flow (like they do for tweets). Either way, the result is seamless, involves no progress spinners, and is well suited to the vast majority of users.
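To make the staleness idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `StalenessCache` class, the `fetch` callback, and the one-hour default are hypothetical and not taken from any particular product. A cached result is reused while it is younger than the caller's staleness limit and refetched otherwise.

```python
import time

# Hypothetical default: results up to one hour old count as fresh enough.
DEFAULT_MAX_STALENESS = 3600.0  # seconds


class StalenessCache:
    """Serve cached results unless they are older than the caller allows."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # callable that does the expensive lookup
        self._clock = clock    # injectable so the cache can be tested
        self._entries = {}     # key -> (value, time the value was fetched)

    def get(self, key, max_staleness=DEFAULT_MAX_STALENESS):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] <= max_staleness:
            return entry[0]    # fresh enough: no refetch, no spinner
        value = self._fetch(key)  # missing or too stale: fetch again
        self._entries[key] = (value, now)
        return value
```

A caller that needs fresher data (the "news" case) would pass a smaller `max_staleness`; everyone else gets the default.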
1 comment

Funny you should bring up Google. Google is so hard on infrastructure that most big sites have special handling for requests from Google's crawler. At Reddit we put Google on its own, slower server cluster just so it didn't break the website.

We only did this because of the extreme value Google brings via traffic. Most other crawlers, and anything else that made speculative queries like that, were simply banned.
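For readers who want a picture of that kind of special handling, here is a rough sketch in Python. It is not Reddit's actual setup: the pool names, the token lists, and the `route_request` function are all hypothetical. It also assumes the crawler identifies itself in the User-Agent header, which can be spoofed, so a real deployment would likely verify the source some other way as well.

```python
# Hypothetical pool names; a real deployment would map these to backends.
CRAWLER_POOL = "crawler-pool"   # separate, slower cluster for valued crawlers
MAIN_POOL = "main-pool"         # regular user traffic

# Lowercase substrings matched against the User-Agent header.
ALLOWED_CRAWLER_TOKENS = ("googlebot",)
BLOCKED_CRAWLER_TOKENS = ("examplebot",)  # made-up name for illustration


def route_request(user_agent):
    """Return the pool for this request, or None if it should be refused."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in ALLOWED_CRAWLER_TOKENS):
        return CRAWLER_POOL     # valuable crawler: isolate it from users
    if any(token in ua for token in BLOCKED_CRAWLER_TOKENS):
        return None             # unwanted crawler: ban it
    return MAIN_POOL
```

The point of the separate pool is isolation: a heavy crawl can saturate its own cluster without slowing the site for everyone else.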