| HN Mirror



	by jedberg 1833 days ago
	If you own the frontend and the database, sure. But this product hits other people's databases. The only way they can optimize this is by speculatively running queries they think the user might want. They can't cache the response because they have no way of knowing whether it changed, since, again, they don't control the data source. If I owned the database they were pulling data from, I'd be mighty upset at the insane number of useless queries they'd have to make guessing what the user wants.


allyourhorses 1833 days ago

Easily solved by a user-configurable staleness value with some reasonable default. Google doesn't crawl the entire web in response to every query because for the vast majority of queries it's unnecessary. For those where it might be necessary (like news), they instead crawl at a higher frequency or use some special flow (like they do for tweets). Either way the result is seamless, involves no progress spinners, and is well suited for the vast majority of users.
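
For illustration, a minimal sketch of what a user-configurable staleness value could look like, assuming a simple in-memory cache keyed by query text. The names `StalenessCache`, `fetch`, and `DEFAULT_MAX_STALENESS_S`, and the 300-second default, are hypothetical and not taken from the product under discussion.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical default; a real product would pick this per data source.
DEFAULT_MAX_STALENESS_S = 300.0


class StalenessCache:
    """Serve a cached result unless it is older than the allowed staleness."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        max_staleness_s: float = DEFAULT_MAX_STALENESS_S,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch                  # runs the query on the remote source
        self._max_staleness_s = max_staleness_s
        self._clock = clock                  # injectable so it can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, query: str, max_staleness_s: Optional[float] = None) -> Any:
        # A per-call value overrides the configured default.
        limit = self._max_staleness_s if max_staleness_s is None else max_staleness_s
        now = self._clock()
        entry = self._entries.get(query)
        if entry is not None and now - entry[0] <= limit:
            return entry[1]                  # fresh enough: no remote query
        result = self._fetch(query)          # too old or missing: hit the source
        self._entries[query] = (now, result)
        return result
```

With this shape the remote database sees at most one request per query per staleness window, and a user who needs fresher data can pass a smaller `max_staleness_s` for that call.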


jedberg 1833 days ago

Funny you should bring up Google. Google is so hard on infrastructure that most big sites have special handling for requests from Google's crawler. At reddit we put Google on its own slower server cluster just so it didn't break the website.

We only did this because of the extreme value Google brings via traffic. But most crawlers and other things that made speculative queries like that were just banned.
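
As a rough illustration only, the kind of special handling described above might be sketched as routing on the User-Agent header. This is not reddit's actual setup: the function `pick_cluster`, the cluster names, and the `BLOCKED_TOKENS` list are all hypothetical. The one grounded detail is that Google's crawler identifies itself with "Googlebot" in its User-Agent string.

```python
from typing import Tuple

# Hypothetical tokens for crawlers a site has decided to ban outright.
BLOCKED_TOKENS: Tuple[str, ...] = ("examplebot", "speculative-fetcher")


def pick_cluster(user_agent: str) -> str:
    """Return which (hypothetical) backend pool should serve this request."""
    ua = user_agent.lower()
    if "googlebot" in ua:
        return "crawler"   # isolated, slower pool so crawling can't hurt users
    if any(token in ua for token in BLOCKED_TOKENS):
        return "blocked"   # caller would answer with an error instead of serving
    return "main"          # ordinary user traffic
```

User-Agent strings can be spoofed, so a real deployment would likely verify the crawler by other means before giving it special treatment.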
