| HN Mirror


	by bjourne 2663 days ago
	Have you heard about the inverted index? It is the cornerstone of all databases and information retrieval systems. Your question is quite fuzzy, so it is hard to come up with a more precise answer.
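For anyone unfamiliar with the term, here is a minimal sketch of an inverted index in Python. It assumes whitespace tokenization and AND semantics for multi-word queries; the function names and the sample documents are made up for illustration and are not from any particular database.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    # .get avoids creating empty entries for unknown tokens
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)

# Hypothetical sample data
docs = {
    1: "large dynamic datasets",
    2: "dynamic queries on large tables",
    3: "small static files",
}
index = build_inverted_index(docs)
result = search(index, "large dynamic")
```

The point is that a lookup touches only the posting sets for the query tokens instead of scanning every document, which is why the structure holds up as the dataset grows.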

1 comments

extra_rice 2663 days ago

Sorry, I didn't know how to make the question a bit clearer. Basically, it's: how do you ensure that queries on very large, highly dynamic datasets return in an acceptable amount of time (especially if clients call/poll it at regular, short intervals)?


sethammons 2663 days ago

Indexes, caching (pass-through, LRU, etc.), read replicas for queries, sharding, pre-fetching, sampling, maybe look into columnar storage ... Hard to answer without knowing more specifics.
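To make the caching item concrete, here is a rough sketch of a pass-through cache with LRU eviction, assuming a single process and a caller-supplied loader function standing in for the slow query. The class name, `slow_query`, and the capacity are hypothetical. It does no invalidation, so for highly dynamic data you would still need a TTL or explicit eviction on writes.

```python
from collections import OrderedDict

class PassThroughLRU:
    """On a miss, call the loader and remember the result; evict the least recently used entry."""

    def __init__(self, loader, capacity=128):
        self.loader = loader          # stand-in for the expensive query
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key in self._entries:
            # Hit: mark the entry as most recently used
            self._entries.move_to_end(key)
            return self._entries[key]
        # Miss: pass through to the backing store and cache the result
        value = self.loader(key)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            # Drop the least recently used entry (front of the ordering)
            self._entries.popitem(last=False)
        return value

calls = []

def slow_query(key):
    """Hypothetical expensive lookup; records each call so misses are visible."""
    calls.append(key)
    return key.upper()

cache = PassThroughLRU(slow_query, capacity=2)
cache.get("a")   # miss
cache.get("b")   # miss
cache.get("a")   # hit, "a" becomes most recent
cache.get("c")   # miss, evicts "b"
cache.get("b")   # miss again, because "b" was evicted
```

Clients polling at short intervals mostly ask for the same keys, so most of those calls never reach the database.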

Something to always remember: if it is valuable, charge for it. If it is really valuable, you can throw all kinds of hardware at it. Give each customer their own dedicated instance, then rinse and repeat the strategies above.
