Have you heard about the inverted index? It is the cornerstone of most search engines and information retrieval systems, and many databases use it for full-text search. Your question is quite fuzzy, so it is hard to come up with a more precise answer.
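In case the term is new, here is a minimal sketch of an inverted index in Python. The names `build_index` and `search` and the sample documents are made up for illustration, and the tokenizer is just whitespace splitting; real systems add stemming, ranking, and on-disk posting lists.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token in the query (AND semantics)."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    # .get avoids creating empty entries in the defaultdict for unknown tokens.
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)


# Hypothetical sample data, only for illustration.
docs = {
    1: "fast queries on large datasets",
    2: "large dynamic datasets",
    3: "caching for fast reads",
}
index = build_index(docs)
```

The point is that a query touches only the posting sets for its own tokens instead of scanning every document, which is why lookups stay fast as the dataset grows.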
Sorry, I didn't know how to make the question a bit clearer. Basically, it's: how do you ensure that queries on very large, highly dynamic datasets return in an acceptable amount of time (especially if clients call/poll it at regular, short intervals)?
Indexes, caching (pass-through, LRU, etc.), read replicas for queries, sharding, pre-fetching, sampling, maybe look into columnar storage... Hard to answer without knowing more specifics.
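For the polling case specifically, caching is often the cheapest win. A rough sketch, assuming slightly stale results are acceptable for a few seconds: an LRU cache with a time-to-live in front of the expensive query. `TTLLRUCache`, `get_or_compute`, and the default sizes are hypothetical names and numbers, not from any particular library.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Small in-process cache: LRU eviction plus a per-entry time-to-live."""

    def __init__(self, max_entries=1024, ttl_seconds=5.0, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (value, expires_at)
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the cache can be tested without sleeping

    def get_or_compute(self, key, compute):
        """Return a fresh cached value, or call compute() and cache its result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            value, expires_at = entry
            if now < expires_at:
                self._entries.move_to_end(key)  # mark as most recently used
                return value
            del self._entries[key]  # expired, fall through and recompute
        value = compute()
        self._entries[key] = (value, now + self._ttl)
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return value
```

With something like this, a thousand clients polling the same query every second cause roughly one backend query per TTL window instead of a thousand per second. The trade-off is staleness, so the TTL has to match how fresh the data really needs to be.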
Something to always remember: if it is valuable, charge for it. If it is really valuable, you can throw all kinds of hardware at it. Give each customer their own dedicated instance, then rinse and repeat the strategies above.