Hacker News new | ask | show | jobs
by mdaniel 314 days ago
> Real internet-scale search systems like ES

Oh, then you must have the secret sauce that allows scaling ES vector search beyond 10,000 results without requiring infinite RAM. I know their forums would welcome it, because that question comes up a lot

Or I guess that's why you included the qualifier about money to invest

1 comments

Would you mind putting aside the snark? I have a couple questions. How large is the corpus? I am also curious about the use-case for top-k ANN, k > 10000?
Not the person you have asked but at work (we are a CRM platform) we allow our clients to arbitrarily query their userbase to find matching users for marketing campaigns (email, sms, whatsapp). These campaigns can some times target a few hundred thousand people. We are on a really ancient version of ES, but it sucks at this job in terms of throughput. Some experimenting with bigquery indicates it is so much better at mass exporting.
Fair; my question was mostly in the context of ANN, since that was the discussion point - I have to assume ES (as a search engine) would not necessarily be the right tool for data warehousing types of workloads.