by heipei 1579 days ago
If anyone wants to build a scalable ANN index with single-stage filtering (i.e. not with the builtin vector index, which does post-filtering), I suggest binarising and splitting your feature vectors, or using something like Product Quantization (PQ). Both approaches turn each vector into a list of discrete terms which can be indexed in Elasticsearch as keyword and then searched with a simple "terms" query.
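
A rough sketch of the binarise-and-split variant, in case it helps. Everything named here is an assumption for illustration: the zero threshold, the 4-bit chunk size, the "position_bits" token format, the field name vector_terms and the helper names are all made up, and the query is just a plain dict you would pass to whatever Elasticsearch client you use.

```python
from typing import Dict, List, Optional, Sequence


def binarise(vector: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Map each component to 1 if it exceeds the threshold, else 0."""
    return [1 if x > threshold else 0 for x in vector]


def split_into_terms(bits: Sequence[int], chunk_size: int = 4) -> List[str]:
    """Cut the bit list into chunks and emit one positional token per chunk.

    The chunk index is part of the token, so the same bit pattern at a
    different position produces a different term.
    """
    terms = []
    for start in range(0, len(bits), chunk_size):
        chunk = bits[start:start + chunk_size]
        pattern = "".join(str(b) for b in chunk)
        terms.append(f"{start // chunk_size}_{pattern}")
    return terms


def build_query(
    terms: List[str],
    field: str = "vector_terms",  # hypothetical keyword field name
    filters: Optional[List[Dict]] = None,
) -> Dict:
    """Build a bool query: a "terms" match on the tokens plus ordinary filters.

    Both clauses are evaluated in the same query, which is the
    single-stage filtering idea.
    """
    return {
        "query": {
            "bool": {
                "must": [{"terms": {field: terms}}],
                "filter": filters or [],
            }
        }
    }


vector = [0.3, -1.2, 0.0, 2.5, -0.1, 0.7, 0.9, -3.0]
terms = split_into_terms(binarise(vector))
query = build_query(terms, filters=[{"term": {"lang": "en"}}])
```

At index time you would store the same tokens in the keyword field of each document. Note that a plain "terms" query matches any document sharing at least one token and does not rank by how many tokens overlap, so you would probably still want to re-rank the candidates it returns.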

Big fan of what Pinecone is doing, but I have too much invested in Elasticsearch/Lucene at this point to consider anything else, and with Elasticsearch I get everything in one box, including things like n-gram accelerated wildcard searches.