
by fzliu 886 days ago

You need different indexing algorithms for different use cases - brute-force search, for example, is "SOTA" when it comes to recall (100%, since it compares the query against every vector), but it gets slow as the dataset grows. If you have multiple use cases or expect domain shift, you'll want a vector database that supports multiple index types.
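To make the recall point concrete, here's a minimal sketch of brute-force search in plain Python. It assumes squared Euclidean distance and random toy data; the function names (`brute_force_search`, `recall_at_k`) are mine for illustration, not from any library.

```python
import heapq
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(vectors, query, k):
    """Return the indices of the k nearest vectors by scanning all of them.

    The scan is exhaustive, so the result is exact (100% recall),
    at a cost of O(n * d) distance work per query.
    """
    return heapq.nsmallest(
        k, range(len(vectors)), key=lambda i: squared_l2(vectors[i], query)
    )


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that another result list recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Toy data (assumption): 1000 random 8-dimensional vectors.
random.seed(0)
data = [[random.random() for _ in range(8)] for _ in range(1000)]
query = [random.random() for _ in range(8)]

exact = brute_force_search(data, query, k=10)
```

The `exact` list is the ground truth you'd score any approximate index against with `recall_at_k`.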

Here's my 2¢:

- If you're just playing around with vector search locally and have a very small dataset, use brute-force search. Don't worry about indexes until later.

- If you have plenty of RAM and CPU cores and would like to squeeze out the most performance, use ScaNN or HNSW plus some form of quantization (product quantization, PQ, or scalar quantization, SQ).

- If you have limited RAM, use IVF plus PQ or SQ.

- If you want to maintain reasonable latency but aren't very concerned about throughput, use a disk-based index such as DiskANN or Starling. https://arxiv.org/pdf/2401.02116.pdf

- If you have a GPU, use GPU-specific indexes. CAGRA (supported in Milvus!) seems to be one of the best. https://arxiv.org/abs/2308.15136

All of these indexes are supported in Milvus (https://milvus.io/docs/index.md), so you can pick and choose the right one for your application. Tree-based indexes such as Annoy don't seem to have a sweet spot just yet, but I think there's room for improvement in this subvertical.
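The rules of thumb above can be sketched as a small lookup function. This is only an illustration: `suggest_index` and its parameters are hypothetical, and the order in which the checks fire is my assumption, since the list itself doesn't rank the cases.

```python
def suggest_index(small_dataset=False, has_gpu=False,
                  plenty_of_ram=True, throughput_matters=True):
    """Map the rules of thumb to an index family (hypothetical helper).

    The precedence of the checks below is an assumption; the original
    list is unordered.
    """
    if small_dataset:
        # Tiny local experiments: skip indexing entirely.
        return "brute-force"
    if has_gpu:
        return "GPU index (e.g. CAGRA)"
    if plenty_of_ram:
        return "ScaNN or HNSW + PQ/SQ"
    if not throughput_matters:
        # Limited RAM, latency still matters, throughput does not.
        return "disk-based (e.g. DiskANN, Starling)"
    # Limited RAM and throughput still matters.
    return "IVF + PQ/SQ"


choice = suggest_index(plenty_of_ram=False)
```

With the defaults above, `suggest_index(plenty_of_ram=False)` falls through to the IVF branch; flip `throughput_matters` to `False` and it picks a disk-based index instead.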