by gujun720 2349 days ago
Please check https://medium.com/@milvusio/managing-data-in-massive-scale-...

It explains how Milvus manages vectors.

1 comments

> "As each vector takes 2 KB space, the minimum storage space for 100 million vectors is about 200 GB"

Why aren't you quantizing the vectors when you insert them? Bolt [1] and Quicker-ADC [2] make 10-100x compression basically free for approximate search (and also get you roughly 10x faster querying within a partition).

[1] https://github.com/dblalock/bolt

[2] https://github.com/technicolor-research/faiss-quickeradc
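
For a rough sense of the numbers, here is a back-of-the-envelope sketch. It assumes each vector is 512 float32 components (which is one way to arrive at 2 KB per vector; the article quote does not state the dimension) and a hypothetical product-quantization setup of 64 sub-quantizers with 8-bit codes. Neither figure comes from Milvus, Bolt, or Quicker-ADC, the function names are made up for illustration, and codebook storage is ignored.

```python
# Back-of-the-envelope storage estimate; all parameters are assumptions.

def raw_bytes(num_vectors, dim, bytes_per_component=4):
    """Size of uncompressed vectors (float32 by default)."""
    return num_vectors * dim * bytes_per_component

def pq_code_bytes(num_vectors, num_subquantizers, bits_per_code=8):
    """Size of product-quantization codes only (codebooks not counted)."""
    return num_vectors * num_subquantizers * bits_per_code // 8

NUM_VECTORS = 100_000_000
DIM = 512               # assumed: 512 * 4 bytes = 2048 bytes = 2 KB per vector
NUM_SUBQUANTIZERS = 64  # hypothetical PQ configuration

raw = raw_bytes(NUM_VECTORS, DIM)
pq = pq_code_bytes(NUM_VECTORS, NUM_SUBQUANTIZERS)

print(f"raw vectors: {raw / 1e9:.1f} GB")
print(f"PQ codes:    {pq / 1e9:.1f} GB")
print(f"compression: {raw // pq}x")
```

Under these assumptions the raw vectors come to roughly the 200 GB quoted above, and the codes land inside the 10-100x range. The actual ratio depends entirely on the number of sub-quantizers and bits per code you pick.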

200 GB is the size of the original vectors. When creating an index, Milvus supports IVF SQ8 and IVF PQ ADC.

Based on our users' experience, SQ8 is the most balanced option at the moment. SQ8 provides 8x compression, higher accuracy and better performance.