by jamesbriggs 1016 days ago

I may be misunderstanding, but I'll try to answer. Quantization typically means retrieval will be slower (if you're referring to techniques like product quantization), but that is the case whether you're at 10K vectors or 1B vectors. As far as I know, index size doesn't really make a difference there, because you're only quantizing the query vector at query time (it has been a while since I read anything on quantization, so I could be mistaken).

Maybe your question is about the need for quantization at larger index sizes? In that case, yes, that would typically be true, because you want to either (1) minimize the index size by quantizing it, or (2) shrink the search space (so there is less to search through). Whether you want (1) or (2) determines the type of quantization to use (basically 1 == product quantization, and 2 == inverted index).
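To put rough numbers on (1) and (2), here is a back-of-the-envelope sketch. The parameter names (`m`, `nbits`, `nlist`, `nprobe`) follow the naming commonly used in libraries like Faiss, but the function names and the values are just illustrative assumptions, and the IVF estimate assumes vectors are spread evenly across the lists (real indexes are usually less balanced).

```python
def flat_bytes(n: int, d: int) -> int:
    """Size of an uncompressed index: n vectors of d float32 values (4 bytes each)."""
    return n * d * 4


def pq_bytes(n: int, d: int, m: int = 8, nbits: int = 8) -> int:
    """Approximate size of a product-quantized index.

    Each vector is split into m sub-vectors, and each sub-vector is stored
    as an nbits-wide code pointing into a codebook of 2**nbits centroids.
    """
    if d % m != 0:
        raise ValueError("d must be divisible by m")
    codes = n * m * nbits // 8                 # compressed vectors
    codebooks = m * (2 ** nbits) * (d // m) * 4  # float32 centroids, shared by all vectors
    return codes + codebooks


def ivf_scanned(n: int, nlist: int, nprobe: int) -> float:
    """Expected number of vectors compared per query with an inverted index,
    assuming the n vectors are evenly distributed over nlist lists and the
    query visits nprobe of them."""
    return n * nprobe / nlist


if __name__ == "__main__":
    n, d = 1_000_000, 768  # hypothetical: 1M vectors of dimension 768
    print("flat:", flat_bytes(n, d), "bytes")
    print("PQ:  ", pq_bytes(n, d, m=8, nbits=8), "bytes")
    print("IVF: ", ivf_scanned(n, nlist=1024, nprobe=8), "vectors scanned per query")
```

With these made-up settings, PQ cuts the index from about 3 GB to under 10 MB (at the cost of accuracy), while the inverted index leaves the storage alone and instead compares each query against under 1% of the vectors.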

So once you get to the 1M+ size, you need to consider quantization in some form, or you can go with graph-based retrieval if you don't mind using a lot of disk space.