by crucio 813 days ago
How well would this work for 762-dimensional vectors? At 3072, they're starting with so many dimensions that the accuracy loss may not be representative of what others would see.
3 comments

You'd have to look at the precision-recall curves for your data set and make the trade-off. There are studies on this topic.
Generally, people seem to start seeing more problems when binarizing vectors with fewer than 1,000 dimensions.
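If you want to check the trade-off yourself, here is a rough sketch of how one might compare binary (sign-bit) search against exact cosine search at different dimensionalities. Everything here is an assumption for illustration: the data is synthetic Gaussian noise rather than real embeddings, the helper names (`binarize`, `hamming_top_k`, `measure`) are made up, and recall@k against float search is just one possible metric. Swap in your own vectors before drawing any conclusions.

```python
import numpy as np

# Lookup table: number of set bits for every possible byte value.
POPCOUNT = np.unpackbits(np.arange(256, dtype=np.uint8)[:, None], axis=1).sum(axis=1).astype(np.uint8)

def binarize(embeddings):
    """Keep one bit per dimension (1 if the value is > 0) and pack 8 bits per byte."""
    return np.packbits(embeddings > 0, axis=1)

def hamming_top_k(query_bits, corpus_bits, k):
    """For each query, indices of the k corpus rows with the smallest Hamming distance."""
    top = []
    for q in query_bits:
        dist = POPCOUNT[np.bitwise_xor(corpus_bits, q)].sum(axis=1)
        top.append(np.argsort(dist, kind="stable")[:k])
    return np.array(top)

def cosine_top_k(queries, corpus, k):
    """For each query, indices of the k corpus rows with the highest cosine similarity."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return np.argsort(-(q @ c.T), axis=1)[:, :k]

def recall_at_k(exact, approx):
    """Fraction of the exact top-k results that the approximate search also returned."""
    hits = [len(set(e) & set(a)) for e, a in zip(exact, approx)]
    return sum(hits) / exact.size

def measure(dim, n_corpus=2000, n_queries=50, k=10, seed=0):
    """Recall@k of binary search vs. float search on synthetic data (assumption: Gaussian vectors)."""
    rng = np.random.default_rng(seed)
    corpus = rng.standard_normal((n_corpus, dim)).astype(np.float32)
    # Queries are noisy copies of corpus vectors, so each has at least one true neighbour.
    noise = rng.standard_normal((n_queries, dim)).astype(np.float32)
    queries = corpus[:n_queries] + 0.5 * noise
    exact = cosine_top_k(queries, corpus, k)
    approx = hamming_top_k(binarize(queries), binarize(corpus), k)
    return recall_at_k(exact, approx)

if __name__ == "__main__":
    for dim in (762, 1024, 3072):
        print(dim, measure(dim))
```

Random vectors have no real neighbourhood structure, so the numbers this prints say little about a trained embedding model. The point is only the shape of the experiment: same queries, same corpus, float ranking as ground truth, binary ranking as the candidate.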
Seems to work pretty well. Check this out: https://huggingface.co/blog/embedding-quantization