by cjbgkagh 1142 days ago
They’re embeddings so they’re dense. There are few things easier than dense vector similarity.
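For a sense of how little machinery dense similarity needs, here is a minimal brute-force sketch in plain Python. The vectors, the `cosine_similarity` and `top_k` helper names, and the choice of cosine as the metric are all illustrative assumptions, not anything from the thread.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length dense vectors.

    Assumes neither vector is all zeros (no zero-norm guard here).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, corpus, k=2):
    """Return indices of the k corpus vectors most similar to the query."""
    scores = [(cosine_similarity(query, vec), i) for i, vec in enumerate(corpus)]
    scores.sort(reverse=True)  # highest similarity first
    return [i for _, i in scores[:k]]

# Toy 3-dimensional "embeddings" (hypothetical data).
corpus = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]
nearest = top_k(query, corpus, k=2)
```

In practice this loop would likely be a single matrix multiply over normalized vectors, but the logic is the same.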

Embeddings for retrieval don't have to be dense. It is not unheard of to transform the raw embeddings to optimize them for retrieval, e.g., through binarization or hashing.
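As a rough illustration of binarization, here is one simple scheme: threshold each dimension at zero and compare the resulting bit codes by Hamming distance. The sign threshold, the `binarize` and `hamming` names, and the sample vectors are assumptions for the sketch; real systems may use learned or otherwise different transforms.

```python
def binarize(vec):
    """Pack a dense vector into an int: bit i is set when vec[i] > 0."""
    code = 0
    for i, x in enumerate(vec):
        if x > 0:
            code |= 1 << i
    return code

def hamming(a, b):
    """Number of bit positions where the two codes differ."""
    return bin(a ^ b).count("1")

# Hypothetical 4-dimensional embeddings.
a = binarize([0.3, -1.2, 0.8, -0.1])   # bits 0 and 2 are set
b = binarize([0.5, -0.7, -0.2, -0.4])  # only bit 0 is set
distance = hamming(a, b)               # the codes differ only in bit 2
```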
I was mostly drawing a distinction between embeddings and bag-of-words representations, which are very, very sparse matrices. The embedding dimensionality will not be anywhere near as high, so this level of sparsity is a minor inconvenience.
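To make the bag-of-words comparison concrete, here is a toy sketch. The ten-term vocabulary and the `bag_of_words` and `sparsity` helpers are made up for illustration; real vocabularies typically run to tens of thousands of terms, which pushes the zero fraction far higher than in this example.

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count vector with one slot per vocabulary term (whitespace tokenization)."""
    counts = Counter(text.lower().split())
    return [counts.get(term, 0) for term in vocabulary]

def sparsity(vec):
    """Fraction of entries that are exactly zero."""
    return sum(1 for x in vec if x == 0) / len(vec)

# Hypothetical tiny vocabulary.
vocabulary = ["dense", "vector", "similarity", "sparse", "matrix",
              "cpu", "gpu", "hash", "index", "query"]
bow = bag_of_words("dense vector similarity", vocabulary)
zero_fraction = sparsity(bow)  # 7 of the 10 slots are zero
```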

Edit: also CPUs for this, yikes…