by smeeth
1478 days ago
Yeah, totally, I get you. I'm not trying to do a takedown; ER is just hard. The point I was trying to make is that at scale one does not simply:

> compare these embeddings with all the other embeddings you have

You just can't: similarity metrics (especially cosine) on 768-dim arrays are prohibitively slow when run over every pair. Using embeddings is quite common in the literature and in deployment (I have, in fact, deployed ER that uses embeddings), but only as part of #2, the matching step. In many projects, full pairwise comparison would take on the order of years, so you have to do something else to narrow the comparison sets first.
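To put a rough number on "on the order of years", here is a back-of-envelope sketch. The record count and the comparison rate are assumptions picked purely for illustration, not figures from any real deployment; plug in your own.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def pair_count(n):
    """Number of unordered pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def years_to_compare(n, comparisons_per_second):
    """Wall-clock years needed to score every pair at a fixed rate."""
    return pair_count(n) / comparisons_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Both values below are hypothetical, for illustration only.
    records = 100_000_000   # assumed dataset size
    rate = 1_000_000        # assumed similarity computations per second
    print(f"{pair_count(records):,} pairs")
    print(f"{years_to_compare(records, rate):.1f} years at {rate:,} comparisons/s")
```

Under those assumed numbers the total comes out to well over a century. The pair count grows quadratically, so faster hardware only shifts the point at which this stops being feasible.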
|
Any reason you couldn't just dump it in FAISS or Annoy? No need to do pairwise comparisons. Annoy claims to handle even up to 1,000 dimensions reasonably. https://github.com/facebookresearch/faiss/issues/95
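For anyone unfamiliar with how an approximate index avoids the all-pairs scan, here is a minimal sketch of the general idea using random-hyperplane hashing. This is not the FAISS or Annoy API, and it is far cruder than what either library does internally; every function name here is made up for illustration. Vectors that hash to the same bucket become candidates for comparison, and everything else is skipped.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals with Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        sum(v * p for v, p in zip(vec, plane)) >= 0.0 for plane in planes
    )


def build_buckets(vectors, planes):
    """Group vector indices by their hash signature."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, planes)].append(idx)
    return buckets


def candidates(query, planes, buckets):
    """Return indices sharing the query's bucket; only these get compared."""
    return buckets.get(signature(query, planes), [])


if __name__ == "__main__":
    # Toy data: a vector, a scaled copy of it, and its exact opposite.
    vectors = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
    planes = make_hyperplanes(num_planes=8, dim=3)
    buckets = build_buckets(vectors, planes)
    print(candidates(vectors[0], planes, buckets))
```

Vectors pointing in the same direction always share a bucket, and vectors separated by a small angle usually do, which is why this works as a cheap filter for cosine similarity. A single hash table like this misses some true neighbours; real libraries use multiple tables or trees and tune that trade-off.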