by luke-stanley
773 days ago
Embeddings are often not very good for comparing the similarity of texts.
A cross-encoder might be a good alternative, perhaps as a second pass, since you already have the embeddings.
https://www.sbert.net/docs/pretrained_cross-encoders.html
Scoring every pair this way can be quite slow, but as a second pass over a shortlist it might give much higher quality. Obviously this gets into LLM territory, but the language models for this can be small, and more reliable than cosine similarity on embeddings.
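To make the two-pass idea concrete, here is a rough sketch, not a tested pipeline. It assumes you already have one embedding per document plus a query embedding. The function names (`retrieve_then_rerank`, `cross_encoder_scorer`), the shortlist size, and the model name `cross-encoder/ms-marco-MiniLM-L-6-v2` are my own illustrative choices. The only third-party call is `CrossEncoder(...).predict(pairs)` from sentence-transformers, which is imported lazily so the rest runs without it.

```python
import math
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]
# A pairwise scorer takes (query, document) pairs and returns one score per pair.
PairScorer = Callable[[Sequence[Pair]], Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_then_rerank(
    query: str,
    query_vec: Sequence[float],
    docs: Sequence[str],
    doc_vecs: Sequence[Sequence[float]],
    score_pairs: PairScorer,
    first_pass_k: int = 20,  # assumed shortlist size, tune for your data
) -> List[Tuple[str, float]]:
    # First pass: cheap cosine ranking over the embeddings you already have.
    ranked = sorted(
        range(len(docs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    shortlist = ranked[:first_pass_k]

    # Second pass: the slow pairwise scorer only sees the shortlist.
    scores = score_pairs([(query, docs[i]) for i in shortlist])
    reranked = sorted(zip(shortlist, scores), key=lambda t: t[1], reverse=True)
    return [(docs[i], float(s)) for i, s in reranked]


def cross_encoder_scorer(
    model_name: str = "cross-encoder/ms-marco-MiniLM-L-6-v2",  # assumed model choice
) -> PairScorer:
    """Build a scorer backed by a sentence-transformers cross-encoder."""
    from sentence_transformers import CrossEncoder  # lazy: optional dependency

    model = CrossEncoder(model_name)
    return lambda pairs: model.predict(list(pairs))
```

Usage would be something like `retrieve_then_rerank(query, query_vec, docs, doc_vecs, cross_encoder_scorer())`. The cross-encoder then runs `first_pass_k` times per query instead of once per document, which is what keeps the slow pairwise step affordable.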