by estreeper
960 days ago
For embeddings specifically, there are multiple open source models that outperform OpenAI's best model (text-embedding-ada-002), as you can see on the MTEB Leaderboard [1].

> embedding-based approach will be cheaper and faster, but worse result than full text

I'm not sure results would be worse. I think it depends on the extent to which the models are able to ignore irrelevant context, which is a problem [2]. Using retrieval can come closer to providing only relevant context.

1. https://huggingface.co/spaces/mteb/leaderboard
2. https://arxiv.org/abs/2302.00093
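To make the "only relevant context" point concrete, here is a rough sketch of embedding-based retrieval: rank chunks by cosine similarity to the query and pass only the top few to the model. The 3-dimensional vectors and chunk texts below are made up for illustration and do not come from any real embedding model; `cosine` and `top_k` are hypothetical helper names.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k=2):
    """Return the texts of the k chunks most similar to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Assumption: toy vectors standing in for real model embeddings.
chunks = [
    ("Refunds are accepted within 30 days.", [0.9, 0.1, 0.0]),
    ("The office dog is named Biscuit.", [0.0, 0.2, 0.9]),
    ("Refunds go to the original payment method.", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # stand-in embedding for "How do refunds work?"

# Only the selected chunks would be placed in the prompt.
context = top_k(query_vec, chunks, k=2)
```

With a long-context model you could instead paste all three chunks in; the retrieval step just keeps the unrelated one out of the prompt.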
|
The point isn't about the leaderboard. With increasing context length, the question is whether we need embeddings at all. With longer context, embeddings are no longer a necessity, which lowers their value.