by minimaxir
974 days ago
The 768D embeddings, compared to OpenAI's 1536D embeddings, are actually a feature beyond just the smaller index size. In my experience, OpenAI's embeddings are overspecified and do very poorly with cosine similarity out of the box, since they match syntax more than semantic meaning (which matters because cosine similarity is the metric RAG retrieval relies on). Ideally you'd want cosine similarities spread across the full [-1, 1] range on a variety of data, but in my experience the results cluster in [0.6, 0.8].
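
If you want to check the spread on your own data, here is a minimal sketch of the measurement: compute cosine similarity for every pair of embeddings and look at the min and max. The `similarity_range` helper and the toy 2D vectors are hypothetical stand-ins, not output from any real embedding model; you'd swap in vectors from whichever model you're evaluating.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_range(embeddings):
    """Return (min, max) cosine similarity over all distinct pairs."""
    sims = [cosine_similarity(a, b) for a, b in combinations(embeddings, 2)]
    return min(sims), max(sims)


# Hypothetical toy vectors (assumption: illustrative only, not model output).
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]      # point in very different directions
clustered = [[1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]    # all point roughly the same way

print("spread:   ", similarity_range(spread))
print("clustered:", similarity_range(clustered))
```

A narrow band like the clustered case is what makes it hard to pick a threshold that separates relevant from irrelevant results, which is the complaint above.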