by SparkyMcUnicorn
425 days ago
I had the same thought, although Voyage is 32k context vs 128k for Cohere 4. Anecdotally, benchmark scores have correlated with result quality on the data I've dealt with. I haven't spent much time comparing results between models, because we were happy with the results after trying a few and tuning some settings. Unless my dataset lines up really well with a benchmark's dataset, creating my own benchmark is probably the only way to know which model is "best".
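For what it's worth, a homegrown benchmark doesn't have to be elaborate. Here is a minimal sketch, assuming you already have query and document vectors from whichever model you're testing, plus a hand-labeled relevant document per query. The names `recall_at_k`, `docs`, `queries`, and `relevant` are made up for illustration, and the toy vectors stand in for real model output.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_at_k(query_vecs, doc_vecs, relevant, k=5):
    """Fraction of queries whose labeled document lands in the top k results.

    relevant[i] is the index (into doc_vecs) of the document a human judged
    to be the right answer for query i. This labeling is an assumption of the
    sketch; real datasets may have several relevant documents per query.
    """
    if not query_vecs:
        return 0.0
    hits = 0
    for qi, q in enumerate(query_vecs):
        # Rank all documents by similarity to this query, best first.
        ranked = sorted(
            range(len(doc_vecs)),
            key=lambda di: cosine(q, doc_vecs[di]),
            reverse=True,
        )
        if relevant[qi] in ranked[:k]:
            hits += 1
    return hits / len(query_vecs)


# Toy 2-d vectors standing in for embeddings from a real model.
docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
queries = [[0.9, 0.1], [0.1, 0.9]]
relevant = [0, 1]

score = recall_at_k(queries, docs, relevant, k=1)
print(f"recall@1 = {score:.2f}")
```

Run the same labeled queries through each candidate model and compare the scores; the labels are what make it specific to your data.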
It feels like embedding content that large -- especially in dense texts -- will lead to loss of fidelity/signal in the output vector.