by infecto
872 days ago
I am excited to see how the vector search space plays out. Most of my work is not constrained by a low-latency, chat-type user experience, and I have not touched most of the vector search APIs. I wonder what the differences are between competitors. The way I picture it, everyone is starting up their own Elasticsearch-style hosted solution, and while there are some differences in functionality, the real bet is on cost and scale.
|
Re embeddings, you would likely get better results if you train your own embedding model. A popular approach is ColBERT, which anecdotally outperforms vector search in edge cases [1]. A second is training an embedding model using the initial layers of an LLM [2]. In ColBERT's case, once it's trained, you don't need a db to store the vectors.
[1]: https://twitter.com/arjunkmrm/status/1744741903646773674
[2]: https://huggingface.co/intfloat/e5-mistral-7b-instruct
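
For anyone who hasn't looked at how ColBERT scores things: it does not compare one vector per document. It keeps one vector per token and sums, for each query token, the best match among the document tokens ("late interaction" or MaxSim). Below is a rough pure-Python sketch of just that scoring step. The function names and the toy 2-d embeddings are made up for illustration; a real setup would get the token vectors from a trained ColBERT model.

```python
import math

def normalize(vec):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def maxsim_score(query_tokens, doc_tokens):
    # Late-interaction score: for each query token, take the highest cosine
    # similarity to any document token, then sum over the query tokens.
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )

# Toy token embeddings (hypothetical values, not from any model).
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]],  # has a close match for each query token
    "doc_b": [[1.0, 1.0]],                          # one "averaged" token only
}

# Rank documents by MaxSim score, best first.
ranked = sorted(docs, key=lambda name: maxsim_score(query, docs[name]), reverse=True)
print(ranked)
```

The point of the toy data is that doc_b, whose single token sits halfway between the two query tokens, scores lower than doc_a, which has a dedicated match for each one. That per-token matching is roughly where the anecdotal gains over single-vector search are supposed to come from.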