by janalsncm
752 days ago
It’s not just that. Embeddings aren’t magic. If you’re going to create embeddings for similarity search, the first thing to ask yourself is what makes two documents similar enough that their embeddings should be close together. There are a lot of related notions of similarity, but they’re slightly different. And I have no idea what Cohere is doing. Additionally, it’s not clear to me how queries can and should be embedded. Queries are typically much shorter than their associated documents, so the query and document embeddings typically need to be trained jointly. Selling “embeddings as a service” is a bit like selling hashing as a service. There are a lot of different hash functions: cryptographic hashes, locality-sensitive hashes, checksum hashes, etc.
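To make the hashing analogy concrete, here is a minimal sketch (my own toy illustration, not anything Cohere or any library does) contrasting a cryptographic hash with a simple SimHash-style locality-sensitive hash. The function names `sha256_hex`, `simhash`, and `hamming` are made up for this example, and the bag-of-words tokenization is an assumption chosen for brevity.

```python
import hashlib

BITS = 64  # fingerprint width for the toy SimHash below


def sha256_hex(text: str) -> str:
    """Cryptographic hash: any change to the input scrambles the output."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def simhash(text: str) -> int:
    """Toy SimHash over whitespace tokens (bag of words, order ignored)."""
    counts = [0] * BITS
    for token in text.lower().split():
        # Stable 64-bit hash per token (the built-in hash() is salted per process).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    # Each fingerprint bit is the majority vote of that bit across tokens.
    return sum(1 << i for i in range(BITS) if counts[i] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


a = "embeddings are not magic"
b = "magic embeddings are not"  # same words, different order

print(sha256_hex(a) == sha256_hex(b))    # the two SHA-256 digests differ
print(hamming(simhash(a), simhash(b)))   # the toy SimHash sees the same bag of words
```

Both are "hashes", but they preserve different things: SHA-256 treats the reordered sentence as unrelated, while this SimHash variant ignores word order entirely. Which behavior you want depends on what "similar" is supposed to mean, and the same question applies to an embedding model.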
|
Are there other semantic search systems? What happened to the entire field of Information Retrieval? Is vector search the only method? Are stemming, linguistic analysis, and all the rest obsoleted by vectors?
Or is it purely because vector search is quick? That's just an engineering problem. I'm not convinced it's the only method here. Happy to be corrected!