by miki123211
798 days ago
A vector DB is the complete opposite of what you describe: it maps list<double> to pair<file, string>. The queries it's good at are not "what vectors map to this filename", but "what pieces of text are closest to this vector, and what metadata do we have about them?" This is a non-trivial problem to solve if you don't want your queries to be O(n), where n is the dataset size. This is useful because AI models can transform any kind of content (usually text or images) into vectors in such a way that content similar in meaning ends up as vectors that are close to each other. This can be used, e.g., to find all documents related to your search query even if your search keywords are never directly mentioned, to find articles similar to the one you're currently reading, to search images by their descriptions, or even to see how closely a user submission matches "undesirable" content, like spam or porn. I agree that specialized vector databases are a little silly though, considering that Postgres and others have vector extensions now.
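To make the query shape concrete, here is a rough sketch of the naive O(n) version: scan every stored vector and rank by cosine similarity. The 3-dimensional vectors, file names, and the `nearest` helper are all made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions, and a real vector DB replaces the full scan with an approximate index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(index, query, k=2):
    """Return the k entries most similar to `query`.

    `index` is a list of (vector, (filename, text)) pairs.
    This is the brute-force O(n) scan that a vector DB tries to avoid.
    """
    scored = [(cosine_similarity(vec, query), meta) for vec, meta in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical, hand-written "embeddings" (not produced by any model).
index = [
    ([0.9, 0.1, 0.0], ("cats.txt", "Cats sleep most of the day.")),
    ([0.8, 0.2, 0.1], ("kittens.txt", "Kittens need frequent feeding.")),
    ([0.0, 0.1, 0.9], ("tax.txt", "File your tax return by April.")),
]

# A query vector that points in roughly the same direction as the cat texts.
results = nearest(index, [1.0, 0.0, 0.0], k=2)
for score, (filename, text) in results:
    print(f"{score:.3f}  {filename}: {text}")
```

The lookup goes from vector to (file, text), never the other way around, which is the point of the first sentence above.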
There is a new type of vector database that combines the best of both worlds: MyScale, the SQL vector database. You can refer to the following blog posts for the comparison. Our benchmark evaluation found that MyScale exceeds the other products tested in filtered vector search accuracy, performance, cost-efficiency, and index build time by a long way. Importantly, MyScale is the only product tested that delivers healthy search accuracy and QPS across various filter ratios.
https://myscale.com/blog/myscale-outperform-specialized-vect... https://myscale.com/blog/myscale-vs-postgres-opensearch/