by refulgentis 797 days ago
I thought the author was bringing in genAI for little reason...then I clicked through. For some reason storing vectors and chats is core to Redis' vision of its future. (https://redis.io/blog/the-future-of-redis/)

I'm generally a mobile dev, so I'm not familiar with Redis. My hand-wavy understanding is that it's an in-memory key/value DB.

I don't understand how that brings anything to the table for genAI. Couldn't the pitch read the same if you were MongoDB, Postgres, whoever?

Also, my goodness, my eternal enemy: the idea that a vector DB is something different from keeping a store of file -> pair<string, list<double>>.

The odds you need a vector DB unless you're doing insanely high scale stuff with AI are very low. If you're doing consumer stuff, please use ONNX and keep the pair, and thus the file, local and private.
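
For illustration, here is a minimal sketch of the "keep the pair local" idea, assuming the embeddings were already computed on the device. The JSON layout and the names `save_store` and `load_store` are hypothetical, not taken from any library.

```python
import json
from pathlib import Path

# Hypothetical layout: source filename -> (chunk text, embedding vector),
# i.e. file -> pair<string, list<double>>.
Store = dict[str, tuple[str, list[float]]]


def save_store(store: Store, path: Path) -> None:
    """Write the store to a local JSON file; nothing leaves the device."""
    serializable = {
        name: {"text": text, "vector": vector}
        for name, (text, vector) in store.items()
    }
    path.write_text(json.dumps(serializable), encoding="utf-8")


def load_store(path: Path) -> Store:
    """Read the store back into the same in-memory shape."""
    raw = json.loads(path.read_text(encoding="utf-8"))
    return {name: (entry["text"], entry["vector"]) for name, entry in raw.items()}
```

At consumer scale the whole store fits in memory, so a single local file like this may be all the "database" that is needed.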

3 comments

Redis will be used for genAI the way it's always used: to answer queries faster. Users are not interested in waiting; answers need to be immediate. Plus, reducing load on whatever you've got behind Redis is a nice bonus.
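
A rough sketch of that cache-in-front pattern, under loud assumptions: a plain dict stands in for Redis, and `AnswerCache` and its `backend` callable are hypothetical names for illustration only. It only helps when the exact same prompt is asked again.

```python
import hashlib
from typing import Callable


class AnswerCache:
    """Cache-aside wrapper; a plain dict stands in for Redis here."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self._backend = backend            # the slow thing behind the cache
        self._store: dict[str, str] = {}   # assumption: Redis in a real setup
        self.backend_calls = 0

    def ask(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._store:             # cache hit: answer immediately
            return self._store[key]
        self.backend_calls += 1            # cache miss: pay the full cost once
        answer = self._backend(prompt)
        self._store[key] = answer
        return answer
```

Repeated prompts are served from the dict, so the backend is only called once per distinct prompt.
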
I did evaluate a few vector databases for our RAG PoCs, with quite a significant amount of metadata for permission handling on both the vector and the query, and execution time was in the area of milliseconds as far as I remember. The RAG performance hit pales in comparison to the compute time larger LLMs need, so I am not sure you are on the right track here.
Naively, I don't understand how Redis would be involved at all. E.g., in the simplest system setup, we're running an O(seconds) network request that relays output from a GPU to a client. Again, I'm a naive, mostly mobile dev, but I'd presume the same machine running inference would stream the response JSON.
A vector DB is the complete opposite of what you describe: it maps list<double> to pair<file, string>.

The queries it's good at are not "what vectors map to this filename", but "what pieces of text are closest to this vector, and what metadata do we have about them?" This is a non-trivial problem to solve if you don't want your queries to be O(n) where n is the dataset size.
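
To make the O(n) point concrete, here is a minimal brute-force sketch of that query in pure Python. The function names and the `(text, metadata, vector)` entry shape are assumptions for illustration; a real vector database replaces the full scan with an approximate index.

```python
import heapq
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_chunks(query, entries, k=3):
    """Return the k entries closest to `query` as (score, text, metadata).

    entries: list of (text, metadata, vector) tuples (hypothetical shape).
    Every query computes a similarity against all n stored vectors.
    """
    scored = [
        (cosine_similarity(query, vector), text, metadata)
        for text, metadata, vector in entries
    ]
    return heapq.nlargest(k, scored, key=lambda item: item[0])
```

This is fine for thousands of vectors; the linear scan per query is what stops scaling as n grows.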

This is useful because AI models can transform any kind of content (usually text or images) into vectors, in such a way that content similar in meaning is transformed into vectors that are close to each other. This can be used, e.g., to find all documents related to your search query even if your search keywords are never directly mentioned, to find articles similar to the one you're currently reading, to search images by their descriptions, or even to see how closely a user submission matches "undesirable" content, like spam or porn.

I agree that specialized vector databases are a little silly though, considering that Postgres and others have vector extensions now.

Specialized vector databases perform well on pure vector workloads but poorly on SQL compatibility and integration with existing systems. Traditional databases with a vector algorithm or plug-in, such as ES, PG, and Redis, do provide vector search and are easy to deploy in a production environment, but they quickly hit performance bottlenecks once the data scale gets relatively large.

There is a new type of vector database that combines the best of both worlds: MyScale, the SQL vector database. You can refer to the following blogs for the comparison. Our comprehensive benchmark evaluation shows that MyScale beats other products by a long way in filtered vector search accuracy, performance, cost-efficiency, and index build time. Importantly, MyScale is the only product tested that delivers healthy search accuracy and QPS across various filter ratios.

https://myscale.com/blog/myscale-outperform-specialized-vect... https://myscale.com/blog/myscale-vs-postgres-opensearch/

I know vector DBs x embeddings, so I'm afraid I'm just awful at communicating: to wit, and much to my consternation, I have to write and maintain code for both image and text embeddings, on 6 platforms.

I think we're getting to the heart of my confusion, and I only assume it's because of different use cases/expectations on privacy.

Let's say I'm CEO of Mousetrap Inc., and I've got this .txt file, our top secret plan for a better mousetrap.

I want genAI to pick out the parts about the new metal alloy.

I upload the file to B2BAI LLC, who turns it into List<String>, then we give it to the model and get back List<List<Float>>.

Vector DBs store the List<String> and the List<List<Float>> for retrieval.

I, the top secret mouse-trap inventor, do not want my plan stored on any 3rd party computer.

But this app I use puts it in an a16z-approved Vector DB™.

The vector DB provider now has the embeddings (List<List<Float>>) and the chunks (List<String>), which violates my desire to not have my top secret mousetrap plan stored at rest anywhere.

This is silly.

Big companies who are extremely protective of their secrets use the cloud. Even the US government isn't afraid to store classified information in AWS, and they're not joking around with secrecy.

Unless you're acting specifically against American interests, I can't imagine a situation in which a cloud company would actually steal your secrets.

If anything, I'd be afraid of a vector DB vendor getting hacked, but I don't think that most non-tech companies who want to use vector embeddings for their documents can provide better security themselves.

You're right, my threat model is the vector DB provider getting hacked, same as yours.

It's not silly, because it takes one SWE-week, max, from start to finish, to just do it in memory locally. You don't need the Vector DB™.

Vector databases aren't for key-value retrieval, they're for similarity search. What's that got to do with ONNX?
ONNX runs the ML model that is f(string) -> vector. The similarity search is done using those vectors, and needs to return the original strings.
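
A sketch of that split, with a big caveat: `embed` below is a hypothetical stand-in for the ONNX model (a hashed bag of words that captures no meaning), used only so the example runs without a model file. `DIM` and `search` are likewise made up for illustration.

```python
import hashlib
import math

DIM = 64  # assumption: real embedding models use hundreds of dimensions


def embed(text: str) -> list[float]:
    """Stand-in for the model: f(string) -> unit-length vector."""
    vector = [0.0] * DIM
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else vector


def search(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Embed the query, rank chunks by dot product, return the original strings."""
    indexed = [(chunk, embed(chunk)) for chunk in chunks]  # precomputed in practice
    q = embed(query)
    ranked = sorted(
        indexed,
        key=lambda pair: sum(a * b for a, b in zip(q, pair[1])),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]
```

Swapping `embed` for a real on-device model would leave `search` unchanged: the model produces the vectors, and the search only ever compares them and hands back the stored strings.
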
ONNX allows arbitrary bundling and execution of ML models, so maybe it's something to do with the "run it local and private"?
Vector databases don't contain ML models. There is nothing that is learned. Here is a typical algorithm: https://www.pinecone.io/learn/series/faiss/hnsw/

It is all about performance: latency and recall.

Presumably the output of an ML model, the titular vector, and the chunk of text that created that vector are stored in a vector DB?

(That probably read as aggressive; ignore my tone. At length: I run the model locally and store the vector locally, but I'm doing consumer use cases so I have different tradeoffs, so I'm glad to have someone who uses them interlocuting.)