by shri_krishna
989 days ago
The one-DB-fits-all approach only works when the database is really small and never grows. Imagine you have 100 customers. Each customer generates, on average, a million 1536-dimension vector embeddings (OpenAI Ada dimensions, the most popular right now). That is about 6GB (1536 x 4 bytes per dimension for f32 x 1,000,000) of just embeddings PER CUSTOMER. If you use HNSW, it will take at least that much RAM, if not more. If you use PQ (and variants), you can reduce the size of the index in RAM to, say, 512MB-1GB per customer. That is still a lot of memory. That is just the way it is and there is no way around it.

Now imagine you are using that same database for transactions and other day-to-day business ops, which will still store millions of records but with small indexes. Ideally this would have required only a single DB instance with a replica for redundancy. Once you add vectors to the equation, you have to needlessly scale this DB both horizontally and vertically just to maintain decent query/write performance (which would have been extremely fast without embeddings in the mix). You will eventually separate the embeddings out, as it makes no sense to scale the entire DB just for the sake of scaling your embeddings. I am not even accounting for index generation for these vectors, which will require nearly 100% of all CPU cores while the index is being built (depending on the type of ANN you are using) and which in turn would slow your DB to a crawl.
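To make the memory arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and the 512 bytes per PQ code is an assumed value picked only to land on the lower end of the 512MB-1GB range mentioned above; real PQ settings and index overhead (HNSW graph links, codebooks) will differ.

```python
# Back-of-the-envelope memory estimate for the scenario in the comment.
# All names here are hypothetical helpers, not part of any library.

GB = 10**9  # decimal gigabytes, matching the "6GB" figure in the comment


def raw_embedding_bytes(num_vectors: int, dims: int, bytes_per_dim: int = 4) -> int:
    """Bytes needed to hold the raw f32 vectors, with no index overhead."""
    return num_vectors * dims * bytes_per_dim


def pq_code_bytes(num_vectors: int, code_bytes_per_vector: int) -> int:
    """Bytes needed for compressed PQ codes only (codebooks not included)."""
    return num_vectors * code_bytes_per_vector


# Numbers taken from the comment: 100 customers, 1M vectors each, 1536 dims.
per_customer = raw_embedding_bytes(1_000_000, 1536)
all_customers = per_customer * 100

# Assumption: 512 bytes per PQ code, chosen to match the comment's lower bound.
pq_per_customer = pq_code_bytes(1_000_000, 512)

print(f"raw per customer:   {per_customer / GB:.3f} GB")
print(f"raw, 100 customers: {all_customers / GB:.1f} GB")
print(f"PQ per customer:    {pq_per_customer / GB:.3f} GB")
```

Even with the compressed codes, 100 customers still add up to roughly 51 GB of RAM under these assumptions, before counting anything the transactional workload needs.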
Someone makes this comparison in another comment, but it’s analogous to OLTP vs. OLAP.