Hacker News
by mindvirus 883 days ago
Congrats to them!

What have your experiences with vector databases been? I've been using https://weaviate.io/ which works great, but just for little tech demos, so I'm not really sure how to compare one versus another or even what to look for really.

4 comments

We're using Postgres with the pgvector extension for basically all of our projects. We know and love Postgres, and it has a long track record. The extension is supported on all major managed cloud offerings, needs no new tooling, and supports HNSW indexes for performance as well.

Once in a while Supabase slips into a project, but that's basically just Postgres with some bells and whistles on top.

I've got nothing bad to say about Pinecone, Weaviate, Chroma etc., but when it comes to DBs, I like to go with the devil I know.

You should use whatever works best for you unless you run into limitations. The issue is that vector databases are not databases but search engines: it is ACID vs BASE. A few thoughts on this: https://qdrant.tech/articles/dedicated-service/
There are multiple vector databases available on the market. Open-source ones include Milvus, Qdrant, Weaviate etc.; cloud services include Zilliz Cloud (managed Milvus), Qdrant Cloud, Weaviate Cloud etc. Try using a benchmark tool to evaluate them. Here is an open-source option for your reference: VectorDBBench (https://github.com/zilliztech/VectorDBBench)
Why would anyone use a "benchmark" tool from a vendor (Zilliz here) to test the performance of its competitors?
Good question. 1. VectorDBBench is an open-source benchmark that is pretty vendor-neutral. If you've used it or checked its leaderboard, you'll find that almost every database ranks first or near the top on some specific metric.

2. The key to using a benchmark is not to find which vector database is the "best" but which one is most suitable for your use cases and applications. Actually, there is no "universally best" vector database for any situation. So I don't think anyone should be able to cheat on this.
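
One way to make "most suitable for your use case" concrete is to measure recall@k on your own data, comparing what an engine returns against an exact brute-force search. The following is only a rough sketch under stated assumptions: the data is random, `sampled_search` is a hypothetical stand-in for whatever approximate index you are evaluating, and none of the names come from VectorDBBench or any vendor's API.

```python
import math
import random


def exact_top_k(vectors, query, k):
    """Ground truth: rank every vector by Euclidean distance to the query."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k neighbours that the engine returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def sampled_search(vectors, query, k, keep=0.5, seed=0):
    """Hypothetical stand-in for an approximate index: scans a random subset only."""
    rng = random.Random(seed)
    subset = rng.sample(range(len(vectors)), int(len(vectors) * keep))
    subset.sort(key=lambda i: math.dist(vectors[i], query))
    return subset[:k]


# Assumption: random 8-dimensional vectors stand in for real embeddings.
rng = random.Random(42)
data = [[rng.random() for _ in range(8)] for _ in range(200)]
query = [rng.random() for _ in range(8)]

truth = exact_top_k(data, query, k=10)
approx = sampled_search(data, query, k=10)
recall = recall_at_k(approx, truth)
print(f"recall@10 = {recall:.2f}")
```

In a real comparison you would swap `sampled_search` for calls to each candidate database, use your own embeddings and queries, and track latency and memory alongside recall.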

> VectorDBBench is an open-source benchmark that is pretty vendor-neutral

No, this is not vendor-neutral unless you can prove that Zilliz included all tests that would expose its weaknesses.

> If you've used it or checked its leaderboard

See, you called it a "leaderboard"; that is good proof that it is not vendor-neutral. With such a highly biased tool, whose hosting URL has both the name of your company and the term "leader", it implies that the company is a leader compared to other "vector" databases. Tell me, how is such a dirty trick vendor-neutral?

Please, HN is a place full of professional developers; many of them, myself included, have been in the business long enough to tell what is cheap propaganda and what is real cool tech.

> Actually, there is no "universally best" vector database for any situation.

Because whether there should be a product category called "vector database" is in doubt in the first place. As explained, it is low-tech stuff, significantly easier to design and implement than today's regular databases (e.g. CockroachDB). Its role will eventually be filled by other, real databases.

It is open-sourced, and you can find its source code on GitHub. The leaderboard is just something easy for readers to have a quick look at. You're welcome to use this tool to generate your results.

I won't argue with you about whether "vector database" should be a product category. You are not the first or even the last person who has doubts about it. Maybe many years later, we'll have the answer.

Could you please stop posting such advertisements? Why would anyone read vector database benchmark results generated by a vector database vendor?
trying to answer your questions since you asked.
No, you were actively trying to promote a particular vendor, for well-known reasons. HN is not your free advertising platform; could you please just pack up your ads and leave people alone?
I think one of the big advantages of Qdrant is how easy it is to do a PoC, because it allows you to have an “in-memory” version of the database, similar to SQLite. One of the big competitors, Milvus, comes with a fairly intricate docker-compose setup you have to spin up to try it.
Well, SQLite now has a vector extension, so it's super convenient for testing. Between that and FTS5, SQLite can stand in for any advanced search service as far as PoCs are concerned.
Only for Python!
I mean, you’re storing vector embeddings; the chances that those come from some torch model are reasonably high.
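
To illustrate the "SQLite as a PoC stand-in" idea from this subthread, here is a minimal sketch that needs only Python's standard library. It deliberately does not load any vector extension: embeddings are stored as float32 BLOBs and ranked by a brute-force cosine scan in Python, which is only reasonable at toy scale. The table name `docs`, the helper functions, and the 3-dimensional vectors are all made up for illustration.

```python
import math
import sqlite3
import struct


def pack(vec):
    """Serialize a list of floats into a compact float32 BLOB."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack(blob):
    """Inverse of pack(): a BLOB of float32 values back to a tuple."""
    return struct.unpack(f"{len(blob) // 4}f", blob)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(conn, query, k=2):
    """Brute-force scan: score every row, return the top-k (id, text, score)."""
    rows = conn.execute("SELECT id, text, embedding FROM docs").fetchall()
    scored = [(i, t, cosine(query, unpack(e))) for i, t, e in rows]
    return sorted(scored, key=lambda r: r[2], reverse=True)[:k]


# In-memory database, so nothing has to be installed or spun up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)")

# Assumption: hand-written toy vectors stand in for real model embeddings.
toy = [
    ("postgres with pgvector", [1.0, 0.0, 0.0]),
    ("qdrant in-memory mode", [0.0, 1.0, 0.0]),
    ("sqlite for prototypes", [0.9, 0.1, 0.0]),
]
conn.executemany(
    "INSERT INTO docs (text, embedding) VALUES (?, ?)",
    [(t, pack(v)) for t, v in toy],
)

results = search(conn, [1.0, 0.0, 0.0], k=2)
for doc_id, text, score in results:
    print(doc_id, text, round(score, 3))
```

Once the PoC outgrows a full scan, the same table layout can be moved to an indexed setup such as a SQLite vector extension, pgvector, or a dedicated engine.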
I've been using Qdrant. Can't speak highly enough of the core functionality. It's fast, good accuracy, easy to use etc.

I think there are some things I wish were easier, for example finding and updating points, and the UI could be better.