Hacker News
by diarrhea 927 days ago
Just the other day I played with qdrant, using its Python client. Pretty smooth onboarding experience.

I came across two questions. Perhaps some kind folks with more experience can shed some light on these qdrant use cases.

1. For embeddings in use cases such as LLM chatbots, I split internal data into chunks, which are then vectorized and stored. Alongside each vector, I stored the original chunk in the metadata. That way, a lookup can feed the chunk text straight into the LLM prompt context, without a second lookup by ID in a separate data store. It feels like a hack. Is that a sensible use case?

2. I resorted to using `fastembed` and generated all embeddings client-side. Why is it that qdrant queries, in the ordinary case (also showcased a lot in their docs, e.g. [0]), expect a ready-made vector? I thought the point of vector DBs was to vectorize input data, store it, and later vectorize any text queries themselves?

Having to do all that client-side feels beside the point. For example, what if two separate clients use separate models (I used [1])? Their vectorizations will differ. I thought the DB was the source of truth here.
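To make question 1 concrete, here is a minimal sketch of the "store the chunk next to its vector" pattern. It deliberately does not use the qdrant client: `ToyVectorStore`, `Point`, and `build_context` are hypothetical names, the store is a brute-force in-memory stand-in for a real collection, and the three-dimensional vectors are made up by hand rather than produced by an embedding model.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Point:
    """One stored entry: an ID, its embedding, and arbitrary metadata."""
    id: int
    vector: list[float]
    payload: dict = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector DB collection."""

    def __init__(self) -> None:
        self.points: list[Point] = []

    def upsert(self, point: Point) -> None:
        # Replace any existing point with the same ID, then append the new one.
        self.points = [p for p in self.points if p.id != point.id]
        self.points.append(point)

    def search(self, query_vector: list[float], limit: int = 3) -> list[Point]:
        # Brute-force scan; a real vector DB would use an approximate index.
        ranked = sorted(
            self.points,
            key=lambda p: cosine(query_vector, p.vector),
            reverse=True,
        )
        return ranked[:limit]


def build_context(store: ToyVectorStore, query_vector: list[float], limit: int = 2) -> str:
    """Join the stored chunk texts of the best hits into one prompt context."""
    hits = store.search(query_vector, limit)
    # The chunk text comes straight from the metadata: no second lookup by ID.
    return "\n---\n".join(p.payload["text"] for p in hits)


store = ToyVectorStore()
store.upsert(Point(1, [1.0, 0.0, 0.0], {"text": "Refunds take 5 days."}))
store.upsert(Point(2, [0.0, 1.0, 0.0], {"text": "Offices close at 6pm."}))
store.upsert(Point(3, [0.9, 0.1, 0.0], {"text": "Refunds need a receipt."}))

context = build_context(store, [1.0, 0.0, 0.0], limit=2)
print(context)
```

The trade-off in this sketch is storage: every chunk is kept twice in spirit (as a vector and as text), in exchange for a single round trip per retrieval.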

In any case, fascinating technology. Thanks for putting it together and making it this accessible.

[0]: https://qdrant.tech/documentation/quick-start/#run-a-query

[1]: `sentence-transformers/all-MiniLM-L6-v2`, following https://qdrant.tech/documentation/tutorials/neural-search-fa...

3 comments

Your observations for using a vector DB for retrieval-augmented generation are consistent with my own.

For my applications, I use pgvector, since I can also use full-text indexes and JOINs with the rest of my business logic, which is stored in a Postgres database. This also makes it easier to implement hybrid search, where the full-text results and semantic search results are combined and reranked.
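As an illustration of the "combined and reranked" step, here is a minimal sketch using reciprocal rank fusion (RRF). RRF is my assumption of one common way to merge two ranked lists, not necessarily the method used above. The function name, the constant `k = 60`, and the hard-coded result lists are all illustrative; in a pgvector setup the two lists would come from a full-text query and a vector-similarity query.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one fused ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. Documents ranked well in several lists win.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists, best match first.
fulltext_hits = ["a", "b", "c"]
semantic_hits = ["c", "a", "d"]

fused = reciprocal_rank_fusion([fulltext_hits, semantic_hits])
print(fused)  # ['a', 'c', 'b', 'd']
```

RRF only looks at ranks, so it sidesteps the problem that full-text scores and vector distances live on different scales.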

I think the main selling point for standalone vector databases is scale, i.e., when you have a single "corpus" of over 10^7 chunks and embedding vectors that needs to serve hundreds of req/s. For my application, the overhead of maintaining a separate database that requires syncing with the primary database did not make sense.

1. Yes, that's reasonable, and it saves running another DB.

2. You can often perform the embedding in the DB, but there are a lot of use cases where you want to manage your embedding models outside the DB. This way you aren't dependent on which models the DB supports, and you don't duplicate them throughout your system.
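If embeddings are managed outside the DB, one way to guard against the "two clients, two models" problem raised in the question is to record the model alongside the collection and refuse mismatched clients. The sketch below is only an assumption about how that could look: `EmbeddingSpec`, `ModelMismatchError`, and `check_compatible` are hypothetical names, not part of any vector DB's API. The 384 dimensions match `sentence-transformers/all-MiniLM-L6-v2`; the second model name is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingSpec:
    """Identifies the model that produced the vectors in a collection."""
    model_name: str
    dimension: int


class ModelMismatchError(ValueError):
    """Raised when a client embeds with a different model than the collection."""


def check_compatible(collection_spec: EmbeddingSpec, client_spec: EmbeddingSpec) -> None:
    # Vectors from different models are not comparable, even when the
    # dimensions happen to match, so the full spec is compared.
    if collection_spec != client_spec:
        raise ModelMismatchError(
            f"collection uses {collection_spec}, client uses {client_spec}"
        )


# Spec stored once, e.g. in the collection's own metadata.
collection_spec = EmbeddingSpec("sentence-transformers/all-MiniLM-L6-v2", 384)

# A client using the same model passes the check silently.
check_compatible(collection_spec, EmbeddingSpec("sentence-transformers/all-MiniLM-L6-v2", 384))

# A client using another (hypothetical) model is rejected before it queries.
try:
    check_compatible(collection_spec, EmbeddingSpec("some-other-model", 384))
except ModelMismatchError as err:
    print("rejected:", err)
```

Each client would run the check once at startup, so the stored spec, rather than any one client, acts as the source of truth.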

You can have a look at this sheet: https://docs.google.com/spreadsheets/d/170HErOyOkLDjQfy3TJ6a...

It shows which vector DBs have a particular feature. "In-built Text Embeddings creation" is the column to look at.