| HN Mirror


by codingjaguar 269 days ago

This aligns closely with what we've observed at Milvus. Recently, we helped several users migrate off pgvector as their workloads grew substantially.

It’s worth recognising the strengths of pgvector:

• For small-to-medium scale workloads (e.g., up to millions of vectors, relatively static data), embedding storage and similarity queries inside Postgres can be a simple, familiar architecture.

• If you already use Postgres and your vector workloads are light (low QPS, few dimensions, little metadata filtering / low concurrency), then piggy-backing vector search on Postgres is attractive: minimal added infrastructure.

• For teams that don’t want to introduce a separate vector service, or want to keep things within an existing RDBMS, pgvector is a compelling choice.

From our experience helping users scale vector search in production, several pain-points emerge when scaling vector workloads inside a general-purpose RDBMS like Postgres:

1. Index build / update overhead

• Postgres isn’t built from the ground up for high-velocity vector insertions combined with large-scale approximate nearest neighbour (ANN) index maintenance. For example, it lacks the RaBitQ binary quantization supported in purpose-built vector databases like Milvus.

• For large datasets (tens/hundreds of millions or beyond), building or rebuilding HNSW/IVF indices inside Postgres can be memory- and time-intensive.

• In production systems where vectors are continuously ingested, updated, and deleted, this becomes operationally tricky.
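To give a rough sense of the memory side, here is a back-of-envelope sketch. The numbers are assumptions for illustration, not measurements: 100 million float32 vectors of 768 dimensions, a generic HNSW graph with M = 16 and up to 2*M neighbour links per node on the base layer, and 4 bytes per link. It ignores upper graph layers, Postgres page and tuple overhead, and any quantization, so a real index will differ.

```python
def estimate_hnsw_bytes(num_vectors: int, dim: int, m: int = 16,
                        bytes_per_component: int = 4,
                        bytes_per_link: int = 4) -> dict:
    """Rough lower bound on memory for a generic HNSW index.

    Assumes float32 components and 2*m neighbour links per node on the
    base layer only. Upper layers and storage overhead are ignored.
    """
    vectors = num_vectors * dim * bytes_per_component
    links = num_vectors * 2 * m * bytes_per_link
    return {"vectors": vectors, "links": links, "total": vectors + links}


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


# Hypothetical workload: 100M vectors, 768 dimensions.
est = estimate_hnsw_bytes(100_000_000, 768)
print(f"vectors: {to_gib(est['vectors']):.1f} GiB")
print(f"links:   {to_gib(est['links']):.1f} GiB")
print(f"total:   {to_gib(est['total']):.1f} GiB")
```

Under these assumptions the raw vectors dominate, at close to 300 GiB in total, which is why rebuilds at this scale hurt and why quantization matters.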

2. Filtered search

• Many use-cases require combining vector similarity with scalar/metadata filters (e.g., “give me top 10 similar embeddings where user_status = ‘active’ AND time > X”).

• Users need to understand the low-level query planner to juggle pre-filtering and post-filtering, and the planner’s cost model wasn’t built for vector similarity search. For a system not designed primarily as a vector DB, this gets complex. Users shouldn’t have to worry about such low-level details.
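A toy sketch of why the pre-filter / post-filter choice matters. This is not how pgvector or Milvus implement it: it uses exact brute-force search over 20 made-up rows as a stand-in for an ANN index, and the `rows` data, `post_filter`, and `pre_filter` names are hypothetical.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def post_filter(rows, query, k, pred):
    """Take the k nearest rows first, then drop those failing the filter."""
    nearest = sorted(rows, key=lambda r: l2(r["vec"], query))[:k]
    return [r["id"] for r in nearest if pred(r)]


def pre_filter(rows, query, k, pred):
    """Apply the filter first, then take the k nearest of what remains."""
    candidates = [r for r in rows if pred(r)]
    nearest = sorted(candidates, key=lambda r: l2(r["vec"], query))[:k]
    return [r["id"] for r in nearest]


# Made-up data: 20 rows on a line, every fifth row is "active".
rows = [
    {"id": i, "vec": [float(i), 0.0],
     "user_status": "active" if i % 5 == 0 else "inactive"}
    for i in range(20)
]
query = [0.0, 0.0]


def is_active(row):
    return row["user_status"] == "active"


post = post_filter(rows, query, 3, is_active)
pre = pre_filter(rows, query, 3, is_active)
print(post)  # [0]         only one of the 3 nearest rows is active
print(pre)   # [0, 5, 10]  a full top 3 among active rows
```

With a selective filter, post-filtering returns fewer than k results unless you over-fetch, while pre-filtering gives a full result set but has to search only within the filtered subset. Picking between the two per query is exactly the part users end up tuning by hand.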

3. Lack of support for full-text search / hybrid search

• Purpose-built vector databases such as Milvus have mature full-text search, BM25, and sparse-vector support.
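For anyone unfamiliar with what "hybrid search" means in practice, here is a minimal sketch of one common way to merge a keyword (e.g. BM25) ranking with a vector ranking: reciprocal rank fusion (RRF). This is an illustration of the general idea, not a description of any particular database's implementation. The document IDs and the constant k = 60 are assumptions.

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists of IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by ID for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from two separate retrievers.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = rrf([bm25_ranking, vector_ranking])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that rank well in both lists rise to the top, without needing to normalize BM25 scores against vector distances.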

1 comment

tacoooooooo 269 days ago

well said! we demo'd milvus (or zilliz i should say,) and while we didn't ultimately go with it--it seems like a great option
