| HN Mirror



	by leyoDeLionKin 755 days ago
	But why not just use a vector database like pgvector?

5 comments

kernelsanderz 755 days ago

In practice, combining full-text search with a vector database often performs better than either one alone. This is called hybrid search. Here's an article that talks a bit about it: https://opster.com/guides/opensearch/opensearch-machine-lear...

Often you take the results from both vector search and lexical search and merge them through algorithms like Reciprocal Rank Fusion.
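
To make the merge step concrete, here is a minimal sketch of Reciprocal Rank Fusion in Python. The function name, the document IDs, and the sample rankings are made up for illustration; k=60 is a commonly used default, not something specified in this thread.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical result lists from a lexical engine and a vector index.
lexical_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
print(fused)
```

Only ranks are used, so the raw scores from the two engines never need to be put on a common scale, which is a large part of why this method is popular for hybrid search.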


teraflop 755 days ago

You can think of a full-text index as being like a vector database that's highly specialized and optimized for the use-case where your documents and queries are both represented as "bags of words", i.e. very high-dimensional and very sparse.

Which works great when you want to retrieve documents that actually contain the specific keywords in your search query, as opposed to using embeddings to find something roughly in the same semantic ballpark.
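
A rough sketch of that idea, assuming whitespace tokenization and raw term counts (real engines add stemming, stop words, and weighting such as BM25). The inverted index stores only the non-zero entries of each sparse bag-of-words vector, and scoring a query is a sparse dot product. All names and sample documents here are hypothetical.

```python
from collections import Counter, defaultdict

def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to {doc_id: term count}, i.e. the non-zero
    entries of every document's bag-of-words vector."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index

def search(index, query):
    """Score documents by the dot product of query and document
    term-count vectors, visiting only terms present in the query."""
    scores = defaultdict(int)
    for term, q_count in Counter(tokenize(query)).items():
        for doc_id, d_count in index.get(term, {}).items():
            scores[doc_id] += q_count * d_count
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

docs = {
    "d1": "postgres full text search",
    "d2": "vector search with embeddings",
    "d3": "cooking pasta at home",
}
index = build_inverted_index(docs)
results = search(index, "full text search")
print(results)
```

Documents that share no keyword with the query never show up in the results at all, which is exactly the exact-match behavior described above.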


demilich 755 days ago

Check out https://github.com/infiniflow/infinity, which combines vector search and full-text search and provides extremely fast search performance.


jasfi 755 days ago

Infinity looks interesting, but I don't see any mention of support for clustering.


demilich 755 days ago

Infinity supports an HNSW vector index.


CuriouslyC 754 days ago

Vector databases are good for documents, but if you have a fact database or some other more succinct information store, vector retrieval is quite slow compared to trigram/full-text search and often gives worse results.
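
For anyone unfamiliar with trigram matching, here is a simplified sketch: strings are compared by the overlap (Jaccard similarity) of their three-character substrings. The padding scheme and function names are assumptions for illustration and do not reproduce any particular database extension exactly.

```python
def trigrams(text):
    # Assumed padding: two leading spaces and one trailing space, so that
    # the start and end of the string contribute their own trigrams.
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    """Jaccard similarity of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

exact = similarity("postgres", "postgres")
typo = similarity("postgres", "postgers")
unrelated = similarity("postgres", "mongodb")
print(exact, typo, unrelated)
```

No embedding model is involved, and short strings with typos still match reasonably well, which fits short, fact-like records.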


FridgeSeal 755 days ago

Because it’s a full text search engine, and not a text embedding? Different query types, requirements, indexing methods, etc.
