by vannevar 694 days ago
>throw out vector DB and embeddings into the trashcan (they are pulling junk information into the context and causing hallucinations)

Not sure why this would be true. In my experience, semantic search based on a vector index/embeddings pulls in more relevant information than a full-text keyword search. Maybe there is too broad a set of materials in your vector db, or the chunking strategy isn't good?


It might depend on the case.

My problem with similarity search is that it is unpredictable. It can sometimes miss really obvious matches or pull in completely irrelevant snippets. When this happens, it causes downstream hallucinations that are hard to fix.

My customers don’t tolerate hallucinations.

Query expansion with FTS works more predictably for me, especially if we factor in search scope reduction driven by the request classifier (“agent router”).
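
A minimal sketch of what that pipeline could look like, under heavy assumptions: `EXPANSIONS` is a fixed table standing in for LLM-driven expansion, `ROUTES` is a keyword lookup standing in for the request classifier, and plain token overlap stands in for a real FTS engine. All names and documents here are hypothetical.

```python
import re

# Hypothetical corpus: each document carries a scope tag the router can filter on.
DOCS = [
    {"id": 1, "scope": "logistics", "text": "Carrier delays affect freight delivery windows."},
    {"id": 2, "scope": "legal", "text": "The contract sets a penalty for late delivery."},
    {"id": 3, "scope": "logistics", "text": "Warehouse stock levels are reviewed weekly."},
]

# Stand-in for LLM-driven expansion: a fixed term -> related terms table.
EXPANSIONS = {"shipment": ["freight", "cargo"], "late": ["delays", "delayed"]}

# Stand-in for the request classifier ("agent router"): trigger words per scope.
ROUTES = {
    "logistics": {"shipment", "warehouse", "carrier"},
    "legal": {"contract", "penalty"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand(query):
    """Return the query tokens, each followed by its expansion terms."""
    terms = []
    for tok in tokenize(query):
        terms.append(tok)
        terms.extend(EXPANSIONS.get(tok, []))
    return terms


def route(query):
    """Pick the first scope whose trigger words appear in the query."""
    tokens = set(tokenize(query))
    for scope, triggers in ROUTES.items():
        if tokens & triggers:
            return scope
    return None  # no scope matched: search everything


def search(query):
    """Keyword search over the routed scope using the expanded terms."""
    scope = route(query)
    terms = set(expand(query))
    hits = []
    for doc in DOCS:
        if scope is not None and doc["scope"] != scope:
            continue  # scope reduction: skip documents outside the routed scope
        score = len(terms & set(tokenize(doc["text"])))
        if score > 0:
            hits.append((score, doc["id"]))
    # Highest overlap first; ties broken by document id for determinism.
    return [doc_id for score, doc_id in sorted(hits, key=lambda h: (-h[0], h[1]))]


print(search("late shipment"))
```

Here the query never shares a literal word with document 1, but the expansion terms do, and the legal document that mentions "late" is excluded by the routed scope rather than by ranking.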

For sure it will depend on the use case. If you have fairly structured data or clear domain-specific terminology to rely on, there's probably no reason to use semantic search.

>Query expansion with FTS works more predictably for me, especially if we factor in search scope reduction driven by the request classifier (“agent router”).

You might be able to quantify this and gain some insight into why query expansion/FTS is working better by comparing the precision/recall with a vector db using some set of benchmark docs and queries.
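
Roughly, that comparison could be sketched as below. The document ids, queries, and the two result sets `RUN_A` and `RUN_B` are placeholders that only show the shape of the data; they are not measurements of any real system.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def macro_average(results, relevant):
    """Average per-query precision and recall over all benchmark queries."""
    scores = [precision_recall(results.get(q, []), relevant[q]) for q in relevant]
    n = len(scores)
    return (sum(p for p, _ in scores) / n, sum(r for _, r in scores) / n)


# Placeholder benchmark: query id -> ids of documents judged relevant.
RELEVANT = {"q1": {"d1", "d2"}, "q2": {"d3"}}

# Placeholder outputs of two retrieval setups on the same queries.
RUN_A = {"q1": ["d1", "d2"], "q2": ["d3", "d4"]}
RUN_B = {"q1": ["d1", "d5"], "q2": ["d6", "d3"]}

for name, run in [("A", RUN_A), ("B", RUN_B)]:
    p, r = macro_average(run, RELEVANT)
    print(f"run {name}: precision={p:.2f} recall={r:.2f}")
```

Looking at the per-query scores rather than only the averages is what would show which kinds of queries each approach misses.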

> For sure it will depend on the use case. If you have fairly structured data or clear domain-specific terminology to rely on

Indeed. This works only in a subset of business domains for me: search and assistants over enterprise knowledge bases (e.g. ~40k documents with 20GB of text) in logistics, supply chain, legal, fintech and medtech.

> You might be able to quantify this and gain some insight into why query expansion/FTS is working better by comparing the precision/recall with a vector db using some set of benchmark docs and queries.

Embeddings tend to miss a lot of nuances, plus they are just unpredictable when searching over large sets of text (e.g. 40k documents split into fragments), frequently ranking irrelevant texts ahead of the relevant ones. Context contamination leads to hallucinations in our cases.

However, with LLM-driven query expansion and FTS I can get controllable retrieval quality in business tasks. Plus, if an edge case shows up, it is fairly easy to explain it and adjust the query expansion logic to cover the specific nuances.
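
As a toy illustration of the "explain and adjust" part, assuming the expansion rules live in an inspectable table (`RULES` and `explain_match` are hypothetical names, and the document text is invented):

```python
import re

# Hypothetical expansion rules; in practice an LLM would propose these.
RULES = {"invoice": ["bill"]}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def explain_match(query, text, rules):
    """Map each matched document term to the query term that produced it."""
    doc_tokens = set(tokenize(text))
    trace = {}
    for tok in tokenize(query):
        for term in [tok] + rules.get(tok, []):
            if term in doc_tokens:
                trace[term] = tok
    return trace


doc = "Proforma statement issued to the customer."

# Edge case: the query term and its current expansions never appear in the document.
before = explain_match("invoice", doc, RULES)

# Adjust the expansion logic with one explicit, reviewable rule.
RULES["invoice"].append("proforma")
after = explain_match("invoice", doc, RULES)

print(before, after)
```

The trace shows exactly which rule caused a hit or a miss, which is the property that is hard to get from a similarity score.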

This is the setup I'm happy with.