

by ramoz 812 days ago

I’ve worked in scaled enterprise search, with both lexical (Lucene-based, e.g. Elasticsearch) and semantic (vector retrieval) search engines.

Vector retrieval that isn’t contextualized in the domain is usually bad (RAG solutions call this “naive RAG” and make up for it with funky chunking and retrieval ensembles). Training custom retrievers and rerankers is often key, but it takes quite an effort and is still hard to generalize in a domain with broad knowledge.
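For anyone wondering what a “retrieval ensemble” can look like in practice, here is a minimal sketch that fuses a lexical result list and a vector result list with reciprocal rank fusion (RRF). RRF is my choice for illustration, not something specific to any one RAG stack; the doc ids, the two hit lists, and the `k=60` constant are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one ranked list.

    Each doc scores the sum of 1 / (k + rank) over the lists it
    appears in, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id so output is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical first-stage results for the same query.
lexical_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from a BM25 index
vector_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. from an embedding index

fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
print(fused)
```

Docs that both retrievers agree on float to the top, and nothing here needs the two score scales to be comparable, since only ranks are used.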

Lexical search provides nice guarantees and deterministic control over results (depending on how you index). Advanced querying capability is certainly useful here. Constructing/enriching queries with transformers is cool.

Reranking is often a nice addition to an ensemble, though it can be done with smaller models.
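A rough sketch of the rerank step, assuming a first stage has already returned a handful of candidates. The `overlap_score` function is a toy stand-in (token overlap) for whatever small reranking model you would actually use; the function names and sample docs are hypothetical.

```python
def overlap_score(query, text):
    """Toy relevance score: Jaccard overlap of lowercase tokens.

    Placeholder for a small reranking model (an assumption for this
    sketch, not a real model).
    """
    q = set(query.lower().split())
    t = set(text.lower().split())
    if not q or not t:
        return 0.0
    return len(q & t) / len(q | t)


def rerank(query, candidates, score_fn=overlap_score, top_k=3):
    """Re-score only the first-stage candidates and keep the best top_k."""
    scored = [(score_fn(query, text), doc_id) for doc_id, text in candidates]
    # Highest score first; ties broken by doc id so output is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical first-stage output: (doc id, text) pairs.
candidates = [
    ("doc_a", "quarterly revenue report for the sales team"),
    ("doc_b", "how to reset a vpn password"),
    ("doc_c", "vpn password reset policy"),
]

reranked = rerank("reset vpn password", candidates, top_k=2)
print(reranked)
```

The point is that the reranker only sees the short candidate list, which is why a smaller model is usually enough for this stage.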