


	by ajainvivek 120 days ago
	Your finding #2 is the most important one: hallucinations are a retrieval problem, not a generation problem. We hit the same wall building legal document retrieval at Brainfish. Swapping embedding models gave incremental gains, but the real jump came from preserving document hierarchy (articles, sections, clauses) during retrieval instead of flattening everything into chunks. Curious: did you evaluate any structure-aware retrieval approaches on this benchmark, or only embedding-based ones?
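To make "preserving document hierarchy" concrete, here is a rough sketch of the general idea, not the actual Brainfish pipeline. Everything in it is an assumption for illustration: the `Node` class, the `flatten_with_paths` and `retrieve` helpers, the sample contract, and the word-overlap scoring, which stands in for whatever embedding or ranking model a real system would use. The point is only that each clause keeps its ancestor titles (article, section) and those titles take part in matching.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One unit of a document tree: an article, section, or clause (hypothetical)."""
    title: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def flatten_with_paths(node, ancestors=()):
    """Yield (path, text) for every node that has text, keeping its ancestor titles."""
    path = ancestors + (node.title,)
    if node.text:
        yield path, node.text
    for child in node.children:
        yield from flatten_with_paths(child, path)


def tokens(s):
    """Lowercased whitespace tokens; a stand-in for a real tokenizer or embedding."""
    return set(s.lower().split())


def retrieve(root, query, k=1):
    """Rank clauses by word overlap with the query, scoring path titles plus text."""
    q = tokens(query)
    scored = []
    for path, text in flatten_with_paths(root):
        context = " ".join(path) + " " + text  # hierarchy travels with the clause
        scored.append((len(q & tokens(context)), path, text))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(path, text) for _, path, text in scored[:k]]


# Made-up sample document: two clauses with near-identical wording.
doc = Node("Contract", children=[
    Node("Article 1 Termination", children=[
        Node("Section 1.1 Notice",
             text="Either party may end this agreement with thirty days written notice."),
    ]),
    Node("Article 2 Payment", children=[
        Node("Section 2.1 Notice",
             text="Invoices are due thirty days after written notice."),
    ]),
])

result = retrieve(doc, "termination notice period")
print(" > ".join(result[0][0]))
```

In this toy case the two clause bodies look almost the same once flattened, and the word "termination" appears only in an article title, so the path is what separates them. A real setup would more likely embed the path together with the clause text, or filter by section before ranking.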