| HN Mirror

	by TJSomething 972 days ago
	My impression is that attribution on limited datasets isn't terribly hard. If you can prompt the LLM to produce a sentence that approximately matches something in the source material, you can look up the nearest sentence vector in a vector DB built from that material and attribute the claim in context. I think this might be one of the few places where LLMs can provide straightforward value, since they can work as a search engine that accepts vague queries, creates approximate answers, fetches the real answers, translates the source material into layman's terms with citations, and lets the newly informed user refine or dig deeper with that context. The most dangerous part is translation, but the data I've seen show that transformers almost never hallucinate on tasks where no external knowledge is needed.
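To make the lookup step concrete, here is a rough sketch of nearest-sentence attribution. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words vector standing in for a real sentence-embedding model, a plain Python list stands in for the vector DB, and the names `build_index` and `attribute` are hypothetical. A real setup would swap in learned embeddings and an approximate nearest-neighbor index.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Split each document into sentences and store (doc_id, sentence, vector).

    `documents` maps a document id to its text. The returned list plays the
    role of the vector DB in this sketch.
    """
    index = []
    for doc_id, text in documents.items():
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                index.append((doc_id, sentence, embed(sentence)))
    return index


def attribute(generated_sentence, index):
    """Return (doc_id, source_sentence, score) for the closest source sentence."""
    if not index:
        raise ValueError("index is empty")
    query = embed(generated_sentence)
    doc_id, sentence, vector = max(index, key=lambda entry: cosine(query, entry[2]))
    return doc_id, sentence, cosine(query, vector)


if __name__ == "__main__":
    # Made-up source material, purely for illustration.
    documents = {
        "manual": "The pump must be primed before first use. "
                  "Replace the filter every six months.",
        "faq": "Warranty claims require the original receipt.",
    }
    index = build_index(documents)
    llm_output = "You should change the filter about every six months."
    print(attribute(llm_output, index))
```

The score is useful as a cutoff: if the best match is weak, the generated sentence probably has no support in the source material and shouldn't get a citation at all.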