Hacker News
by ziotom78 325 days ago
Just a couple of days ago, I submitted a few pages from the PDF of a PhD thesis written in French to ChatGPT, asking it to translate them into English. The first 2-3 pages were perfect, then the LLM started hallucinating, inserting new sentences and dropping parts of the original. The interesting part is that the added sentences were correct and generally on point: the resulting text sounded plausible, and only a careful sentence-by-sentence comparison revealed the truth. By the end of the chapter, virtually nothing of what ChatGPT produced was directly related to the original text.
1 comment

Transformer models are excellent at translation, but next-token prediction is not the right setup for it. You want something more like seq2seq. Next-token prediction rewards local consistency more than faithfulness to the source, which is how you end up with a self-consistent but totally fabricated "translation" that goes off on a tangent.
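
For what it's worth, the kind of drift described above (sentences added or dropped as the text goes on) can be partly caught automatically if you translate in small chunks and compare each chunk against its source. Below is a rough sketch, not a real faithfulness metric: it assumes you already have the source and the translation split into aligned chunks, and the names `split_sentences`, `flag_drift`, and the `max_ratio` threshold are made up for illustration. Sentence count and character length are crude proxies, so this will miss a fabricated sentence that happens to be the same length as the one it replaced.

```python
import re


def split_sentences(text):
    # Naive splitter (an assumption): break after ., !, or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def flag_drift(source_chunks, translated_chunks, max_ratio=1.5):
    """Return indices of chunks whose translation looks suspicious.

    A chunk is flagged when the number of sentences differs from the source,
    or when the character-length ratio falls outside [1/max_ratio, max_ratio].
    The threshold is an arbitrary illustrative default, not a tuned value.
    """
    flagged = []
    for i, (src, out) in enumerate(zip(source_chunks, translated_chunks)):
        n_src = len(split_sentences(src))
        n_out = len(split_sentences(out))
        len_ratio = len(out) / max(len(src), 1)
        too_long = len_ratio > max_ratio
        too_short = len_ratio < 1 / max_ratio
        if n_src != n_out or too_long or too_short:
            flagged.append(i)
    return flagged


source = [
    "Bonjour le monde. Ceci est un test.",
    "La thèse porte sur le bruit.",
]
translated = [
    "Hello world. This is a test.",
    "The thesis is about noise. It also covers topics never mentioned in the original.",
]
suspicious = flag_drift(source, translated)
print(suspicious)
```

Anything it flags still needs a human (or a second model) to do the actual comparison; the point is only to tell you where to look first instead of re-reading the whole chapter.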