Hacker News
by anon291 620 days ago
Current LLMs are one-shot. They are forced to produce an output without thinking, leading to the preponderance of hallucinations and lack of formal reasoning. Human formal reasoning is not instinctual. Unlike 'aha!' moments, it requires us to think. Part of that thinking process is turning our attention inwards into our own mind and using symbolic manipulations that we do not utter in order to 'think'.

LLMs are broadly capable of this, but we prevent it by treating every token they generate as part of the final output.

The human equivalent would be having to solve a problem while showing every step, including the wrong steps you took along the way. That is why chain of reasoning works: it gives the model room to write those steps out.

The 'fix' is to allow LLMs to pause, generate tokens that are not rendered as text, and then signal when they want to unpause. Training such a system is left as an exercise for the reader, although there have been attempts.
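
To make the pause/unpause idea concrete, here is a toy sketch of the decoding side only. It assumes two hypothetical marker tokens, `<pause>` and `<resume>`, and a hard-coded list standing in for a model's raw token stream. None of these names come from a real model or library, and the sketch says nothing about how you would train a model to emit the markers.

```python
from typing import Iterable, List

# Hypothetical marker tokens, invented for this illustration.
PAUSE = "<pause>"
RESUME = "<resume>"


def visible_output(tokens: Iterable[str]) -> List[str]:
    """Return only the tokens emitted outside PAUSE ... RESUME spans."""
    thinking = False
    shown: List[str] = []
    for tok in tokens:
        if tok == PAUSE:
            thinking = True       # model signals it is starting hidden work
        elif tok == RESUME:
            thinking = False      # model signals it is ready to answer
        elif not thinking:
            shown.append(tok)     # only non-hidden tokens reach the reader
    return shown


# Stand-in for a raw token stream; a real system would sample this from a model.
raw = ["The", "answer", "is", PAUSE, "17*3", "=", "51", RESUME, "51", "."]
print(" ".join(visible_output(raw)))  # prints "The answer is 51 ."
```

In this sketch the hidden tokens would still sit in the model's context and could influence later tokens; only what gets shown to the reader is filtered.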