Hacker News
by throw310822 113 days ago
> this might be misleadingly interpreted as an LLM having "thought out an answer"

I'm convinced that that is exactly what happens. Anthropic confirms it:

"Claude will plan what it will say many words ahead, and write to get to that destination. We show this in the realm of poetry, where it thinks of possible rhyming words in advance and writes the next line to get there. This is powerful evidence that even though models are trained to output one word at a time, they may think on much longer horizons to do so."

https://www.anthropic.com/research/tracing-thoughts-language...

1 comments

This is about reasoning tokens, right? I didn't mean that; nanoGPT doesn't do that. nanoGPT inference just outputs letters directly, with no intermediate tokens.
No, this is about normal tokens. While a SOTA LLM outputs one token at a time, it already has a high-level plan of what it is going to say many tokens ahead. This is in reply to the GP, who thinks that an LLM can somehow produce coherent and thoughtful sentences while never seeing more than one token ahead.
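
To make the distinction concrete, here is a toy sketch. It is not how a real LLM works internally: the rhyme table, the fixed line template, and all function names are made up for illustration. It only shows that an interface which emits one token per call is compatible with picking the final rhyme word first and then writing towards it.

```python
# Toy illustration only: every name and all data here are hypothetical.
END = "<end>"
RHYMES = {"cat": "hat", "moon": "spoon"}  # made-up rhyme table


def plan_second_line(first_line):
    """Pick the rhyme target first, then build a line that ends on it."""
    target = RHYMES[first_line[-1]]
    return ["then", "found", "a", target]


def next_token(first_line, emitted):
    """Emit exactly one token per call, like an autoregressive decoder.

    The plan is recomputed from the context on every call, so nothing is
    carried between calls except the text produced so far.
    """
    plan = plan_second_line(first_line)
    if len(emitted) < len(plan):
        return plan[len(emitted)]
    return END


def generate(first_line):
    """Standard decoding loop: call next_token until it signals the end."""
    emitted = []
    while True:
        token = next_token(first_line, emitted)
        if token == END:
            return emitted
        emitted.append(token)


if __name__ == "__main__":
    print(" ".join(generate(["the", "cat"])))
```

From the outside, `generate` looks like pure next-token prediction, yet the last word was chosen before the first one was emitted. The Anthropic quote above claims something analogous happens inside the model's activations, which this toy obviously cannot demonstrate.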