by ijk
259 days ago
Not strictly true: while this was previously believed to be the case, Anthropic demonstrated that transformers can "think ahead" in some sense, for example when planning rhymes in a poem [1]:

> Instead, we found that Claude plans ahead. Before starting the second line, it began "thinking" of potential on-topic words that would rhyme with "grab it". Then, with these plans in mind, it writes a line to end with the planned word.

They described the mechanism that it uses internally for planning [2]:

> Language models are trained to predict the next word, one word at a time. Given this, one might think the model would rely on pure improvisation. However, we find compelling evidence for a planning mechanism.

> Specifically, the model often activates features corresponding to candidate end-of-next-line words prior to writing the line, and makes use of these features to decide how to compose the line.

[1]: https://www.anthropic.com/research/tracing-thoughts-language...
[2]: https://transformer-circuits.pub/2025/attribution-graphs/bio...