by 8note
342 days ago
this sounds like a fun research area. do LLMs have plans about future tokens? how do we get 100 tokens of completion, and not just one output layer at a time? are there papers you've read that you can share that support the hypothesis, versus the view that the LLM doesn't have ideas about future tokens when it's predicting the next one?

https://www.anthropic.com/research/tracing-thoughts-language...
See the section “Does Claude plan its rhymes?”