by simonw 1021 days ago

There's an interesting new paper about this problem: https://arxiv.org/abs/2310.02226

"Think before you speak: Training Language Models With Pause Tokens"

The basic idea is to teach the LLM to occasionally insert a "pause" token, which produces no visible output but gives the model another round of computation before it commits to the answer.
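
To make that concrete, here's a rough sketch of what the decoding loop could look like. This is my own toy illustration, not the paper's code: the `<pause>` string, `decode_with_pauses`, and `toy_model` are all made-up names, and the "model" is just a stub that decides to pause twice before answering.

```python
PAUSE = "<pause>"  # hypothetical token name, chosen for illustration
EOS = "<eos>"      # hypothetical end-of-sequence marker


def decode_with_pauses(step_fn, prompt, max_steps=16):
    """Greedy decoding loop that hides pause tokens from the caller.

    step_fn(context) returns the next token given the full context.
    Pause tokens are appended to the context, so later steps can
    condition on them, but they are dropped from the visible output.
    """
    context = list(prompt)
    visible = []
    for _ in range(max_steps):
        token = step_fn(context)
        if token == EOS:
            break
        context.append(token)      # the model still "sees" the pause
        if token != PAUSE:
            visible.append(token)  # the user never does
    return visible


def toy_model(context):
    """Stand-in for a real LLM: pauses twice, answers, then stops."""
    if context.count(PAUSE) < 2:
        return PAUSE
    if context[-1] == PAUSE:
        return "42"
    return EOS


print(decode_with_pauses(toy_model, ["what", "is", "6*7", "?"]))
```

The point is only that each pause costs one extra step through the model while adding nothing to the visible output. How the pauses get trained and where they are placed is covered in the paper.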