Hacker News new | ask | show | jobs
by simonw 974 days ago
There's an interesting new paper about this problem: https://arxiv.org/abs/2310.02226

"Think before you speak: Training Language Models With Pause Tokens"

Basic idea is to teach the LLM to occasionally insert a "pause" token, which outputs nothing but gives it a chance to perform another round of operations on the way to the answer.