by wnmurphy 816 days ago
I think it's fairly simple: you're creating space for the model to generate intermediate tokens, and those tokens act as "thoughts" or a simulated internal dialogue.

Without that, it's analogous to asking someone a question and having them immediately start responding from something they'd heard before, rather than taking some time to have an inner dialogue with themselves.

1 comment

There's a recent paper that explicitly gives the model time to think by using pause tokens [1].

> However sophisticated this end-to-end process may be, it abides by a peculiar constraint: the number of operations determining the next token is limited by the number of tokens seen so far.

There are obviously pros and cons to each approach (intermediate "thought" tokens versus pause tokens), but nothing stops us from combining the two.
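To make the constraint in that quote concrete, here's a rough sketch in plain Python. It is not the paper's code: the `<pause>` string and all three helper names are made up for illustration, and "positions before the answer" is only a crude proxy for how much computation can feed into the first answer token. The idea is just that both pause tokens and written-out reasoning add positions before the answer starts.

```python
PAUSE = "<pause>"  # hypothetical token string; the real one depends on the tokenizer


def with_pause_tokens(prompt_tokens, num_pauses):
    """Return the prompt with num_pauses pause tokens appended."""
    return list(prompt_tokens) + [PAUSE] * num_pauses


def positions_before_answer(prompt_tokens, num_pauses=0, num_thought_tokens=0):
    """Count the positions processed before the first answer token is produced.

    Each position contributes one hidden vector per layer, so this is a rough
    proxy for how much computation can feed into the answer.
    """
    return len(prompt_tokens) + num_pauses + num_thought_tokens


def strip_pauses(tokens):
    """Drop pause tokens from a token list; they carry no content of their own."""
    return [t for t in tokens if t != PAUSE]


prompt = ["What", "is", "17", "*", "24", "?"]
padded = with_pause_tokens(prompt, 10)

print(positions_before_answer(prompt))                         # direct answer
print(positions_before_answer(prompt, num_pauses=10))          # pause tokens
print(positions_before_answer(prompt, num_thought_tokens=40))  # written-out reasoning
```

With a 6-token prompt, the direct answer gets 6 positions, 10 pause tokens raise that to 16, and 40 reasoning tokens raise it to 46. The difference is that reasoning tokens are readable content the model conditions on, while pause tokens add positions without adding any text, which is why combining the two is at least plausible.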

1. Think before you speak: Training Language Models With Pause Tokens https://arxiv.org/abs/2310.02226v2