Hacker News
by computerex 74 days ago
They are all autoregressive. They have just been trained to emit thinking tokens like any other tokens.
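
To make that concrete, here is a toy sketch of the decoding loop. Everything in it is made up for illustration: `SCRIPT` and `next_token` stand in for a trained model, and the `<think>`/`</think>` delimiters are just an assumed convention, not any particular vendor's format. The point is only that the loop has no special case for thinking tokens.

```python
from typing import List, Tuple

# Hypothetical output of a trained model, written out by hand for this sketch.
SCRIPT = ["<think>", "2", "+", "2", "=", "4", "</think>",
          "The", "answer", "is", "4", "<eos>"]


def next_token(prefix: List[str]) -> str:
    """Toy stand-in for a model: maps the prefix so far to one next token."""
    return SCRIPT[len(prefix)] if len(prefix) < len(SCRIPT) else "<eos>"


def generate(max_tokens: int = 32) -> List[str]:
    """Plain autoregressive loop: predict one token, append it, repeat."""
    tokens: List[str] = []
    for _ in range(max_tokens):
        tok = next_token(tokens)
        if tok == "<eos>":
            break
        # "<think>" and "</think>" go through the same path as any other token.
        tokens.append(tok)
    return tokens


def split_thinking(tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Split done after the fact (e.g. by a UI); the decoder never needs it."""
    start = tokens.index("<think>")
    end = tokens.index("</think>")
    return tokens[start + 1:end], tokens[end + 1:]


if __name__ == "__main__":
    thinking, answer = split_thinking(generate())
    print("thinking:", " ".join(thinking))
    print("answer:  ", " ".join(answer))
```

The only place "thinking" is treated differently is `split_thinking`, which runs on the finished token list. In this sketch, any difference in behavior would have to come from what the model was trained to emit, not from the sampling loop.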