Hacker News
by computerex, 74 days ago
They are all autoregressive. They have just been trained to emit thinking tokens like any other tokens.
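
To make that concrete, here is a toy sketch, not any real model's implementation. The lookup table `NEXT_TOKEN` stands in for a trained network, and the `<think>` / `</think>` marker strings are hypothetical names picked for illustration. The point is that one decoding loop emits the "thinking" tokens and the answer tokens the same way, and the split between them is just post-processing on the output.

```python
from typing import Dict, List, Tuple

EOS = "<eos>"

# Hypothetical stand-in for a trained model: maps the last token to the next one.
# A real LLM conditions on the full context and returns a probability distribution.
NEXT_TOKEN: Dict[str, str] = {
    "<bos>": "<think>",
    "<think>": "2+2",
    "2+2": "is",
    "is": "4",
    "4": "</think>",
    "</think>": "Answer:",
    "Answer:": "four",
    "four": EOS,
}


def next_token(context: List[str]) -> str:
    """Pick the next token given everything generated so far (toy: last token only)."""
    return NEXT_TOKEN[context[-1]]


def generate(prompt: List[str], max_new_tokens: int = 32) -> List[str]:
    """Plain autoregressive loop: append one token at a time until EOS."""
    context = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(context)
        if tok == EOS:
            break
        # No special case here: "<think>" is appended exactly like "four".
        context.append(tok)
    return context[len(prompt):]


def split_thinking(tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Separate the thinking span from the visible answer after generation."""
    start = tokens.index("<think>")
    end = tokens.index("</think>")
    return tokens[start + 1:end], tokens[end + 1:]


if __name__ == "__main__":
    out = generate(["<bos>"])
    thinking, answer = split_thinking(out)
    print("thinking:", thinking)
    print("answer:  ", answer)
```

Nothing in `generate` knows which tokens count as thinking. Only `split_thinking`, which runs afterwards, treats the markers specially.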