Hacker News
by
machiaweliczny
126 days ago
We need that for this Chinese 3B model that thinks for 45s on hello world but also solves math.
Bolwin
125 days ago
Nanbeige. Yeah, this seems ideal for models that scale test-time compute.