Hacker News
by srush 234 days ago
There is a footnote that should help with the models. Training is a harder thing to report on, but roughly our finding here is that RL scales.