Hacker News
by
bryan0
509 days ago
Yes, but these steps were not used in R1-zero, where its reasoning capabilities were trained.
littlestymaar
509 days ago
And as a result, R1-zero is way too crude to be used directly, which is a good indication that it remains relevant.