Hacker News
by fud101, 42 days ago
Will the RLVR mechanism be improved upon, or is it in some sense optimal?