Hacker News
by fud101 42 days ago
Will the RLVR mechanism be improved upon, or is it in some sense optimal?