Hacker News
by gr3ml1n 498 days ago
This feels like a category mistake. Why would R1 make RLHF obsolete?