by lostmsu 502 days ago
This does not make sense. If R1 scales similarly to other GPTs, throwing 100x more compute at it will produce an even stronger model.
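
To put rough numbers on that, here is a minimal sketch, assuming loss follows a simple power law in training compute, L(C) = a * C^(-alpha). The function names and the constants `a` and `alpha` are hypothetical placeholders chosen for illustration, not measured values for R1 or any other model.

```python
def loss(compute: float, a: float = 1.0, alpha: float = 0.05) -> float:
    """Hypothetical power-law loss: L(C) = a * C**(-alpha).

    `a` and `alpha` are illustrative assumptions, not fitted values.
    """
    if compute <= 0:
        raise ValueError("compute must be positive")
    return a * compute ** (-alpha)


def loss_ratio(compute_multiplier: float, alpha: float = 0.05) -> float:
    """Factor by which loss changes when compute is scaled by `compute_multiplier`.

    Under the power-law assumption this is independent of `a` and of the
    starting compute budget.
    """
    if compute_multiplier <= 0:
        raise ValueError("compute_multiplier must be positive")
    return compute_multiplier ** (-alpha)


if __name__ == "__main__":
    # Compare a few assumed exponents for a 100x increase in compute.
    for alpha in (0.03, 0.05, 0.10):
        ratio = loss_ratio(100.0, alpha)
        print(f"alpha={alpha:.2f}: loss falls to {ratio:.3f} of its original value")
```

Under this assumption the loss keeps falling as compute grows, with diminishing returns. Whether R1 actually follows such a curve is exactly the "if" in the comment above.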