Hacker News
by red2awn 128 days ago
Distilling from a teacher (Opus 4.5) and scaling RL more.
1 comment

So fewer parameters but "better" weights?