Hacker News
by QubridAI 87 days ago
Honestly, this is pretty much how most of the new models operate nowadays: a base model combined with RL and some product-layer magic.