Hacker News
by Ferrus91 313 days ago
This is using test-time ("thinking") compute and reinforcement learning. I think it's going to plateau even faster than the initial LLM scaling did, though.