Hacker News
by sharemywin 320 days ago
The big step was having it reason through math problems that weren't in the training data. Even now, with web search, it doesn't need every article to be in the training data to do useful things with it.
1 comment

This is using thinking-time (test-time) compute and reinforcement learning. I think this is going to plateau even faster than the initial LLM scaling did, though.