Hacker News
by crackalamoo 372 days ago
Yes, 100% this. And even more so for reasoning models, which have a different kind of RL workflow based on reasoning tokens. I expect to see research labs come out with more ways to use RL with LLMs in the future, especially for coding.

I feel it is quite important to dispel the idea that LLMs just predict the next word, given how widespread it is, even though it does gesture at the truth of how LLMs work in a way that's convenient for laypeople.

https://www.harysdalvi.com/blog/llms-dont-predict-next-word/