| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by crackalamoo 372 days ago

Yes, 100% this. And even more so for reasoning models, which have a different kind of RL workflow based on reasoning tokens. I expect to see research labs come out with more ways to use RL with LLMs in the future, especially for coding.

I feel it is quite important to dispel this idea given how widespread it is, even though it does gesture at the truth of how LLMs work in a way that's convenient for laypeople.

https://www.harysdalvi.com/blog/llms-dont-predict-next-word/