by otabdeveloper4 330 days ago
You have some sort of very confused idea of what reinforcement learning is. (Which is probably why you're being downvoted.)

I suggest you read something like the DeepSeek R1 paper, because you and everybody else here seem to have no clue how it works (which is not surprising tbh).