Hacker News
DeepSeek R1 Theory Overview (GRPO and RL and SFT) (youtube.com)
2 points by research_pie 497 days ago