by danielhanchen, 490 days ago
Oh yep! The DeepSeek paper also mentioned that large enough LLMs inherently have reasoning capabilities, and that the goal of GRPO is to accentuate those latent skills!