by storus 200 days ago
Those don't have DPO/GRPO, which arguably made some parts of RL obsolete.
2 comments

Check out Stanford's CS336; it covers DPO/GRPO and the other parts needed to train LLMs.
It's also covered by CS329H.
I can assure you that lacking knowledge of DPO (and especially GRPO, which is just a stripped-down PPO) is not a dealbreaker.