Hacker News
by upbeat_general 200 days ago
I can assure you that lacking knowledge of DPO (and especially GRPO, which is just stripped-down PPO) is not a dealbreaker.