Hacker News
Understanding reinforcement learning for model training from scratch (medium.com)
2 points by rajman187 306 days ago | 1 comment
rajman187 306 days ago
An intuitive treatment of RLHF, TRPO, PPO, GRPO, DPO and RLAIF