by dotancohen 195 days ago
Thank you. I will google Group Relative Policy Optimization to learn about that and the other training methods. If you have any resources handy that I should be reading, that would be appreciated as well. Have a great weekend.
1 comment

Nothing off the top of my head! If you find anything good, let me know. GRPO is a training technique, so it's likely not exactly what you'd use for benchmarking, but it's interesting to read about anyway. Glad I could help.