Hacker News
Training Process Reward Models in Axolotl (axolotlai.substack.com)
2 points by desideratum 479 days ago