Hacker News
by cateye 800 days ago
Odds Ratio Preference Optimization (ORPO): https://arxiv.org/abs/2403.07691