by be7a 55 days ago
Users get two completions for their prompt and rank them. From these pairwise preferences you can then fit a Bradley-Terry model to get Elo-style scores per model.
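For anyone curious what that looks like in practice, here is a rough sketch, not the actual pipeline. It assumes the votes are stored as (winner, loser) pairs of model names and fits Bradley-Terry strengths with the standard iterative (MM) update. The function names, the 1000-point base, and the sample votes are all made up for illustration. It also assumes every model has at least one win; a model that never loses will not converge to a finite score either.

```python
import math
from collections import defaultdict


def bradley_terry(comparisons, iters=1000, tol=1e-10):
    """Fit Bradley-Terry strengths from (winner, loser) pairs via MM updates."""
    models = sorted({m for pair in comparisons for m in pair})
    wins = defaultdict(int)   # total wins per model
    games = defaultdict(int)  # number of comparisons per unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1
    if any(wins[m] == 0 for m in models):
        raise ValueError("every model needs at least one win for a finite fit")

    p = {m: 1.0 for m in models}
    for _ in range(iters):
        new_p = {}
        for i in models:
            # MM update: p_i <- wins_i / sum_j n_ij / (p_i + p_j)
            denom = sum(
                games[frozenset((i, j))] / (p[i] + p[j])
                for j in models if j != i
            )
            new_p[i] = wins[i] / denom
        # Strengths are only defined up to scale, so pin the mean to 1.
        scale = len(models) / sum(new_p.values())
        new_p = {m: v * scale for m, v in new_p.items()}
        converged = max(abs(new_p[m] - p[m]) for m in models) < tol
        p = new_p
        if converged:
            break
    return p


def to_elo(strengths, base=1000.0):
    """Map strengths to an Elo-like scale: 400 * log10(p) plus an arbitrary base."""
    return {m: base + 400.0 * math.log10(s) for m, s in strengths.items()}


# Hypothetical votes: model "A" preferred over "B" three times out of four.
votes = [("A", "B"), ("A", "B"), ("A", "B"), ("B", "A")]
elo = to_elo(bradley_terry(votes))
print(elo)
```

With the 400 * log10 mapping, the gap between two scores corresponds to the usual Elo expected win rate, so a 3:1 preference comes out as 400 * log10(3) points apart. Only the differences mean anything; the base is arbitrary.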