by alansaber 54 days ago
I appreciate your reply, but you are completely glossing over his point about how head-to-head model evals are useless lmao