by MacsHeadroom 530 days ago
LMSYS is a poor judge of coding quality because its ratings are based on a single generation, not on agentic coding over multiple steps.