Hacker News new | ask | show | jobs
by meganrisdal 332 days ago
We love what Chatbot Arena is doing to innovate on evaluation paradigms. The challenge of evaluating GenAI warrants diverse approaches. What we're excited to do is: 1) give anyone access to infra to make evaluation more accessible to more developers and researchers; 2) drive more novel, diverse evals. https://arxiv.org/abs/2505.00612v2