Hacker News
by 3abiton 920 days ago
The LLM field is still messy overall. If you look at the rankings of model performance, they still do not reflect how usable the models are in real life. I think one major challenge is finding a benchmark that actually corresponds to real-world usability.