Hacker News
by 3abiton, 920 days ago
The LLM field is still messy overall. If you look at the rankings of model performance, they still do not reflect how usable the models are in real life. I think one major challenge is finding a benchmark that actually corresponds to real-world usability.