Hacker News
by BoiledCabbage 493 days ago
There is a reason why cars and computers are sold with specs. 0-60 time, fuel efficiency...

People need to know the performance they can expect from LLMs or agents. What are they capable of?

1 comment

A 2009 Honda Civic can easily get an under-5-second 0-60... however, it does involve a high cliff.

Result specs (as in measured output or experimental results) need strict definitions to be useful, and I think the ones we currently have for LLMs are pretty weak: mostly benchmarks that model one kind of interaction, and usually not any sort of useful interaction.