Hacker News
by Zetobal 806 days ago
The only thing I learned in the last year is that you can't really benchmark LLMs at all. Above a certain level it's just edge case against edge case, or script kiddies and multi-billion-dollar corps optimizing their fine-tunes against the test.