by cypress66 1262 days ago
It's hard to think of a benchmark that these LLMs wouldn't be massively overfit to (I don't mean via training, but via hyperparameter tweaking).
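To make the hyperparameter point concrete, here's a rough sketch of the selection effect, not a claim about how any particular lab tunes. Everything in it is an assumption for illustration: the function name `simulate_benchmark_selection`, the 200 configs, the 500-item benchmark, and the 70% true accuracy. Every config is equally good by construction, so any gap between the best benchmark score and the true accuracy comes purely from picking the winner on the same benchmark you report.

```python
import random


def simulate_benchmark_selection(n_configs=200, n_items=500, true_acc=0.70, seed=0):
    """Toy model: every hyperparameter config has the SAME true accuracy.

    Each config is scored on a fixed benchmark of n_items questions, where
    each answer is correct with probability true_acc. We then report the
    score of the best-looking config, plus the score of one fresh,
    untuned evaluation for comparison.
    All numbers here are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)

    # Benchmark score for each config: pure sampling noise around true_acc.
    scores = []
    for _ in range(n_configs):
        correct = sum(rng.random() < true_acc for _ in range(n_items))
        scores.append(correct / n_items)

    # "Tuning": keep whichever config happened to score highest.
    best_score = max(scores)

    # One fresh evaluation with the same true accuracy, no selection applied.
    fresh_correct = sum(rng.random() < true_acc for _ in range(n_items))
    fresh_score = fresh_correct / n_items

    return best_score, fresh_score


best_bench, fresh = simulate_benchmark_selection()
print(f"best tuned benchmark score: {best_bench:.3f}")
print(f"fresh evaluation score:     {fresh:.3f}")
```

The reported "best" score sits above the 0.70 true accuracy even though no config is actually better than any other, and the gap grows with the number of configs tried against the same benchmark.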