by gertlabs 54 days ago
https://gertlabs.com already does this at scale.

Regardless, an industry-standard benchmark shouldn't be hosted or designed by a lab that produces the models being evaluated.