by consumer451 384 days ago
Yes, I mentioned that in my comment on the linked post. I wish someone were running this methodology as an ongoing project for new models.

Ideally, shouldn't this metric be included on all model cards? It seems crucial.