Hacker News
by conception 150 days ago
https://crfm.stanford.edu/helm/air-bench/latest/#/leaderboar...

This isn’t the gotcha question you think it is. AI safety is being defined and measured.

1 comment

Cool, another metric to game like they do the other ones.