Hacker News
Open-world evaluations for measuring frontier AI capabilities [pdf] (cruxevals.com)
2 points by randomwalker 56 days ago