by Retr0id 102 days ago
Hm, I can't see Opus 4.6 on there.
1 comment

I tweeted at OSUNLP, and they're backed up on eval validation. In the meantime, here's the benchmark repo with the saved runs and instructions for running it locally: https://github.com/theredsix/abp-online-mind2web-results