by KaoruAoiShiho 4 days ago
This is really held back by one benchmark (omniscience accuracy), where it's very far behind. Otherwise I think it would score at least a couple of points higher.