by digitcatphd 210 days ago
I find it a bit surprising that GenAI has made it this far without this benchmark.