Hacker News
Bluffbench is near saturation: LLMs can interpret counterintuitive plots (opensource.posit.co)
2 points by ionychal 3 days ago