Hacker News
by Dirak 120 days ago
Praying this isn't another Llama 4 situation where the benchmark numbers are cooked. 84.6% on ARC-AGI is incredible!