by nielstron 125 days ago
It could... but as others have pointed out, the significance is unclear, and the per-model results have even fewer samples than the benchmark average. So: maybe :)