by nemonemo 877 days ago
Benchmarks typically measure only a limited set of aspects. I can imagine another suite of benchmarks with longer contexts, but in that case it might be more difficult to run it as a blind comparison. At the least, such benchmarks would be quite costly to run.