by nemonemo
877 days ago
Benchmarks typically measure only a limited set of aspects. I can imagine another suite of benchmarks with longer contexts, but in that case it might be harder to run as a blind comparison. At the very least, such benchmarks would be quite costly to run.