by avereveard
374 days ago
There's a new set of metrics that capture advances better than MMLU or its Pro version, but nothing is as standardized yet, and specifically very few have a hidden test set to keep advancements from simply coming from directional fine-tuning.