by tweakimp
210 days ago
Every time I see a table like this, the numbers go up. Can someone explain what this actually means? Is it just an incremental improvement, where some tests are solved in a better way, or is this a breakthrough, where this model can do something that all others cannot?

The questions AND the answers are public.
If the LLM manages, through reasoning OR memory, to repeat back the answer, then it wins.
The scores represent the % of correct answers they recalled.
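
To make that concrete, here is a rough sketch of how a score like this could be computed. It assumes exact-match grading after trimming whitespace and lowercasing, which is my simplification; real benchmarks often use more lenient or model-based graders. The function name `benchmark_score` and the sample questions are made up for illustration.

```python
def benchmark_score(model_answers, reference_answers):
    """Return the percentage of model answers that match the public reference answers."""
    if len(model_answers) != len(reference_answers):
        raise ValueError("need exactly one model answer per question")
    if not reference_answers:
        return 0.0
    # Assumption: an answer counts as correct on a case-insensitive exact match.
    correct = sum(
        given.strip().lower() == expected.strip().lower()
        for given, expected in zip(model_answers, reference_answers)
    )
    return 100.0 * correct / len(reference_answers)


# Hypothetical data: four public questions, one of which the model gets wrong.
reference = ["4", "Paris", "O(n log n)", "42"]
model_output = ["4", "paris", "O(n^2)", "42"]

score = benchmark_score(model_output, reference)
print(f"{score:.1f}%")  # prints "75.0%"
```

Note that nothing in the grader can tell whether a matching answer came from reasoning or from memorized training data, which is the point above.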