by rfoo
440 days ago
I think the most interesting result [0] is that, unlike our current benchmarks, on which scaling laws are showing diminishing returns, their setup managed to tell large language models (Llama 405B, GPT-4.5) apart from not-so-large LMs. This could be really interesting if it isn't due to a trivial f-up (e.g. a difference in inference speed).

[0] Assuming the paper isn't flawed; I haven't read it thoroughly yet.