by zamadatix
374 days ago
I think the only way to be particularly impressed with new leading models lately is to hold the opinion that all of the benchmarks are inaccurate and/or irrelevant, and that it's only in vibes/anecdotes that the model is really light years ahead. Otherwise you look at the numbers on e.g. lmarena and see it's claiming a ~16% preference win rate for gpt-3.5-turbo from November of 2023 over this new world-leading model from Google.
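For anyone who wants to sanity check that kind of number: a rough sketch, assuming the leaderboard scores behave like standard Elo ratings (400-point scale, base 10), which is my assumption and not something taken from lmarena's docs. The function names are made up for illustration. Under that assumption, a ~16% win rate for the older model works out to a gap of roughly 288 points.

```python
import math


def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (assumed 400-point, base-10 scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rating_gap_for_win_rate(p: float) -> float:
    """Rating difference (A minus B) that yields win rate p for A; inverse of the above."""
    return 400.0 * math.log10(p / (1.0 - p))


if __name__ == "__main__":
    # Gap implied by a 16% win rate for the older model (negative: it is rated lower).
    gap = rating_gap_for_win_rate(0.16)
    print(f"implied rating gap: {gap:.0f} points")

    # Hypothetical ratings, only to show the forward direction; not real leaderboard values.
    print(f"win rate at a 100-point deficit: {expected_win_rate(1200, 1300):.2f}")
```

Put differently, a gap of a few hundred points still leaves the older model preferred in about one matchup out of six.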