| This inductive logic is way overblown. > Incredible, beat Llama 3 8B with 3.8B parameters after less than a week of release. Judging by a single benchmark? Without even trying it out in real-world usage? > And on LMSYS English, Llama 3 8B is on par with GPT-4 (not GPT-4-Turbo), as well as Mistral-Large. Any potential caveats of such a leaderboard notwithstanding, on that leaderboard alone there is a huge gap between Llama 3 8B and Mistral-Large, let alone any of the GPT-4 models. By the way, for beating benchmarks, see "Pretraining on the Test Set Is All You Need". |
As I've stated in other comments, yeah... Agreed, I'm stretching it a bit. It's just that any indication of a 3.8B model being in the vicinity of GPT-4 is huge.
I'm sure that when things are properly measured by third parties, the results will paint a more sober picture. But still, with good fine-tunes, we'll probably get close.
It's a very significant demonstration of what could be possible soon.