Hacker News new | ask | show | jobs
by josh-sematic 795 days ago
They also stated that they are still training larger variants that will be more competitive:

> Our largest models are over 400B parameters and, while these models are still training, our team is excited about how they’re trending. Over the coming months, we’ll release multiple models with new capabilities including multimodality, the ability to converse in multiple languages, a much longer context window, and stronger overall capabilities.

1 comments

Anyone have any informed guesstimations as to where we might expect a 400b parameter model for llama 3 to land benchmark wise and performance wise, relative to this current llama 3 and relative to GPT-4?

I understand that parameters mean different things for different models, and llama two had 70 b parameters, so I'm wondering if anyone can contribute some guesstimation as to what might be expected with the larger model that they are teasing?

They are aiming to beat the current GPT4 and stand a fair chance, they are unlikly to hold the crown for long.
Right because the very little I've heard out of Sam Altman this year hinting at future updates suggests that there's something coming before we turn our calendars to 2025. So equaling or mildly exceeding GPT-4 will certainly be welcome, but could amount to a temporary stint as king of the mountain.
This is always the case.

But the fact that open models are beating state of the art from 6 months ago is really telling just how little moat there is around AI.

FB are over $10B into AI. The English Channel was a wide moat just not uncrossable.
Yes, but the amount they have invested into training llama3 even if you include all the hardware is in the low tens of millions. There are a _lot_ of companies who can afford that.

Hell there are not for profits that can afford that.

>This is always the case.

I mean anyone can throw out self evident general truisms about how there will always be new models and always new top dogs. It's a good generic assumption but I feel like I can make generic assumptions and general truisms just as well as the next person.

I'm more interested in divining in specific terms who we consider to be at the top currently, tomorrow and the day after tomorrow based on the specific things that have been reported thus far. And interestingly, thus far, the process hasn't been one of a regular rotation of temporary top dogs. It's been one top dog, Open AI's GPT, I would say that it currently is still, and when looking at what the future holds, it appears that it may have a temporary interruption before it once again is the top dog, so to speak.

That's not to say it'll always be the case but it seems like that's what our near future timeline has in store based on reporting, and it's piecing that near future together that I'm most interested in.

Google: "We Have No Moat, And Neither Does OpenAI"
Unless you are NVidia.
The benchmark for the latest checkpoint is pretty good: https://x.com/teknium1/status/1780991928726905050?s=46
Mark said in a podcast they are currently at MMLU 85, but it's still improving.