Hacker News
by Ldorigo 588 days ago
Do we know whether the current SOTA foundation models (Gemini, GPT-4o, Claude, etc.) are actually all GPT-based (as in, causal models)?
2 comments

I feel a bit bad bringing this up, but should Gemini actually be considered SOTA?

They make impressive demos, but I can't recall any of their released models being at the top of any leaderboard.

EDIT: Sorry, looking into it a bit more now, they still seem to be at the top in terms of context window, so they've got that going for them.

Leaderboards are misleading. Try different models on YOUR task and you’ll see results that vary widely from the “official” rankings.
Ok, maybe I haven't experimented enough; so for which tasks is Gemini the SOTA?
“GPT-based” isn’t really a thing outside of OpenAI (it’s just the commercial name for their models).

But I believe we’re confident that all major models are causal transformer models right now.

No reason to believe otherwise. If one of them was doing something different, they’d let us know in order to stand out.

No, they didn't get to co-opt that word.
They literally invented the term GPT…

“Transformer” is the name of the algorithm behind popular LLMs.

GPT is the name that OpenAI gave to their models early on.

What does having invented a term have to do with this?

The Otis company invented the term escalator, and even had a trademark on it for a while, but does it mean that you'd only call one an escalator if it was made by them?

That's literally what the trademark means. At some point a term becomes so dominant and generic that the trademark is no longer enforceable, and you get escalator, band-aid, linoleum, taser, gasoline, etc.
What a small part of my original point that you choose to focus on… and to be so wrong about…

Why try arguing this?