by jessep 813 days ago
Two things interest me about Claude being better than GPT-4:

1) We are all breathless that it is better. But a year has passed since GPT4. It’s like we’re excited that someone beat Usain Bolt’s 100 meter time from when he was 7. Impressive, but … he’s twenty now, has been training like a maniac, and we’ll see what happens when he runs his next race.

2) It’s shown AI chat products have no switching costs right now. I now use mostly Claude and pay them money. Each chat is a universe that starts from scratch, so … very easy to switch. Curious if deeper integrations with my data, or real chat memory, will change that.

10 comments

The current version of GPT-4 is 3 months old, not 1 year old. Anthropic are legitimately ahead on performance for cost right now. But I don’t think their API latency matches OpenAI’s.

We’ll see what GPT4.5 looks like in the next 6 months.

Did you mean two or four months? Because 3 months ago is somewhere in December, and there were no updates around that time.
I don't think it's just that Claude-3 seems on par with GPT-4, but rather the development timescales involved.

Anthropic as a company was only created, with some of the core LLM team members from OpenAI, around the same time GPT-3 came out (Anthropic CEO Dario Amodei's name is even on the GPT-3 "few-shot learners" paper). So, roughly speaking, in the same time it took OpenAI (big established company, with lots of development momentum) to go from GPT-3 to GPT-4, Anthropic have gone from start-up with nothing to Claude-3 (via 1 & 2) which BEATS GPT-4. Clearly the pace of development at Anthropic is faster than that at OpenAI, and there is no OpenAI magic moat in play here.

Sure GPT-4 is a year old at this point, and OpenAI's next release (GPT-4.5 or 5) is going to be better than GPT-4 class models, but given Anthropic's momentum, the more interesting question is how long it will take Anthropic to match it or take the lead?

Inference cost is also an interesting issue... OpenAI have bet the farm on Microsoft, and Anthropic have gone with Amazon (AWS), who have built their own ML chips. I'd guess Anthropic's inference cost is cheaper, maybe a lot cheaper. Can OpenAI compete with the cost of Claude-3 Haiku, which is getting rave reviews? Its input tokens are crazy cheap - $300 to input every word you'll ever speak in your entire life!
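
For anyone who wants to sanity-check the "$300 for a lifetime of speech" figure, here is a rough back-of-envelope sketch. None of the constants come from the comment above: the per-token price, words per day, lifespan, and tokens-per-word ratio are all assumptions, so treat the result as an order-of-magnitude check only.

```python
# Back-of-envelope check of the "lifetime of speech" cost claim.
# Every constant here is an assumption, not a figure from the thread.
HAIKU_INPUT_USD_PER_MTOK = 0.25  # assumed list price per million input tokens
WORDS_PER_DAY = 16_000           # assumed average words spoken per day
YEARS_SPEAKING = 80              # assumed speaking lifetime in years
TOKENS_PER_WORD = 1.3            # assumed rough tokens-per-word ratio for English


def lifetime_input_cost(
    words_per_day: float = WORDS_PER_DAY,
    years: float = YEARS_SPEAKING,
    tokens_per_word: float = TOKENS_PER_WORD,
    usd_per_mtok: float = HAIKU_INPUT_USD_PER_MTOK,
) -> float:
    """Return the USD cost of sending a lifetime of spoken words as input tokens."""
    words = words_per_day * 365 * years
    tokens = words * tokens_per_word
    return tokens / 1_000_000 * usd_per_mtok


def tokens_for_budget(budget_usd: float, usd_per_mtok: float = HAIKU_INPUT_USD_PER_MTOK) -> float:
    """Return how many input tokens a given budget buys at the assumed price."""
    return budget_usd / usd_per_mtok * 1_000_000


cost = lifetime_input_cost()
budget_tokens = tokens_for_budget(300)
print(f"Estimated lifetime cost: ${cost:,.2f}")
print(f"Tokens bought by $300: {budget_tokens:,.0f}")
```

Under these assumptions the estimate lands at roughly half of $300, so the claim looks plausible as an order of magnitude. A more talkative person or a less efficient tokenizer would push it closer to the quoted number.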

Claude may beat GPT-4 right now, but I remember ChatGPT in March 2023 being leagues better. Over the past year, it’s regressed, but gotten faster.

Claude is also lacking web browsing and code interpreter. I’m sure those will come, but where will GPT be by then? ChatGPT also offers an extensive free tier with voice. Claude’s free plan caps you at a few messages every few hours.

Of course GPT-next should take the lead for a while, but with Anthropic, from a standing start, putting out 3 releases in the same time it took OpenAI to put out 1, how long is this lead going to last?

It'll be interesting to see if Anthropic choose to match OpenAI feature-for-feature or just follow their own path.

Yeah, it's a good point, but I think that our intuitions are different on this one. I don't have a horse in this race, but my assumption is that the next OpenAI release will be a massive leap that makes GPT 4/Claude 3 Opus look like toys. Perhaps you're right though, and Anthropic's curve will bend upward even more quickly, so that they start catching up faster, until eventually they're ahead.
Honestly who knows, but outside of Q-star rumors there's no indication that either company is doing anything much different from the other one, so I'd not expect any long-lasting difference in capability to open up. Maybe it will, though!

FWIW, Sam Altman has fairly recently said that the jump from GPT-4 to GPT-5 will be similar to that from GPT-3 to GPT-4, and also (recent Lex Fridman interview) that their goal is explicitly NOT to have releases that are shocking - but rather they want to have ones of incremental capability to give society time to adapt. Could be misdirection - who knows.

Amodei for his part has said that what Anthropic will release in 2024 will be a "sharper, more refined" (or words to that effect) version of what they have now, and not a "reality bender" (which he seemed to be implying maybe is coming, but not for a year or two).

They're comparing against gpt-4-0125-preview, which was released at the end of January 2024. So they really are beating the market leader for this test.
Model Updates != New Models.

GPT5 will be substantially better than even the latest GPT4 update.

What matters here is what I can use today. I can either use Claude 3 or GPT 4. If Claude is better, it is the best on the market. Let’s see what the story is tomorrow.
Go ahead, no one is saying to stay with GPT4. But it's disingenuous to compare a gpt-4-march-update to a completely new pretrained model like Claude 3 Opus.
It is not that disingenuous. We can only make claims based on the current data.

There can be even bigger competitors in the market, but because they stay quiet and do not publish results, we do not know about their capabilities. Who knows what Apple has been doing all this time? They sure have capabilities. Even if they make some random comments about the use of Gemini.

Until the data and proof have been provided, it is accurate to claim "the best model on the market". Everything else is hypothetical.

So you think whatever process produces a GPT4 update is completely equivalent to pretraining and RLHF'ing a brand new model with new architecture, more data, etc??
ChatGPT does have at least a year head start so this doesn't seem surprising. This proves that OpenAI doesn't really have any secret sauce that others can't reproduce.

I suppose size will become the moat eventually but atm it looks like it could become anyone's game.

Size is absolutely not going to become the moat unless there's some hardware revolution that makes running big models very very cheap, but that requires a very large up-front capital cost to deploy. Big models are inefficient, and as smaller models improve there will be very few use cases where the big models are worth the compute.
I imagine that going forward, the typical approach would be a multi-level LLM, such that there's a relatively small and quick model in front of the user, which can in turn decide to consult an "expert" larger model as part of its "system 2".
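
A rough sketch of the multi-level idea described above, in Python. Everything here is hypothetical: `Draft`, `route`, the confidence score, and the two toy models are stand-ins for illustration, not any vendor's API. It assumes the small model can report some usable confidence signal, which is itself an open question.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    """Answer from the small model plus a self-reported confidence (assumed signal)."""
    text: str
    confidence: float  # 0.0 to 1.0


def route(
    prompt: str,
    small: Callable[[str], Draft],
    expert: Callable[[str], str],
    threshold: float = 0.7,
) -> str:
    """Answer with the small model, escalating to the expert when confidence is low."""
    draft = small(prompt)
    if draft.confidence >= threshold:
        return draft.text
    # The "system 2" path: pay for the larger model only when needed.
    return expert(prompt)


# Toy stand-ins so the sketch runs without any network calls.
def toy_small(prompt: str) -> Draft:
    # Pretend short prompts are easy and long ones are hard.
    if len(prompt.split()) <= 5:
        return Draft(text=f"small: {prompt}", confidence=0.9)
    return Draft(text="not sure", confidence=0.2)


def toy_expert(prompt: str) -> str:
    return f"expert: {prompt}"


print(route("capital of France?", toy_small, toy_expert))
print(route("write a SQL query joining orders to customers by region", toy_small, toy_expert))
```

The threshold is the main knob: set it higher and more traffic goes to the expensive model, set it lower and you trade quality for cost.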
Absolutely, that is 100% the way things are going to go. What's going to happen is that eventually there will be an online model directory that a local agent knows how to query to identify other models to call in order to build up an answer. Local agents will be empowered with online learning since it won't be possible to pre-train on the model catalog.
And then at the top of that stack, we’ll have a single, master model controlling everything.

We could call it the Master Control Program.

> as smaller models improve there will be very few use cases where the big models are worth the compute

I see very little evidence of this so far. The use cases I'm interested in just barely work on GPT-4, and lesser models give mostly garbage, e.g. function calling and inferring stuff like SQL queries. If there are smaller models that can do passable work on such use cases I'd be very interested to know.

Claude Haiku can do a LOT of the things you'd think you need GPT4 for. It's not as good at complex code and really tricky language use/abstractions, but it's very close for more superficial things, and you can call haiku like 60 times for each gpt4 call.

I bet you could do multiple prompt variations with haiku and then do answer combining to compete with GPT4-T/Opus at a fraction of the price.
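
One possible way to do the "prompt variations plus answer combining" idea is a simple majority vote, sketched below. This is only an illustration: `combine_by_vote`, the templates, and `fake_cheap_model` are hypothetical names, and voting is just one of several ways to combine answers. It works best when answers are short and can be compared after normalization.

```python
from collections import Counter
from typing import Callable, Sequence


def combine_by_vote(
    question: str,
    templates: Sequence[str],
    cheap_model: Callable[[str], str],
) -> str:
    """Ask the cheap model once per prompt template and return the most common answer."""
    answers = [
        cheap_model(template.format(question=question)).strip().lower()
        for template in templates
    ]
    # most_common(1) returns the top answer; ties go to the answer seen first.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


# Hypothetical prompt variations; real ones would be tuned per task.
TEMPLATES = [
    "Answer briefly: {question}",
    "Think step by step, then give only the final answer: {question}",
    "Answer quickly: {question}",
]


def fake_cheap_model(prompt: str) -> str:
    # Stand-in for a real API call; one variation gives a wrong answer.
    return "5" if "quickly" in prompt else "4"


print(combine_by_vote("What is 2 + 2?", TEMPLATES, fake_cheap_model))
```

Each extra variation is another cheap call, so the cost comparison against one large-model call depends on how many votes the task needs.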

Interesting! I just discovered that Anthropic indeed officially support commercial API access in (at least) some EU countries. They just don't support GUI access in all those countries:

https://www.anthropic.com/supported-countries

Anthropic is ex-openai so even if there is a secret sauce that openai uses, they might know it.
Yeah you might be right but Google and Mistral aren't that far behind either:

https://twitter.com/NickADobos/status/1772764680639148285

> We are all breathless that it is better. But a year has passed since GPT4. It’s like we’re excited that someone beat Usain Bolt’s 100 meter time from when he was 7.

Sounds like some sort of siding with closedAI (openAi). When I need to use an LLM, I use whatever performs the best at the moment. It doesn’t matter to me who’s behind it; at the moment it is Claude.

I am not going to stick to ChatGPT because closedAi have been pioneers or because their product was one of the best.

I hope I didn’t sound too harsh, excuse me in that case.

> closedAI (openAi)

Is this supposed to be clever? It's like saying M$ back in the 90s. Yeah, OpenAI doesn't deserve its own name, but maybe we can let that dead horse rest.

No, it's not supposed to be clever. Just something I use whenever I mention them.

There is an extension that does something similar, https://addons.mozilla.org/en-US/firefox/addon/openai-is-not...

Claude has way too many safeguards for what it believes is correct to talk about and what isn’t. Not saying ChatGPT is better, it also got dumbed down a lot, but Claude is very heavy on being politically correct on everything.

Ironically the one I find the best for responses currently is Gemini Advanced.

I agree with you that there is no switching cost currently, I bounce between them a lot

Does this matter in pure software dev?
Yes, remember when github decided to rename master branches to main branches?

>I'm afraid I can't answer a question about slavery.

I'm getting refusals similar in idiocy to the above in production right now.

Hard things aren’t easy to do.

OpenAI is not only faster at updating, the updates deliver. Then things like Sora come out of nowhere.

It’s great to see other models keeping up or getting ahead because a year ago the gap was bigger

What's a good way to have access to many chatbots in one place?
If you’re on macOS, give BoltAI[0] a try. Other than supporting multiple AI services and models, BoltAI also allows you to create your own AI tools. Highlight the text, press a shortcut key, then run a prompt against that text.

Disclaimer: I build it :D

[0]: https://boltai.com

I use an app called MindMac for macOS that works with nearly "all" of the APIs. I currently am using OpenAI, Anthropic and Mistral API keys with it, but it seems to support a ton of others as well.
Not affiliated in any way, but I use openrouter.ai to pay per token rather than have monthly subscriptions
MSFT trying to hedge their bets makes it seem like there's a decent chance OpenAI might have hit a few roadblocks (either technical or organizational)
Billion-dollar corporation hedging its bets is standard practice and I personally wouldn't read anything into it.
I agree with your analogy. Also, there is quite a bit of "standing on the shoulders of giants" going on. Every company's latest release will/should be a bit better than the models released before it. AI enthusiasts are getting a bit annoying - "we got a new leader boys!!!!*!" for each new model released.