| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by jessep 813 days ago

Two things interest me about Claude being better than GPT-4:

1) We are all breathless that it is better. But a year has passed since GPT4. It’s like we’re excited that someone beat Usain Bolt’s 100 meter time from when he was 7. Impressive, but … he’s twenty now, has been training like a maniac, and we’ll see what happens when he runs his next race.

2) It’s shown AI chat products have no switching costs right now. I now use mostly Claude and pay them money. Each chat is a universe that starts from scratch, so … very easy to switch. Curious if deeper integrations with my data, or real chat memory, will change that.

10 comments

tempusalaria 813 days ago

The current version of GPT-4 is 3 months old not 1 year old. Anthropic are legitimately ahead on performance for cost right now. But their API latency I don’t think matches OpenAI

We’ll see what GPT4.5 looks like in the next 6 months.

lynx23 813 days ago

Did you mean two or four months? Because 3 months is somewhere in december, and there were no updates around that time.

HarHarVeryFunny 813 days ago

I don't think it's just that Claude-3 seems on par with GPT-4, but rather the development timescales involved.

Anthropic as a company was only created, with some of the core LLM team members from OpenAI, around the same time GPT-3 came out (Anthropic CEO Dario Amodei's name is even on the GPT-3 "few-shot learners" paper). So, roughly speaking, in same time it took OpenAI (big established company, with lots of development momentum) to go from GPT-3 to GPT-4, Anthropic have gone from start-up with nothing to Claude-3 (via 1 & 2) which BEATS GPT-4. Clearly the pace of development at Anthropic is faster than that at OpenAI, and there is no OpenAI magic moat in play here.

Sure GPT-4 is a year old at this point, and OpenAI's next release (GPT-4.5 or 5) is going to be better than GPT-4 class models, but given Anthropic's momentum, the more interesting question is how long it will take Anthropic to match it or take the lead?

Inference cost is also an interesting issue... OpenAI have bet the farm on Microsoft, and Anthropic have gone with Amazon (AWS), who have built their own ML chips. I'd guess Athropic's inference cost is cheaper, maybe a lot cheaper. Can OpenAI compete with the cost of Claude-3 Haiku, which is getting rave reviews? It's input tokens are crazy cheap - $300 to input every word you'll ever speak in your entire life!

theturtletalks 813 days ago

Claude may be beat GPT-4 right now, but I remember ChatGPT in March 2023 being leagues better. Over the past year, it’s gotten regressive, but faster.

Claude is also lacking web browsing and code interpreter. I’m sure those will come, but where will GPT be by then? ChatGPT also offers an extensive free tier with voice. Claude’s free plan caps you as a few messages every few hours.

HarHarVeryFunny 813 days ago

Of course GPT-next should take the lead for a while, but with Anthropic, from a standing start, putting out 3 releases in same time it took OpenAI to put out 1, then how long is this lead going to last ?

It'll be interesting to see if Anthropic choose to match OpenAI feature-for-feature or just follow their own path.

jessep 813 days ago

Yeah, it's a good point, but I think that our intuitions are different on this one. I don't have a horse in this race, but my assumption is that the next OpenAI release will be a massive leap, that makes GPT 4/Claude 3 Opus look like toys. Perhaps you're right though, and Anthropic's curves with bend upward even more quickly, so that they get to that they'll start catching up more quickly, until eventually be they're ahead.

HarHarVeryFunny 813 days ago

Honestly who knows, but outside of Q-star rumors there's no indication that either company is doing anything much different from the other one, so I'd not expect any long-lasting difference in capability to open up. Maybe it will, though!

FWIW, Sam Altman has fairly recently said that the jump from GPT-4 to GPT-5 will be similar to that from GPT-3 to GPT-4, and also (recent Lex Fridman interview) that their goal is explicitly NOT to have releases that are shocking - but rather they want to have ones of incremental capability to give society time to adapt. Could be misdirection - who knows.

Amodei for his part has said that what Anthropic will release in 2024 will be a "sharper, more refined" (or words to that effect) version of what they have now, and not a "reality bender" (which he seemed to be implying maybe is coming, but not for a year or two).

londons_explore 813 days ago

They're comparing against gpt-4-0125-preview, which was released at the end of January 2024. So they really are beating the market leader for this test.

humansareok1 813 days ago

Model Updates != New Models.

GPT5 will be substantially better than even the latest GPT4 update.

nicce 813 days ago

What matters here is that what I can use today. I can either use Claude 3 or GPT 4. If the Claude is better, it is best on the market. Let’s see what the story is tomorrow.

humansareok1 813 days ago

Go ahead, no one is saying to stay with GPT4. But its disingenuous to compare a gpt-4-march-update to a completely new pretrained model like Claude 3 Opus.

nicce 813 days ago

It is not that disingenuous. We can only make claims based on the current data.

There can be even bigger competitors in the market, but because they stay quiet and do not publish results, we do not know about their capabilities. Who knows what Apple has been doing all this time? They sure have capabilities. Even if they make some random comments about the use of Gemini.

Until the data and proof has been provided, it is accurate to claim "the best model on the market". Everything else is hypothetical.

humansareok1 812 days ago

So you think whatever process produces a GPT4 update is completely equivalent to pretraining and RLHF'ing a brand new model with new architecture, more data, etc??

worldsayshi 813 days ago

ChatGPT does have at least a year head start so this doesn't seem surprising. This proves that OpenAI doesn't really have any secret sauce that others can't reproduce.

I suppose size will become the moat eventually but atm it looks like it could become anyone's game.

CuriouslyC 813 days ago

Size is absolutely not going to become the moat unless there's some hardware revolution that makes running big models very very cheap, but that requires a very large up-front capital cost to deploy. Big models are inefficient, and as smaller models improve there will be very few use cases where the big models are worth the compute.

falcor84 813 days ago

I imagine that going forward, the typical approach would be a multi-level LLM, such that there's a relatively small and quick model in front of the user, which can in turn decide to consult an "expert" larger model as part of its "system 2".

CuriouslyC 813 days ago

Absolutely, that is 100% the way things are going to go. What's going to happen is that eventually there will be an online model directory that a local agent knows how to query to identify other models to call in order to build up an answer. Local agents will be empowered with online learning since it won't be possible to pre-train on the model catalog.

heyjamesknight 813 days ago

And then at the top of that stack, we’ll have a single, master model controlling everything.

We could call it the Master Control Program.

worldsayshi 813 days ago

> as smaller models improve there will be very few use cases where the big models are worth the compute

I see very little evidence of this so far. The use cases I'm interested in just barely works on GPT-4 and lesser models give mostly garbage. I.e. function calling and inferring stuff like SQL queries. If there are smaller models that can do passable work on such use cases I'd be very interested to know.

CuriouslyC 813 days ago

Claude Haiku can do a LOT of the things you'd think you need GPT4 for. It's not as good at complex code and really tricky language use/abstractions, but it's very close for more superficial things, and you can call haiku like 60 times for each gpt4 call.

I bet you could do multiple prompt variations with haiku and then do answer combining to compete with GPT4-T/Opus at a fraction of the price.

worldsayshi 807 days ago

Interesting! I just discovered that Anthropic indeed officially support commercial API access in (at least) some EU countries. They just don't support GUI access in all those countries:

https://www.anthropic.com/supported-countries

mikkom 813 days ago

Anthropic is ex-openai so even if there is a secret sauce that openai uses, they might know it.

worldsayshi 813 days ago

Yeah you might be right but Google and Mistral aren't that far behind either:

https://twitter.com/NickADobos/status/1772764680639148285

Alifatisk 813 days ago

> We are all breathless that it is better. But a year has passed since GPT4. It’s like we’re excited that someone beat Usain Bolt’s 100 meter time from when he was 7.

Sounds like some sort of siding with closedAI (openAi), when I need to use an llm, I use whatever performs the best at the moment. It doesn’t matter who’s behind it to me, at the moment it is Claude.

I am not going to stick to ChatGPT because closedAi have been pioneers or because their product was one of the best.

I hope I didn’t sound too harsh, excuse me in that case.

danielbln 813 days ago

> closedAI (openAi)

Is this supposed to be clever? It's like saying M$ back in the 90s. Yeah, OpenAI doesn't deserve its own name, but maybe we can let that dead horse rest.

Alifatisk 813 days ago

No, it's not supposed to be clever. Just something I use whenever I mention them.

There is an extensions that does something similar, https://addons.mozilla.org/en-US/firefox/addon/openai-is-not...

artdigital 813 days ago

Claude has way too many safeguards for what it believes is correct to talk about and what isn’t. Not saying ChatGPT is better, it also got dumbed down a lot, but Claude is very heavy on being politically correct on everything.

Ironically the one I find the best for responses currently is Gemini Advanced.

I agree with you that there is no switching cost currently, I bounce between them a lot

bredren 813 days ago

Does this matter in pure software dev?

llm_trw 813 days ago

Yes, remember when github decided to rename master branches to main branches?

>I'm afraid I can't answer a question about slavery.

I'm getting refusals similar in idiocy to the above in production right now.

_huayra_ 813 days ago

Perhaps if one is a minor: https://news.ycombinator.com/item?id=39632959

j45 813 days ago

Hard things aren’t easy to do.

Openai is not only faster at updating, the updates deliver. Then things like sora out of nowhere.

It’s great to see other models keeping up or getting ahead because a year ago the gap was bigger

lenerdenator 813 days ago

What's a good way to have access to many chatbots in one place?

longnguyen 813 days ago

If you’re on macOS, give BoltAI[0] a try. Other than supporting multiple AI services and models, BoltAI also allows you yo create your own AI tools. Highlight the text, press a shortcut key then run a prompt against that text.

Disclaimer: I build it :D

[0]: https://boltai.com

gorbypark 813 days ago

I use an app called MindMac for macOS that works with nearly "all" of the APIs. I currently am using OpenAI, Anthropic and Mistral API keys with it, but it seems to support a ton of others as well.

budududuroiu 813 days ago

Not affiliated in any way, but I use openrouter.ai to pay per token rather than have monthly subscriptions

mupuff1234 813 days ago

MSFT trying to hedge their bets makes it seems like there's a decent chance OpenAI might have hit a few roadblocks (either technical or organizational)

JKCalhoun 813 days ago

Billion-dollar corporation hedging its bets is standard practice and I personally wouldn't read anything into it.

foobar_______ 813 days ago

I agree with your analogy. Also, there is a quite a bit of "standing on the shoulders of giants" kind of thing going on. Every company's latest release will/should be a bit better than the models released before it. AI enthusiasts are getting a bit annoying - "we got a new leader boys!!!!*!" for each new model released.