
	by dr_dshiv 1171 days ago
	I find Claude significantly better than 3.5. I’d love to be able to make the case for that with data…

2 comments

sanxiyn 1171 days ago

Since the Chatbot Arena Leaderboard (https://lmsys.org/blog/2023-05-10-leaderboard/) agrees with you, it's not just you.

famouswaffles 1171 days ago

There are two main Claude models. I'm guessing it's claude-v1.3, aka Claude Plus, that you find much better than 3.5? That tracks if so.

phillipcarter 1171 days ago

I've found that, for my use case, both claude-instant-* and claude-* are roughly on par with each other and with gpt-3.5. claude-* seems to be the least inaccurate, but we haven't put it into production the way we have gpt-3.5, so it's hard to say for sure.

In either case, the Claude models are very good. I think they'd do fine in a real product. But there are definitely issues that they all have (or that my prompt engineering has).
