| HN Mirror


by siscia 354 days ago

Do you really?

Frontier models seem remarkably similar in performance.

Yeah, there are some nuances for sure, but the whole article could apply to every model.

3 comments

arthur-st 354 days ago

4o on ChatGPT.com vs. Opus in an IDE is like cooking food without kitchen tools vs. using them. 4o is neither a coding-optimized model nor a reasoning model in general.


dnh44 354 days ago

You're not pushing them hard enough if you're not seeing a vast difference between 4o and Opus. Or possibly they're equivalent in the field you're working in, but I suspect it's the former.


gronglo 354 days ago

Opus, in my opinion, is steps away from AGI. 4o doesn't come close.
