Hacker News
by siscia 306 days ago
Do you really?

Frontier models seem remarkably similar in performance.

Yeah, there are some nuances for sure, but the whole article could apply to every model.

3 comments

4o on ChatGPT.com vs. Opus in an IDE is like cooking food without kitchen tools vs. using them. 4o is neither a coding-optimized model nor a reasoning model in general.
You're not pushing them hard enough if you're not seeing a vast difference between 4o and Opus. Or possibly they're equivalent in the field you're working in, but I suspect it's the former.
Opus, in my opinion, is steps away from AGI. 4o doesn't come close.