Hacker News
by acedTrex 505 days ago
I mean 2 years ago they were at about the same place; there's been very little practical gain since GPT-4, in my opinion. No matter the model, the fundamental failure cases have remained the same.

I disagree. Context size alone has exploded from 8k to 200k tokens, and that makes a huge difference. LLMs have also progressed significantly on many other fronts: code quality, understanding, etc. The recent reasoning models have upped the ante further, especially when combined with models that are good at editing code.
Reasoning models like o1 or QwQ absolutely destroy 4o in coding, let alone GPT-4 circa 2023.
Minor correction: GPT-4 was announced on March 14, 2023, less than two years ago. I don’t remember how much LLMs had been discussed as coding assistants before then, but it was Greg Brockman’s demonstration of using it to write code that first brought that capability to my attention:

https://www.youtube.com/live/outcGtbnMuQ?si=oTMA02ns_BJDRS4c...

Advances since then have indeed been remarkable.

This is not my experience at all; o1 fails in the same way 4o does.