Hacker News
by jdietrich 506 days ago
It is quite remarkable that we are already at the stage where saying "this AI is about as competent as an inexperienced college graduate" constitutes criticism. It is entirely proper for people to be engaging sceptically with LLMs at their current level of capability, but I think we should also keep in mind the astonishingly rapid growth rate in their performance. LLMs were a toy two years ago, they're now a useful if flawed colleague, but what can we expect two years from now?

I mean, two years ago they were at about the same place; there's been very little practical gain since GPT-4, in my opinion. No matter the model, the fundamental failure cases have remained the same.
I disagree. Context size alone has exploded from 8k to 200k tokens, and that makes a huge difference. LLMs have also progressed significantly on many other metrics: code quality, understanding, etc. The recent reasoning models have upped the ante further, especially when combined with models that are good at editing code.
Reasoning models like o1 or QwQ absolutely destroy 4o in coding, let alone GPT-4 circa 2022.
Minor correction: GPT-4 was announced on March 14, 2023, less than two years ago. I don’t remember how much LLMs had been discussed as coding assistants before then, but it was Greg Brockman’s demonstration of using it to write code that first brought that capability to my attention:

https://www.youtube.com/live/outcGtbnMuQ?si=oTMA02ns_BJDRS4c...

Advances since then have indeed been remarkable.

This is not my experience at all; o1 fails in the same way 4o does.