HN Mirror



	by n0id34 510 days ago
	Is AI fizzling out or is it just me? I feel like they're trying to smash out new models as fast as they can, but in reality they're barely any different. It's turning into the smartphone market: new iPhone with a slightly better camera and slightly differently bevelled edges, get it NOW! But it doesn't actually do anything better than the iPhone 6. Claude, GPT 4 onwards, and DeepSeek all feel the same to me. Okay to a point, then kinda useless. More like a more convenient, specialised Google whose results you need to double check.

2 comments

lordofgibbons 510 days ago

Boiling frog. The advances are happening so rapidly, but incrementally, that it's not being registered. It just seems like the normal state.

Compare LLMs from a year or two ago with the ones out today on practically any task. It's a night and day difference.

This is especially so when you start taking into account these "reasoning" models. It's mind-blowing how much better they are than "non-reasoning" models for tasks like planning and coding.

https://aider.chat/docs/leaderboards/#aider-polyglot-benchma...


n0id34 509 days ago

Hmmm, I guess it's the way I use them then, because the latest models feel almost less intelligent than the likes of GPT4. Certainly not a "night and day" difference in my experience of using them daily or every other day. I guess it's probably far more noticeable on benchmarks and on far more advanced stuff than what I'm doing, but I would have assumed that would be the minority and that the majority of people use them similarly to how I do.


nextworddev 510 days ago

On the contrary, it's accelerating since they unlocked a new paradigm of scaling.


ktzar 509 days ago

I don’t think they’ve improved much for common use since GPT-3.5, to be frank. They’re cheaper and more ubiquitous, yes, but when it comes to summarizing and generating basic text, they’re pretty much the same as they were back then.

Maybe we're just getting more used to making them part of our workflow.
