	by boringuser2 1170 days ago
	I've been using GPT-4 extensively for programming and its consistent failures at novel tasks have kind of left me less excited.

6 comments

inciampati 1170 days ago

It is simply unable to do anything novel. I've had arguments with friends about this, specifically in reference to the paper "Sparks of Artificial General Intelligence: Early experiments with GPT-4", which is wonderful and presents some amazing capabilities, many of which I use constantly for work every single day. But these capabilities all seem to fall within the range of the data it was trained on. Or they can be seen as interpolations, which are as novel as the prompter can suggest, but which are clearly derived from modes in the data rather than from a deep understanding of abstract concepts.

It's amazing stuff. But it totally fails to take the prompter anywhere new without extensive support, and it is still at a very shallow level of understanding with complex topics that require precision. For instance, turning a mathematical description of a completely novel (or just rare or unusual) algorithm into code will almost never work, and is more likely to generate a mess that takes lots of effort to clean up. It's also extremely hard to get the model to self-reflect and stop when it doesn't understand something. It is at present almost incapable of saying "I don't have enough information or structure to do X".

If we are already as deep into a realm of diminishing marginal returns as the GPT-4 white paper suggests, we might indeed be approaching a limit for this specific approach. No wonder someone is trying to dig a regulatory moat as fast as they can!

panarky 1169 days ago

The vast majority of my time is not spent on anything brand new and never seen before. I guess it's an interesting philosophical discussion about what creativity actually is, but for practical purposes, this thing is already an accelerator of routine work for me.

Maybe its capabilities hit a wall at GPT-5 or GPT-7, but I'd guess there's a lot of gas left in the tank, and there's probably someone in their apartment right now thinking up what's next after transformers.

unsupp0rted 1170 days ago

It keeps getting stuff almost right, and then when I make adjustments it fixes what I asked for but reverts things I asked for earlier.

It’s like working on a project with an intermediate dev who keeps getting switched out for a brand new intermediate dev multiple times an hour.

rapsacnz 1170 days ago

I've found this too. It confidently generates something for you... which turns out to be no good if it's in any way different from standard stuff. And making that leap from just a mash-up of copy-pasta code to actual understanding... is huge. It wouldn't be an incremental upgrade, but a fundamental change in approach.

flyinglizard 1170 days ago

My experience is that it will just hallucinate what I'm after if I dare venture off the standard route. It's a rival for search, somewhat less so for expert humans.

goldfeld 1170 days ago

It's not just novel or programming tasks. I use it to edit a Chinese newsletter[0], and it can never correctly guess the Chinese rock song from the title and artist; it always picks some pop tune instead, or otherwise mixes lyrics from separate songs for no apparent reason.

0: https://chinesememe.substack.com/i/103754530/chinesepython

chasd00 1170 days ago

It’s only going to output the most probable response given the prompt and the data it was trained on. How could it be expected to solve something novel it has never seen in the training data?

malshe 1170 days ago

My experience is the same. I was also surprised by the made-up, non-functional code that looks OK on the surface.
