| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by ArenaSource 1176 days ago
	If you put GPT-4 on a loop with access to the shell it manages to do whatever is needed to finish the job https://raw.githubusercontent.com/jla/gpt-shell/assets/examp...

2 comments

hellcow 1176 days ago

My experience with GPT-4 has been really disappointing. It didn't feel like a step up from 3.5.

As an example, I've been trying to use it to learn Zig since the official docs are ... spartan. And I've said, "here's my code, here's the error, what's wrong with it?" and it will go completely off the rails suggesting fixes that don't do anything (or are themselves wrong).

In my case, understanding/fixing the code would have required GPT-4 to know the difference between allocating on the stack/heap and the lifetimes of pointers. It never even approached the right solution.

I haven't yet gotten it to help me in even a single instance. Every suggestion is wrong or won't compile, and it can't reason through the errors iteratively to find a fix. I'm sure this has to do with a small sample of Zig code in its training set, but I reckon an expert C coder could have spotted the bug instantly.

link

dragonwriter 1176 days ago

If you are using GPT-4 to try to deal with the fact that technical documentation on the public internet is sparse for your topic of interest, you are likely to be disappointed, since GPT-4’s training set likely has the same problem, so you are, in effect, hoping it will fill in gaps in missing data, prompting hallucinations.

It’ll be much better on subjects where there is too much information on the public internet for a person to efficiently manage and sift through.

link

hellcow 1176 days ago

I think you're right. My hope was that it could reason through the problem using knowledge from related sources like C and an understanding below the syntax of what was actually happening.

But it most certainly did not.

link

refulgentis 1176 days ago

Depending on what you're doing, you might find few-shot techniques useful.

I used GPT 3.0 to maintain a code library in 4 languages, I'd write Dart (basically JS, so GPT knows it well), then give it a C++ equivalent of a function I had previously translated, and it could do any C++ from there.

link

gamegoblin 1176 days ago

1. GPT4 is learning from the same spartan docs as you, likely

2. GPT4's training data likely doesn't include significant Zig use, since large parts of its training data cut off a few years ago. I use Rust and it doesn't know about any recently added Rust features, either.

This has interesting implications because it means people will gravitate towards languages/frameworks/libraries that GPT knows well, which means even less training data will be generated for the new stuff. This is a form of value lock-in.

link

seu 1176 days ago

> This has interesting implications because it means people will gravitate towards languages/frameworks/libraries that GPT knows well, which means even less training data will be generated for the new stuff. This is a form of value lock-in.

That's the kind of problem that most people are just failing to see. The usage of this models might not in itself be problematic, but the changes that it bring are often unexpected and too deep for us to see clearly now. And yet, people are rushing towards them at full speed.

link

ignoramous 1176 days ago

It's inevitable, really. But that's like saying Washing Machine changed fashion. It might have, but the changes aren't all that abominable, either.

link

VectorLock 1176 days ago

GPT-4 is just regurgitating what its "learned" from previously scraped content on the Internet. If somebody didn't answer it on StackOverflow before 2021, it doesn't know it. It can't reason able anything, it doesn't "understand" stacks or pointers.

That said its really good at regurgitating stuff from StackOverflow. But once you step beyond anything that someone has previously done and posted to the Internet, it quickly gets out of its depth.

link

roflyear 1176 days ago

It's a step up by an order of magnitude for certain things. Like chess. It is really good at chess actually. But not programming. Seems maybe marginally better on average. Worse in some ways.

link

NicoJuicy 1176 days ago

It can't learn zig without plenty of samples

link

jcims 1176 days ago

Yeah I can't wait to get API access to gpt-4, it is a stepwise more capable based on the stuff I've done with chatgpt on gpt-4.

That said, even gpt-3.5 will try multiple routes to get to the same endpoint. It seems to get distracted pretty easily though.

link

hombre_fatal 1176 days ago

One demo of gpt-4’s superiority over gpt-3 is to come up with a prompt that determines the language of some given text.

I couldn’t figure out a gpt-3 prompt that could handle “This text is written in French” correctly (it thinks it’s written in French), but with gpt-4 you can include in the prompt to disregard what the text says and focus on the words and grammar that it uses.

link

ArenaSource 1176 days ago

> It seems to get distracted pretty easily though.

That’s true, gpt-4 is way more easy to guide with the system messages and it doesn’t forget the instructions as the conversation goes on.

link