by cs702 1160 days ago

In hindsight, it's the most natural, most obvious next step to get LLMs to write better code:

Explain to them how to debug and fix the code they've written.

Which is pretty much what you would do with an inexperienced human software developer.

Looking at this with fresh eyes, it's both shocking to me that this sort of thing is even possible, and yet also completely unsurprising as yet another emergent capability of LLMs.

We live in interesting times.

hyperthesis 1159 days ago

Are they actually running the code, and evaluating the output? Or is it debug-by-code-review?

Beware of bugs in the above code; I have only proved it correct, not tried it. - Knuth

cs702 1159 days ago

They're doing both. Quoting from Figure 1, "the model first generates new code, then the code is executed and the model explains the code. The code explanation along with the execution results constitute the feedback message, which is then sent back to the model to perform more debugging steps. When unit tests are not available, the feedback can be purely based on code explanation."
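The loop described in that quote can be sketched roughly as follows. This is only a minimal illustration, not the paper's implementation: `Model`, `execute`, `self_debug`, `fake_model`, and the prompt wording are all hypothetical names and choices made here, and the "model" is a hard-coded stub so the sketch runs without any API.

```python
import traceback
from typing import Callable, Optional, Tuple

# Hypothetical model interface (an assumption): prompt text in, reply text out.
Model = Callable[[str], str]


def execute(code: str, tests: str) -> Tuple[bool, str]:
    """Run the candidate code and its unit tests; return (passed, message)."""
    namespace: dict = {}
    try:
        exec(code, namespace)   # define the candidate functions
        exec(tests, namespace)  # plain asserts raise on failure
        return True, "all tests passed"
    except Exception:
        return False, traceback.format_exc(limit=1)


def self_debug(model: Model, task: str, tests: Optional[str] = None,
               max_turns: int = 3) -> str:
    """Generate code, then repeatedly explain, execute, and revise it."""
    code = model(f"Write Python code for this task:\n{task}")
    for _ in range(max_turns):
        # The model's own explanation is always part of the feedback.
        feedback = model(f"Explain this code line by line:\n{code}")
        if tests is not None:
            passed, result = execute(code, tests)
            if passed:
                return code
            feedback += f"\nExecution result:\n{result}"
        # Without unit tests, the feedback is the explanation alone.
        code = model(
            f"Task:\n{task}\nCode:\n{code}\nFeedback:\n{feedback}\nFix the code."
        )
    return code


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM: buggy first draft, correct code once asked to fix."""
    if prompt.startswith("Write"):
        return "def add(a, b):\n    return a - b"
    if prompt.startswith("Explain"):
        return "The function subtracts b from a."
    return "def add(a, b):\n    return a + b"


if __name__ == "__main__":
    print(self_debug(fake_model, "add two numbers", tests="assert add(2, 3) == 5"))
```

With the stub, the first draft fails the test, the failure message is appended to the explanation, and the second draft passes. A real setup would run untrusted generated code in a sandbox rather than with `exec`.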

hyperthesis 1158 days ago

So only evaluating output with unit tests - a fitness function. AITDD.

famouswaffles 1160 days ago

Not too shocking for me after this paper. https://arxiv.org/abs/2211.09066

You can teach GPT-3 arithmetic - https://imgur.com/a/w3DAYOi

Basically 100% accuracy up to about 13 digit addition and >90 after that.

What else can you teach GPT without changing weights?
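For a sense of what "teaching" addition in the prompt might look like, here is a rough sketch that spells out schoolbook addition step by step and stacks a few such worked examples in front of a new problem. The trace format, `addition_trace`, and `build_prompt` are assumptions made for illustration; they are not the prompt used in the linked paper or screenshots.

```python
from typing import List, Tuple


def addition_trace(a: int, b: int) -> str:
    """Spell out schoolbook addition of two non-negative integers, digit by digit."""
    xs, ys = str(a)[::-1], str(b)[::-1]  # least significant digit first
    lines = [f"Problem: {a} + {b}"]
    carry = 0
    digits: List[str] = []
    for i in range(max(len(xs), len(ys))):
        x = int(xs[i]) if i < len(xs) else 0
        y = int(ys[i]) if i < len(ys) else 0
        total = x + y + carry
        lines.append(
            f"Step {i + 1}: {x} + {y} + carry {carry} = {total}; "
            f"write {total % 10}, carry {total // 10}"
        )
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    lines.append(f"Answer: {''.join(reversed(digits))}")
    return "\n".join(lines)


def build_prompt(examples: List[Tuple[int, int]], question: Tuple[int, int]) -> str:
    """Stack worked examples, then pose the new problem for the model to continue."""
    shots = "\n\n".join(addition_trace(a, b) for a, b in examples)
    a, b = question
    return f"{shots}\n\nProblem: {a} + {b}\n"


if __name__ == "__main__":
    print(build_prompt([(58, 67), (904, 96)], (1234567890123, 9876543210987)))
```

The point of the explicit carry bookkeeping is that nothing is left implicit: every intermediate value the model needs is written out in the context, so the weights stay untouched.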

gopalv 1159 days ago

> and >90 after that

This is such a circular thing, that I feel like it is amazing to see it.

The reason LLMs use a NN is because they're trying to encode a probability function for generating the passage.

And now you are encoding another n-gram follower exercise (i.e., 1+1 = 2) on top of it :)

Paul-Craft 1159 days ago

Yeah... and I'm kind of suspicious of the whole "without changing the weights" deal, because adding working context to the model, like telling it the algorithm for adding numbers, really sounds like there's some model state that's getting mutated, even if it's not stored in a file called weights.dat or whatever.

cs702 1159 days ago

I meant shocking in the sense that it makes me gape in awe, but as I wrote, it's also, simultaneously, completely unsurprising given all the new emergent capabilities we keep discovering. We're in agreement :-)

famouswaffles 1159 days ago

Oh, yes, well, that's fair.

mirashii 1160 days ago

> 100% accuracy up to about 13 digit addition

The graphs you just posted do not support that; they'd support at most 100% accuracy up to 4 digits.

sharemywin 1160 days ago

it's GPT so 13=4

famouswaffles 1160 days ago

It's 100% at 13 digits and extremely close to that for fewer digits. Maybe "basically 100%" is the better way to put it.
