HN Mirror


by famouswaffles 1160 days ago

Not too shocking for me after this paper. https://arxiv.org/abs/2211.09066

You can teach GPT-3 arithmetic - https://imgur.com/a/w3DAYOi

Basically 100% accuracy up to about 13-digit addition and >90% after that.

What else can you teach GPT without changing the weights?
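For readers wondering what "teaching arithmetic" through the prompt might look like in practice, here is a minimal sketch. It assumes the prompt shows a few fully worked, digit-by-digit additions and then leaves a new problem open for the model to finish. The function names (`addition_trace`, `build_prompt`) and the wording of each step are made up for illustration; they are not taken from the paper or the screenshots.

```python
def addition_trace(a, b):
    """Spell out schoolbook addition of two non-negative ints, one digit at a time.

    Returns (steps, result) where steps is a list of strings and result is the sum as a string.
    """
    xs, ys = str(a)[::-1], str(b)[::-1]  # reversed, so index 0 is the ones digit
    steps, digits, carry = [], [], 0
    for i in range(max(len(xs), len(ys))):
        x = int(xs[i]) if i < len(xs) else 0
        y = int(ys[i]) if i < len(ys) else 0
        total = x + y + carry
        steps.append(
            f"position {i}: {x} + {y} + carry {carry} = {total}, "
            f"write {total % 10}, carry {total // 10}"
        )
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        steps.append(f"final carry {carry}, write {carry}")
        digits.append(str(carry))
    return steps, "".join(reversed(digits))


def build_prompt(examples, question):
    """Assemble a few-shot prompt (hypothetical layout): worked examples, then an open problem."""
    parts = []
    for a, b in examples:
        steps, result = addition_trace(a, b)
        parts.append("\n".join([f"Problem: {a} + {b}", *steps, f"Answer: {result}"]))
    qa, qb = question
    parts.append(f"Problem: {qa} + {qb}")  # left unfinished for the model to continue
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt([(128, 367), (999, 1)], (4821, 7390)))
```

The model's weights never change here; the only thing that varies is the text it is asked to continue.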

3 comments

gopalv 1159 days ago

> and >90 after that

This is so circular that I find it amazing to see.

The reason LLMs use a NN is because they're trying to encode a probability function for generating the passage.

And now you are encoding another n-gram follower exercise (i.e., 1+1 = 2) on top of it :)


Paul-Craft 1159 days ago

Yeah... and I'm kind of suspicious of the whole "without changing the weights" deal, because adding working context to the model, like telling it the algorithm for adding numbers, really sounds like there's some model state that's getting mutated, even if it's not stored in a file called weights.dat or whatever.
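A toy way to picture the distinction being questioned here: treat generation as a pure function of (weights, context). The sketch below is not how a transformer works; `WEIGHTS`, `toy_generate`, and the `key => value` rule format are all invented for illustration. It only shows that the same read-only parameters can behave differently depending on what is in the context, and that nothing persists once the context is dropped.

```python
from types import MappingProxyType

# Stand-in for frozen parameters; MappingProxyType makes the mapping read-only.
WEIGHTS = MappingProxyType({"default_reply": "unknown"})


def toy_generate(weights, context):
    """Answer the last line of context using 'key => value' rules found in earlier lines.

    The output depends only on the arguments; no state is kept between calls.
    """
    *earlier, query = context
    rules = dict(line.split(" => ", 1) for line in earlier if " => " in line)
    return rules.get(query, weights["default_reply"])


if __name__ == "__main__":
    print(toy_generate(WEIGHTS, ["ping"]))                  # no rule in context
    print(toy_generate(WEIGHTS, ["ping => pong", "ping"]))  # rule supplied in context
    print(toy_generate(WEIGHTS, ["ping"]))                  # rule is gone again
```

In that picture the "state that's getting mutated" is the context itself, which has to be supplied again on every call.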


cs702 1160 days ago

I meant shocking in the sense that it makes me gape in awe, but as I wrote, it's also, simultaneously, completely unsurprising given all the new emergent capabilities we keep discovering. We're in agreement :-)


famouswaffles 1159 days ago

Oh, yes, well, that's fair.


mirashii 1160 days ago

> 100% accuracy up to about 13 digit addition

The graphs you just posted do not support that; at most, they'd support 100% accuracy up to 4 digits.


sharemywin 1160 days ago

it's GPT so 13=4


famouswaffles 1160 days ago

It's 100% at 13 digits and extremely close to it prior to that. Maybe "basically 100%" is better.
