Hacker News

by mjburgess 961 days ago

So I was with a financial researcher recently, and he wanted to use ChatGPT to summarise some reference financial data -- and it did so, actually correctly.

Being sceptical, as every person ought to be in these matters, I changed the financial data and performed the same analysis (both in a new tab and within the same convo). The results were the same!

How strange?

Well, because it was reference financial data, ChatGPT was reporting prior reference summaries of it. When that data was changed, it reported the very same reference summaries (which were now wrong).

Since it's incapable of actually summarising financial data. It's only capable of selecting combinations of pieces of its training set.

Now, is this distinction "meaningless" ?

No, it's the difference between this guy being fired for causing a massive loss on a major project; and this guy keeping his job and doing it well.

2 comments

famouswaffles 961 days ago

>Since it's incapable of actually summarising financial data. It's only capable of selecting combinations of pieces of its training set.

Third completely off misconception from you today.

This is not at all what it is doing. "Supercharged Interpolation" is false and makes no sense. It's not a lookup table either. It doesn't memorize enough of what it needs to make your assertion possible.

https://arxiv.org/abs/2110.09485


mjburgess 961 days ago

at 500gb, you can store nearly everything ever written -- let alone compressed.

all statistical learning is a variation on k-nn (see the relevant paper on this) but likewise this is obvious a priori

k-nn is the ideal learner, and a good starting point for analysis

the question for any given system is: what is the learning space, what is the distance function, and how many points are being considered
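For reference, the three ingredients named above (learning space, distance function, number of points considered) can be made explicit in a minimal k-NN regressor. This is a sketch only: the function names and the toy data are made up for illustration, and it averages the neighbours' targets uniformly.

```python
import math

def euclidean(a, b):
    """Distance function: straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3, distance=euclidean):
    """Predict y for `query` as the mean y of its k nearest training points.

    train: list of (x, y) pairs, where x is a tuple of floats (the learning space).
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training points")
    # Rank every training point by its distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    return sum(y for _, y in nearest) / k

# Toy data (made up for illustration): y is exactly 2 * x.
train = [((1.0,), 2.0), ((2.0,), 4.0), ((3.0,), 6.0), ((10.0,), 20.0)]
prediction = knn_predict(train, (2.5,), k=2)
print(prediction)
```

With k=2, the query 2.5 is answered by averaging the targets of its two closest neighbours, 2.0 and 3.0.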

NNs set up a compressed X,y space, in that space choose points via an empirical expectation, and obtain a weighted average as their prediction

That's just what they do -- there isn't any other mechanism here. The whole formal structure of the NN can be written down on a page of paper

your paper above doesn't deal with this -- it's a reply to the 'forced interpolation' view, which i haven't espoused. but often NNs are forced interpolated

'extrapolation' is of course a part of the possible predictive output of a statistical learning system -- in that its latent space is taken to be embedded in R^n and so one can 'veer off' into R.

Whenever you attribute a higher fidelity space to a small latent space you are, in effect, extrapolating


famouswaffles 961 days ago

>at 500gb, you can store nearly everything ever written -- let alone compressed.

No you cannot.

>That's just what they do -- there isn't any other mechanism here.

That's not what they do. There are many papers now showing in-context learning (ICL) carrying out some kind of optimization method during inference, which would not be happening if all they did was retrieval.

I've come to realize you don't know what you're talking about. Your level of denial is scary to see.


mjburgess 961 days ago

just do the calculation yourself: how many books is 500gb at, say, a few bits per character?

more than all ever written -- and so on
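The back-of-envelope calculation proposed above can be sketched as follows. The book length is an assumption chosen for illustration, not a figure from the thread, and "a few bits per character" is tried at 2 bits (heavily compressed) alongside 8 bits (one uncompressed byte per character) because the thread disputes which is realistic.

```python
def books_that_fit(storage_bytes, bits_per_char, chars_per_book):
    """Number of whole books that fit in the given storage."""
    total_bits = storage_bytes * 8
    chars = total_bits // bits_per_char
    return chars // chars_per_book

STORAGE_BYTES = 500 * 10**9   # "500gb", read here as decimal gigabytes
CHARS_PER_BOOK = 500_000      # assumption: rough character count of one book

for bits in (2, 8):
    print(bits, "bits/char ->", books_that_fit(STORAGE_BYTES, bits, CHARS_PER_BOOK), "books")
```

The result scales linearly with each assumption, so it is worth rerunning with your own book length and compression rate and comparing against whatever estimate of the total number of books you trust.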

perhaps apply a single drop of scepticism to this credulity

even, just ask chatgpt to repeat the first paragraph of some book -- say, a dickens novel


famouswaffles 960 days ago

>how many books is 500gb at, say, a few bits per character?

In what reality is a character taking up only a few bits ? Certainly isn't this one.

Your denial is so weird. Why does ICL use Higher-Order Optimization Methods when it's just a lookup table ? https://arxiv.org/abs/2310.17086

How do you explain small language models that have much less space than the text they were trained on even with your nonsensical calculations ?

LLMs are not lookup tables and there's plenty of evidence to support that. You look insane insisting they are.

>even, just ask chatgpt to repeat the first paragraph of some book -- say, a dickens novel

Not only will this not work for the vast majority of books it sees during training, but why only the first paragraph? Are you not insisting they memorize everything? Why can't it repeat the whole page? The whole chapter? The whole book?


lbalazscs 960 days ago

I actually did that, and I asked for the first paragraph from "Great Expectations", and the answer was correct.

Of course the first paragraph of any famous novel is in itself famous, so I also asked for the first paragraph from Chapter XL, from the same novel. Interestingly the first 12 words were correct, but the rest was hallucination.

Original: It was fortunate for me that I had to take precautions to ensure (so far as I could) the safety of my dreaded visitor; for, this thought pressing on me when I awoke, held other thoughts in a confused concourse at a distance.

ChatGPT 3.5: "It was fortunate for me that I had to take precautions to ensure the safety of my treasure, for with such a boy as Compeyson on the loose, I was not likely to find much mercy at his hands. I had not told Herbert about the money, and I was glad to have the chance to count it before he came back. I found that I had still twenty-one sovereigns, six shillings, and sixpence left."


kristiandupont 961 days ago

>Since it's incapable of actually summarising financial data

It's not, though. It is in fact able to summarize financial data, just as it's able to write code and diagnose a medical condition. It makes mistakes, yes, even grave ones, much more so than experts in those fields would.


mjburgess 961 days ago

It isn't making mistakes ... it's never actually doing it.

Do you see a difference between the process of adding numbers and dividing by their count (taking a mean) and emitting numeric tokens which are most probable for a given input?

The former is called "taking a mean"; the latter isn't. This system never engages in any method to summarise financial data. Its method is always the same: to emit the tokens most probable given a set of historical tokens.

It's the difference between saying "the average of 1,2,3" is 2 because that sentence occurs 1,000,000 times and saying it's 2 because you've literally computed it.
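The distinction drawn above can be put side by side in a toy sketch. The frequency table is a deliberate caricature built to mirror the comment's wording, not a claim about how any language model is implemented, and the names and sample data are hypothetical.

```python
from collections import Counter

def computed_mean(numbers):
    """Actually run the algorithm: add the numbers and divide by their count."""
    return sum(numbers) / len(numbers)

def most_frequent_answer(prompt, seen):
    """Return whichever answer followed `prompt` most often in `seen`.

    seen: list of (prompt, answer) pairs standing in for historical text.
    Returns None if the prompt never appeared.
    """
    counts = Counter(answer for p, answer in seen if p == prompt)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Made-up "history": the right answer appears often, a wrong one once.
seen = [("the average of 1,2,3", "2")] * 5 + [("the average of 1,2,3", "3")]

print(computed_mean([1, 2, 3]))
print(most_frequent_answer("the average of 1,2,3", seen))
print(most_frequent_answer("the average of 1,2,4", seen))
```

Both routes agree on the familiar input, but only the computed one has anything to say about an input that never appeared in the history.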

This system does not run financial summary algorithms. It's a trick


sweetgiorni 960 days ago

To add to your point: try asking ChatGPT to do basic arithmetic on numbers it hasn't seen before. You'll see just how good it is at computation.
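One way to set up such a test is to generate random operands long enough that the exact problem is unlikely to appear verbatim in any text, and keep the exact answer for comparison. This is a hypothetical helper, not something from the thread: the prompt wording, the 12-digit default, and the choice of multiplication are all assumptions.

```python
import random

def make_problems(n, digits=12, seed=0):
    """Build n multiplication prompts with exact answers for later comparison."""
    rng = random.Random(seed)  # fixed seed so the same problems can be regenerated
    low, high = 10 ** (digits - 1), 10 ** digits - 1
    problems = []
    for _ in range(n):
        a, b = rng.randint(low, high), rng.randint(low, high)
        # Python integers are arbitrary precision, so a * b is exact.
        problems.append((f"What is {a} * {b}?", a * b))
    return problems

for prompt, answer in make_problems(3):
    print(prompt, "expected:", answer)
```

Each prompt can be pasted into a chat session and the reply checked digit by digit against the stored answer.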


famouswaffles 959 days ago

GPT-4 is better at it than you could manage without an external tool or pad, and that's after being severely hampered by tokenization. https://arxiv.org/abs/2310.02989
