

by mjburgess 961 days ago

at 500gb, you can store nearly everything ever written -- let alone compressed.

all statistical learning is a variation on k-nn (see the relevant paper on this) but likewise this is obvious a priori

k-nn is the ideal learner, and a good starting point for analysis

the question for any given system is: what is the learning space, what is the distance function, and how many points are being considered
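A minimal sketch of those three ingredients, for illustration only: the learning space is a list of stored `(x, y)` points, the distance function is Euclidean (`math.dist`), and `k` is how many points are considered. The function name `knn_predict`, the inverse-distance weighting, and the toy data are assumptions made for this example, not anything specified in the comment.

```python
import math


def knn_predict(train_x, train_y, query, k=3, distance=math.dist):
    """Distance-weighted k-NN regression over stored (x, y) points."""
    # Rank every stored point by its distance to the query.
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: distance(pair[0], query))
    neighbours = ranked[:k]

    # Inverse-distance weights (an assumption of this sketch);
    # the small epsilon avoids dividing by zero on an exact match.
    weights = [1.0 / (distance(x, query) + 1e-9) for x, _ in neighbours]

    # The prediction is a weighted average of the neighbours' targets.
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, neighbours))
    return weighted_sum / sum(weights)


# Toy one-dimensional data, hypothetical.
xs = [(0.0,), (1.0,), (2.0,), (10.0,)]
ys = [0.0, 1.0, 2.0, 10.0]
print(knn_predict(xs, ys, (0.5,), k=2))
```

Swapping the `distance` argument or changing `k` changes the learner without touching the rest, which is the sense in which those are the free choices.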

NNs set up a compressed X,y space, in that space choose points via an empirical expectation, and obtain a weighted average as their prediction

That's just what they do -- there isn't any other mechanism here. The whole formal structure of the NN can be written down on a page of paper

your paper above doesn't deal with this -- it's a reply to the 'forced interpolation' view, which i haven't espoused. but NNs are often forced into interpolation

'extrapolation' is of course a part of the possible predictive output of a statistical learning system -- in that its latent space is taken to be embedded in R^n and so one can 'veer off' into R.

Whenever you attribute a higher fidelity space to a small latent space you are, in effect, extrapolating


famouswaffles 961 days ago

>at 500gb, you can store nearly everything ever written -- let alone compressed.

No you cannot.

>That's just what they do -- there isn't any other mechanism here.

That's not what they do. There are many papers now showing ICL performing some kind of optimization method during inference, which would not be happening if all they did was retrieval.

I've come to realize you don't know what you're talking about. Your level of denial is scary to see.


mjburgess 961 days ago

just do the calculation yourself: how many books is 500gb at, say, a few bits per character?

more than all ever written -- and so on
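For anyone who wants to run that calculation, here is a rough sketch of the arithmetic. The comment only says "500gb" and "a few bits per character"; the decimal reading of 500 GB, the range of bits-per-character values, and the 500,000 characters per book are all assumptions made for this example, and the result depends heavily on them.

```python
def books_that_fit(storage_bytes, bits_per_char, chars_per_book):
    """Number of books that fit in the given storage at the given density."""
    total_bits = storage_bytes * 8
    total_chars = total_bits / bits_per_char
    return total_chars / chars_per_book


STORAGE_BYTES = 500 * 10**9   # 500 GB, decimal gigabytes (assumption)
CHARS_PER_BOOK = 500_000      # hypothetical average book length

# Try a few densities, from heavily compressed to plain 8-bit text.
for bits in (1, 2, 4, 8):
    books = books_that_fit(STORAGE_BYTES, bits, CHARS_PER_BOOK)
    print(f"{bits} bits/char: about {books:,.0f} books")
```

Whether any of those totals exceeds everything ever written depends on an estimate of that corpus, which this sketch does not supply.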

perhaps apply a single drop of scepticism to this credulity

even, just ask chatgpt to repeat the first paragraph of some book -- say, a dickens novel


famouswaffles 960 days ago

>how many books is 500gb at, say, a few bits per character?

In what reality does a character take up only a few bits? Certainly not this one.

Your denial is so weird. Why does ICL use higher-order optimization methods if it's just a lookup table? https://arxiv.org/abs/2310.17086

How do you explain small language models that have much less space than the text they were trained on, even with your nonsensical calculations?

LLMs are not lookup tables and there's plenty of evidence to support that. You look insane insisting they are.

>even, just ask chatgpt to repeat the first paragraph of some book -- say, a dickens novel

Not only will this not work for the vast majority of books it sees during training, but why only the first paragraph? Are you not insisting they memorize everything? Why can't it repeat the whole page? The whole chapter? The whole book?


lbalazscs 960 days ago

I actually did that, and I asked for the first paragraph from "Great Expectations", and the answer was correct.

Of course the first paragraph of any famous novel is in itself famous, so I also asked for the first paragraph from Chapter XL, from the same novel. Interestingly the first 12 words were correct, but the rest was hallucination.

Original: It was fortunate for me that I had to take precautions to ensure (so far as I could) the safety of my dreaded visitor; for, this thought pressing on me when I awoke, held other thoughts in a confused concourse at a distance.

ChatGPT 3.5: "It was fortunate for me that I had to take precautions to ensure the safety of my treasure, for with such a boy as Compeyson on the loose, I was not likely to find much mercy at his hands. I had not told Herbert about the money, and I was glad to have the chance to count it before he came back. I found that I had still twenty-one sovereigns, six shillings, and sixpence left."
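One way to make that kind of comparison repeatable is to count how many leading words of the model's answer match the original. The sketch below is only illustrative: the helper name `matching_prefix_words` and the whitespace tokenization are assumptions, and the count you get depends on how punctuation such as the parenthetical is handled. The two strings are the opening clauses quoted above.

```python
def matching_prefix_words(reference, candidate):
    """Count leading whitespace-separated words that match exactly."""
    count = 0
    for ref_word, cand_word in zip(reference.split(), candidate.split()):
        if ref_word != cand_word:
            break
        count += 1
    return count


original = ("It was fortunate for me that I had to take precautions "
            "to ensure (so far as I could) the safety of my dreaded visitor;")
generated = ("It was fortunate for me that I had to take precautions "
             "to ensure the safety of my treasure,")

print(matching_prefix_words(original, generated))
```

A stricter or looser tokenizer (for example, one that drops parentheticals) would give a different number, so the count is best read as a rough measure of verbatim recall.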
