Hacker News
by inopinatus 784 days ago
Correct. You must perceive them as plausibility engines. The unstated hypothesis is that plausibility of output may converge towards correctness of output with increasing scale and sophistication. This hypothesis remains very far from proven.
2 comments

I don't think it's that hard to understand what the hell is going on with LLMs under the hood. Ultimately it's a weighted sample of the training data. It has a relationship with reality insofar as one exists within the training data. RLHF makes it easier to believe something crazy is happening because the output is being weighted towards something that's believable to us.
Depending on what you mean by "weighted sample", that's either trivially true (the network is of course a function of its training data and nothing else) or trivially false (the network generalizes over the training data and has not memorized it). It is not a good intuition pump for why an LLM can hold up one end of a conversation or follow novel instructions: it is not reading from a script, nor regurgitating chopped-up pieces of text like a Markov chain. It is doing something very clever in a way that is not obvious.
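For contrast, here is roughly what "regurgitating chopped-up pieces of text like a Markov chain" looks like when taken literally. This is only a minimal word-level sketch in Python; the function names and the toy corpus are made up for illustration and are not from any particular library.

```python
import random
from collections import defaultdict


def build_bigram_chain(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        # Repeated followers are kept, so frequent pairs get sampled more often.
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=0):
    """Walk the chain, picking each next word from the observed followers."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break  # the last word never had a successor in the corpus
        out.append(rng.choice(followers))
    return out


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_bigram_chain(corpus)
print(" ".join(generate(chain, "the", 8)))
```

Every adjacent word pair this emits appeared verbatim in the corpus, so it really is a weighted sample of its training data and nothing more. It has no way to follow an instruction it has not already seen spelled out.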

> It has a relationship with reality insofar as one exists within the training data

This is true of anything that learns.

> this is true of anything that learns

Sure, but most things that learn have actual reality as a training set. LLMs have human-curated data, which isn’t and can’t be perfectly representative of reality.

Couldn't have said it better myself.
I also think you get the best results when you use them with this in mind; any other way of using them seems to end in disappointment.