| HN Mirror



	by inopinatus 784 days ago
	Correct. You must perceive them as plausibility engines. The unstated hypothesis is that plausibility of output may converge towards correctness of output with increasing scale and sophistication. This hypothesis remains very far from proven.


__loam 784 days ago

I don't think it's that hard to understand what the hell is going on with LLMs under the hood. Ultimately it's a weighted sample of the training data. It has a relationship with reality insofar as one exists within the training data. RLHF makes it easier to believe something crazy is happening because the output is being weighted towards something that's believable to us.


dTal 784 days ago

Depending on what you mean by "weighted sample", that's either trivially true (the network is of course a function of its training data and nothing else) or trivially false (the network generalizes over the training data and has not memorized it). It is not a good intuition pump for why an LLM can hold up one end of a conversation or follow novel instructions: it is not reading from a script, nor regurgitating chopped-up pieces of text like a Markov chain. It is doing something very clever in a way that is not obvious.

> It has a relationship with reality insofar as one exists within the training data

This is true of anything that learns.
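
For contrast, here is roughly what "regurgitating chopped-up pieces of text like a Markov chain" looks like in practice. This is only a minimal sketch: the toy corpus, the function names `build_chain` and `generate`, and the word-level order-1 setup are all assumptions made for illustration, not anything from the thread. It really is a weighted sample of its training data, and it can only ever emit word pairs it has already seen.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the words that followed it in `text`."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        # Repeated followers are kept, so sampling is weighted by frequency.
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, start, length, seed=None):
    """Walk the chain from `start`, sampling each next word from observed followers."""
    rng = random.Random(seed)
    state = tuple(start)
    out = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:  # dead end: this state was never followed by anything
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return out


# Hypothetical toy corpus, chosen only to keep the example small.
corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(corpus)
sample = generate(chain, ["the"], 8, seed=0)
print(" ".join(sample))
```

Every adjacent pair of words in `sample` already appears somewhere in `corpus`; the chain has no way to produce a transition it has not observed, which is the sense in which it does not generalize.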


dartos 783 days ago

> this is true of anything that learns

Sure, but most things that learn have actual reality as a training set. LLMs have human-curated data, which isn’t and can’t be perfectly representative of reality.


__loam 783 days ago

Couldn't have said it better myself.


bamboozled 783 days ago

I also think you get the best results when you think of them this way; any other way of using them seems to end in disappointment.
