| HN Mirror

	by sk11001 918 days ago
	This isn't how LLMs work - they're trained to be helpful and give correct responses. You could argue to what extent that's achieved but the point is that it's not a weighted average of the internet, it's been fine-tuned towards correctness and helpfulness.

2 comments

BillyTheMage 918 days ago

I thought one of the biggest arguments against them was the fact that they inherently aren't good at being correct? They just produce something that looks like a human wrote it real good.


sk11001 918 days ago

> They just produce something that looks like a human wrote it real good.

That's what the base model does. To get an LLM assistant there are additional training phases to make it conversational, helpful and more correct - https://www.youtube.com/watch?v=bZQun8Y4L2A


yawpitch 918 days ago

Sure… and so far there always appears to be a way of breaking that fine-tuning; see the recent paper on training data extraction I linked in another comment below.


sk11001 918 days ago

There's a big difference between being breakable and being representative of the web content used for training, like you claimed earlier.


yawpitch 918 days ago

Again, see the link… get it to repeat the same word and it will give you back its raw training data. We’re still discovering the (potentially limitless) ways these things can be tricked into regurgitating what they were trained on; it’s entirely possible there’s no way to stop them doing so.
