	by kewp 1202 days ago
	Are these LLMs just answering the question "if you found this text on the internet (the prompt), what would most likely follow?"

4 comments

Enginerrrd 1202 days ago

In essence, yes, I think. But isn't that not much different from what I'm doing in writing this comment?

sebzim4500 1202 days ago

That's how they are trained initially, but the resulting model isn't all that useful (it was SOTA two years ago, but this field moves fast).

A lot of the utility comes from the later finetuning. You can see this in the examples from the article: every question that GPT-3 (which is the unfinetuned version) gets wrong is answered correctly by ChatGPT, which has gone through an extensive finetuning process called RLHF (reinforcement learning from human feedback).

colechristensen 1202 days ago

Yes. To simplify, they are trained to complete sentences. You can then use the resulting model to do lots of things.

How you train a model and the inference jobs it can do don't necessarily have to be the same.
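
A toy sketch of that split, assuming a bigram word-count model as a stand-in for an LLM (real models are neural networks; the corpus and the names `train_bigram`, `complete`, and `score` are made up for illustration). It is trained only to predict the next word, then used for two different inference jobs: completing a prompt and scoring how plausible a sentence is.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Training: count, for each word, which words follow it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def complete(counts, prompt, max_words=5):
    """Inference job 1: extend a non-empty prompt with the most frequent next word."""
    words = prompt.split()
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # this word was never seen with a follower
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

def score(counts, sentence):
    """Inference job 2: product of next-word probabilities for a sentence."""
    words = sentence.split()
    p = 1.0
    for prev, nxt in zip(words, words[1:]):
        total = sum(counts[prev].values())
        p *= counts[prev][nxt] / total if total else 0.0
    return p

# Hypothetical training text.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
counts = train_bigram(corpus)

print(complete(counts, "the dog", max_words=3))
print(score(counts, "the cat sat"), score(counts, "cat the sat"))
```

The same counts drive both functions: nothing about the training step says whether the model will later be asked to generate text or to rank it.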

astrange 1202 days ago

That's how the text decoder works, but the model gets to define "most likely", and an RLHF model uses this to make the text decoder produce useful answers instead.
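
A rough illustration of that point, assuming a greedy decoder and two lookup tables standing in for a base model and a finetuned one. All probabilities and the names `BASE`, `TUNED`, and `greedy_decode` are invented for the example; real RLHF changes a neural network's weights, not a table. The decoder code is identical in both runs, only the definition of "most likely" differs.

```python
def greedy_decode(next_token_probs, prompt, max_tokens=10, stop="<end>"):
    """Repeatedly append whichever token the model rates most likely."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        dist = next_token_probs(tokens)   # dict: token -> probability
        best = max(dist, key=dist.get)
        if best == stop:
            break
        tokens.append(best)
    return tokens

# Made-up "base" model: on internet text, a question is often
# followed by another question.
BASE = {
    "France?": {"And": 0.5, "Paris.": 0.4, "<end>": 0.1},
    "And": {"Spain?": 0.9, "<end>": 0.1},
    "Spain?": {"<end>": 1.0},
    "Paris.": {"<end>": 1.0},
}

# Made-up "finetuned" model: same vocabulary, but probability mass
# has been shifted toward answering the question.
TUNED = dict(BASE)
TUNED["France?"] = {"And": 0.1, "Paris.": 0.8, "<end>": 0.1}

def base_model(tokens):
    return BASE[tokens[-1]]

def tuned_model(tokens):
    return TUNED[tokens[-1]]

prompt = ["Capital", "of", "France?"]
base_out = greedy_decode(base_model, prompt)
tuned_out = greedy_decode(tuned_model, prompt)
print(" ".join(base_out))
print(" ".join(tuned_out))
```

With these invented numbers, the base table continues the prompt with another question, while the tuned table produces an answer.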
