HN Mirror



	by hallqv 1062 days ago
	RLHF is not part of LLaMA pretraining, or of the pretraining of any other model for that matter. RLHF comes after pretraining. https://twitter.com/Jeande_d/status/1661833563069620247/phot...

2 comments

jsmith45 1062 days ago

Seems like a classic case of a term of art overlapping with normal English terminology.

Knowing that you will be doing further training on a provided model (even "just" extensive fine-tuning), one would want to distinguish the training done before you get your hands on it from the training you do yourself. An obvious word for that earlier training is pre-training, which unfortunately conflicts with a term of art.


jawerty 1062 days ago

I see, that’s my misunderstanding. I was grouping all training as pretraining.


wilhelm____ 1061 days ago

Pre-training develops the language model's base understanding of conditional word probabilities, i.e. how likely each next word is given the words before it.
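To make "conditional word probabilities" concrete, here is a toy sketch that estimates P(next word | previous word) from bigram counts. It is only an illustration: real pre-training fits a neural network over much longer contexts, and the function name `bigram_probs` and the tiny corpus are made up for this example.

```python
from collections import Counter, defaultdict


def bigram_probs(tokens):
    """Estimate P(next | prev) from raw bigram counts (hypothetical helper)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    # Normalize each row of counts into a probability distribution.
    return {
        prev: {word: c / sum(nxts.values()) for word, c in nxts.items()}
        for prev, nxts in counts.items()
    }


# Toy corpus, assumed purely for illustration.
corpus = "the cat sat on the mat the cat ran".split()
probs = bigram_probs(corpus)

# Distribution over the word that follows "the".
print(probs["the"])
```

Each entry of `probs` is a distribution over the next word given the previous one, which is the quantity a pre-trained model learns to predict, just with far more context.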

SFT and RLHF then attempt to further guide the model in terms of steerability and alignment of its output.

In fact, the InstructGPT authors were worried about losing the pre-trained model's underlying probability distribution, so they tried a version that penalizes the model for deviating too far from the original distribution (using a KL divergence term). I don't remember them seeing a significant difference in performance.
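A rough sketch of what such a KL penalty looks like, assuming the general shape "reward minus beta times KL(policy || reference)". The names `kl_divergence` and `penalized_reward`, the coefficient `beta`, and the three-token distributions are all hypothetical; this is not the paper's exact setup, just the idea of paying a cost for drifting away from the reference distribution.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def penalized_reward(reward, policy_probs, ref_probs, beta=0.1):
    """Subtract a KL penalty from the raw reward (beta is an assumed value)."""
    return reward - beta * kl_divergence(policy_probs, ref_probs)


# Made-up next-token distributions over a three-token vocabulary.
ref = [0.5, 0.3, 0.2]        # reference (original) model
same = [0.5, 0.3, 0.2]       # policy that has not moved
drifted = [0.9, 0.05, 0.05]  # policy that has drifted far from the reference

print(penalized_reward(1.0, same, ref))     # no drift, so no penalty
print(penalized_reward(1.0, drifted, ref))  # drift lowers the effective reward
```

With the same raw reward, the drifted policy ends up with a lower effective reward, so the optimizer is pushed to stay close to the original distribution unless the reward gain is worth it.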
