Seems like a classic case of a term of art overlapping with normal English terminology.
Knowing that you will be doing further training on a provided model (even "just" extensive fine-tuning), one would want to distinguish the training done before you get your hands on it from the training you do yourself. An obvious word for that previous training is pre-training, which unfortunately conflicts with a term of art.
Pre-training develops the language model's base understanding of conditional word probabilities.
SFT and RLHF attempt to further guide the model in terms of steerability and alignment of its output.
In fact, the InstructGPT authors were worried about losing the pre-trained model's underlying probability distribution, so they tried a version that penalizes the model for deviating too far from the original distribution (using a KL-divergence penalty). I don't remember them seeing a significant difference in performance.
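To make the KL idea concrete, here is a toy sketch of a KL-penalized reward for a single next-token distribution. It is not the InstructGPT code: the function names, the `beta` coefficient, and the logits are all made up for illustration, and real RLHF setups usually estimate the penalty from the log-probabilities of sampled tokens rather than from full distributions.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q) = sum_i p_i * log(p_i / q_i); terms with p_i == 0 contribute 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def penalized_reward(reward, policy_logits, reference_logits, beta=0.1):
    # beta is a hypothetical penalty weight, not a value taken from any paper.
    policy = softmax(policy_logits)        # fine-tuned model's distribution
    reference = softmax(reference_logits)  # original model's distribution
    return reward - beta * kl_divergence(policy, reference)

# Same distribution: no penalty, the reward passes through unchanged.
unchanged = penalized_reward(1.0, [1.0, 2.0, 3.0], [1.0, 2.0, 3.0])

# Policy has drifted away from the reference: the reward is reduced.
drifted = penalized_reward(1.0, [3.0, 0.0, 0.0], [1.0, 2.0, 3.0])
```

The larger `beta` is, the more the fine-tuned model is pulled back toward the distribution it started with, at the cost of how far the reward signal can move it.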