by runtime_blues
1126 days ago
Early GPTs were fairly bad at following instructions. The innovation was RLHF, where human raters (Mechanical Turk style) were asked to evaluate how well the LLM followed instructions stated as part of the prompt, often in this style. Countless such ratings were incorporated into the training process itself. So it did not happen out of the blue, and you didn't need a whole lot of existing webpages involving this sort of role play.