|
|
|
|
|
by clolege
1286 days ago
|
|
I’m still confused by just how good its responses and writing style are. I understand that it was trained on a large data set, but I feel like some training samples must have been weighted more heavily than others. Did the training data incorporate how popular (e.g. likes or upvotes) each sample was as a proxy for quality? Or can you achieve this performance just by looking at averages on a large enough data set? |
|
See also https://en.wikipedia.org/wiki/Attention_schema_theory (unrelated to the notion of "attention" used in ML)