by retrac
1232 days ago
Actively "censoring" the AI is fundamental to how these language models are created. Such feedback is part of how the model learns. In a certain light, every response that a human marks as dispreferred during training censors the AI: the model will produce those kinds of results less often, and end users will encounter them less frequently. With ChatGPT, the criteria it was judged on included how relevant the answers were to the question and whether they were factually correct (incorrect answers were penalized), and not being blatantly offensive was obviously one of the criteria, too. What would a model that wasn't censored in training even look like? (I believe ChatGPT also has a more traditional expert system placed between the user and the language model, which flags keywords and other programmed-in patterns. That is more literally censoring the language model. But the issue described above would still exist even without such a system.)
|
It could cite statistics without long-winded disclaimers. Or be able to cite them at all.