by krunck
43 days ago
> “The push to make these language models behave in a more friendly manner leads to a reduction in their ability to tell hard truths and especially to push back when users have wrong ideas of what the truth might be,” said Lujain Ibrahim at the Oxford Internet Institute, the first author on the study.

People aren't much different. When society pressures people to be "more friendly" (e.g. "less toxic"), they lose their ability to tell hard truths and to call out those who hold erroneous views. This behaviour is expressed in language online. Thus it is expressed in LLMs. Why does this surprise us?