> limiting bias by RLHF rather than picking the right datasets
This is the same as curating and picking the dataset, except done as post-processing. The reason RLHF has to happen (and traumatize the people doing it <https://www.bigtechnology.com/p/he-helped-train-chatgpt-it-t...>) is to address those problems by censoring the model.
Is it though? If you wanted to teach humans in a way that keeps them from developing unfortunate beliefs, would it be a good approach to just keep them from reading material that you find objectionable?
If you read a book that you disagree with, or one that contains falsehoods and bad reasoning as far as you can tell, would that make you believe those things?
The word "trauma" is getting overused. The idea of someone being traumatized by reading fictional text is just silly. It's unpleasant or gross at worst unless you already have other issues.