Hacker News
by cratermoon 1103 days ago
> limiting bias by RLHF rather than picking the right datasets

This is the same as curating and picking the dataset, except done as post-processing. RLHF has to happen (and traumatize the people involved <https://www.bigtechnology.com/p/he-helped-train-chatgpt-it-t...>) in order to address the model's problems by censoring it.

2 comments

Is it though? If you wanted to teach humans so that they don't develop unfortunate beliefs, would it be a good approach to just keep them from reading material that you find objectionable?

If you read a book that you disagree with, or one that contains falsehoods and bad reasoning as far as you can tell, would that make you believe those things?

A reminder that LLM transformers aren't humans; they don't learn the way humans learn.
The word "trauma" is getting overused. The idea of someone being traumatized by reading fictional text is just silly. At worst it's unpleasant or gross, unless you already have other issues.