by cma
424 days ago
I think this is definitely not true of catastrophic forgetting from finetuning. And with other related types of forgetting from model abliteration, there are often extreme increases in hallucination. The InstructGPT paper also showed that RLHF made hallucination worse (though with more user data rejecting common hallucinations, instruction tuning and RLHF may lower the specific hallucinations that users reject). Some mention of that here:
https://huyenchip.com/2023/05/02/rlhf.html#rlhf_and_hallucin...