by IvanAchlaqullah
782 days ago
> TruthfulQA

Wait, people still use this benchmark? I hear there's a huge flaw in it. For example, fine-tuning a model on 4chan makes it score better on TruthfulQA. It becomes very offensive afterwards though, for obvious reasons. See GPT-4chan [1].

[1] https://www.youtube.com/watch?v=efPrtcLdcdM