Hacker News
by geraltofrivia 962 days ago
That's not _completely_ true. There are a bunch of datasets (each specific to some cultural/linguistic context) like CrowS-Pairs [1] and StereoSet [2]. There is also a lot of work you can do to make sure that the model's predictions are fair [3]. But in the end, yes, these datasets don't exist at the scale of the training sets of these LLMs. Hence red-teaming and RLHF post-convergence.

PS: Yes, I know CrowS-Pairs is a dataset with a bunch of flaws. My SO is working in a team of 10+ linguists and researchers to develop a multilingual, generalized version of it that also addresses several of its problems. Unpublished work, for now.

[1] https://github.com/nyu-mll/crows-pairs/
[2] https://arxiv.org/abs/2004.09456
[3] https://arxiv.org/pdf/2204.09591.pdf