by simianwords
312 days ago
I don't think this is correct. Such training data is usually created at the SFT stage, after unsupervised learning on all available data on the web. The SFT dataset is manually curated, meaning there would be a conscious effort to include more training samples that say "I'm not sure". The same goes for RLHF.