by rkagerer
306 days ago
They were all trained from the internet. Anecdotally, people are jerks on the internet more so than in person. That's not to say there aren't warm, empathetic places on the 'net. But on the whole, I think the anonymity and the lack of visual and social cues that would ordinarily arise in an interactive context don't make our best traits shine.
|
Even Reddit comments have far more reality-focused material on the whole than they do shitposting and rudeness. I don't think any of these big models were trained at all on 4chan, YouTube comments, Instagram comments, Twitter, etc. Or even Wikipedia Talk pages. It just wouldn't add anything useful to train on that garbage.
On the other hand, most Stack Overflow pages are objective overall, and to the extent there are suboptimal things, there is eventually a person explaining why a given answer is suboptimal. So I accept that some UGC went into the model, and that there's a reason to do so, but I don't believe it's as broad as "The Internet" being represented there.