Hacker News
by visarga 1107 days ago
Do we know for sure which LLMs were trained on Reddit comments? I want to know if my comment history is in the corpus.
1 comment

Yes, absolutely. Sam Altman has said so publicly, although he specifically said that social media wasn't of any particular importance as training data.

You can also see this when you mention davidjl, a user who was really into r/counting. There was a thread about that yesterday, I believe.

OpenAI thanks you for your Reddit contributions.