LLMs were invented at least five years ago (BERT), though you could make the case for a few years earlier. My guess is that the majority of Reddit users have joined since then, not 0.1%?
Your guess is that the majority of Reddit users have joined since 2018? 1) I do not think that is correct, 2) the mere existence of LLMs isn't the same as public awareness of how LLMs are trained, and 3) you know exactly what I'm saying and that 99.9% might be slight hyperbole.
> Your guess is that the majority of Reddit users have joined since 2018?
It's not really important to the debate around unlicensed use of copyrighted works to train AI models, but it wouldn't surprise me at all if the majority of Reddit users have joined since 2018. It's tough to get reliable active user counts, but they seem to have risen substantially over the past five years [1].
It also wouldn't surprise me if the majority of Reddit users did indeed join before 2018, but at the very least those who joined after 2018 would be a very substantial minority.
[1] https://www.kaggle.com/datasets/ehallmar/reddit-comment-scor...