
I suspect it's overblown today. Hopefully it'll be overblown indefinitely.

However, if AIs become as successful as Nvidia's stock price implies, it could indeed become difficult to find text that is guaranteed not to be AI-generated. It is conceivable that in 20 years it will be very difficult to assemble a training set at any scale that isn't 90% already touched by AIs.

Of course, it's also conceivable that in 20 years we'll have AIs that don't need the equivalent of millennia of training to reach their full potential. The problem is much more tractable if one merely needs to produce megabytes of training data, rather than many gigabytes, to obtain a decent understanding of English.