Hacker News
by macintux 17 days ago
> The ones that were still under copyright are a different matter.

Given the sheer volume of information posted to the Internet in the last 40-50 years, I'd wager that still-copyrighted material covers 80% or more of the relevant input data.

Old text is relatively scarce in the grand scheme of things.

But I have no real clue, just spitballing.