by vinnymac 1523 days ago
In other words, the most valuable dataset (the majority of the data) is not yet searchable.

I expect post-2020 is the majority of the data given Reddit's exponential growth. But definitely pre-2020 is more valuable.
Any data to suggest that the past 2 years have generated the majority of Reddit's data? I've been a user for a decade and it doesn't feel like this is the case to me.