by simonw 6011 days ago
10 or 20 years ago there wasn't nearly as much readily available data to mine. Today even moderately high-traffic sites generate gigabytes of log files a day, not to mention the enormous quantity of high-value data available through various APIs.
1 comment

You don't actually need all of the traffic to draw meaningful conclusions. Tracking a statistically sound random sample of user sessions provides most of the benefit for pattern analysis.
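
For what it's worth, here is a rough sketch of one way to do that kind of session sampling: hash the session ID and keep the session only if the hash falls under a threshold. This is just an illustration under my own assumptions, not a description of any particular tool. The names `in_sample` and `sample_log`, the 5% rate, and the log format (session ID as the first space-separated field) are all hypothetical.

```python
import hashlib


def in_sample(session_id: str, rate: float = 0.05) -> bool:
    """Decide deterministically whether a session belongs to the sample.

    Hashing the ID (instead of calling random()) means every request from
    the same session gets the same answer, so sessions stay intact.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Treat the first 8 bytes of the hash as an integer in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    # Keep the session if its bucket falls in the lowest `rate` fraction.
    return bucket < int(rate * 2**64)


def sample_log(lines, rate: float = 0.05):
    """Yield only the log lines that belong to sampled sessions.

    Assumes (hypothetically) that each line starts with the session ID
    followed by a space.
    """
    for line in lines:
        session_id = line.split(" ", 1)[0]
        if in_sample(session_id, rate):
            yield line


if __name__ == "__main__":
    log = [
        "sess-a GET /home",
        "sess-b GET /home",
        "sess-a GET /pricing",
        "sess-c GET /docs",
    ]
    for kept in sample_log(log, rate=0.5):
        print(kept)
```

Sampling per session rather than per request is the important part: dropping random individual requests would break up click paths, which is exactly what pattern analysis needs. Whether a given rate is statistically sound depends on your traffic volume and on how rare the patterns you care about are.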