by jorgemf 2861 days ago
How big are your datasets?
1 comment

A few petabytes in some cases. At that scale, testing the models requires some fairly advanced balanced sampling in Spark.
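
For anyone wondering what balanced sampling might look like in practice, here is a rough sketch. The comment doesn't say how it is actually done, so everything here is an assumption: the helper `balanced_fractions`, the class counts, and the `label` column name are all hypothetical. The idea is to compute a per-class sampling fraction so that each class contributes roughly the same number of rows, then hand those fractions to Spark's stratified sampler.

```python
def balanced_fractions(label_counts, per_class_target):
    """Return a sampling fraction per class so that each class
    yields roughly `per_class_target` rows (capped at 1.0)."""
    return {
        label: min(1.0, per_class_target / count)
        for label, count in label_counts.items()
        if count > 0
    }


# Hypothetical class counts; in Spark these could come from
# something like df.groupBy("label").count().
label_counts = {"negative": 9_000_000, "positive": 100_000}
fractions = balanced_fractions(label_counts, per_class_target=50_000)

# With PySpark installed, the fractions could then be passed to the
# stratified sampler (assumes a DataFrame `df` with a "label" column):
#   sample = df.sampleBy("label", fractions=fractions, seed=42)
```

Note that `sampleBy` draws each row independently, so the per-class sizes come out approximately equal, not exactly. At petabyte scale the counting pass itself is expensive, so a real pipeline would likely cache or estimate the counts.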