Hacker News new | ask | show | jobs
by riotnrrd 2116 days ago
The purpose of that step is exactly the opposite: to use data that represents (as much as possible) the full generality of the space you seek to model. And on a more practical level, there are datasets that can't be used for commercial purposes but are useful for research (a well known but not the best example would be ImageNet).