Hacker News
by nshm 1779 days ago
Data costs are plunging these days thanks to self-supervised and semi-supervised learning. You don't need clean, annotated data anymore, and unannotated data is abundant. Projects like VoxPopuli or GigaSpeech, with 400 thousand hours of data (100 times more than Mozilla's), are easily available.