Hacker News
by __sy__ 2173 days ago
To second your comment, I think non-ML folks don't understand how much of an impact dataset curation can have on model performance. More high-quality data will outshine a clever network architecture trained on less data. I've seen it again and again. But the thing is, it's really hard to curate your data once the dataset has a lot of "dimensionality" to it (sorry, couldn't think of a better word...). To be honest, if I had to pick the area of dev tooling I'm most excited about over the next 5 years, this is probably it.

Btw, for anyone interested, here's a good, quick talk by Andrej Karpathy on what it will take to build the next software stack: https://www.youtube.com/watch?v=y57wwucbXR8&t=3s