Hacker News
by sanchezdev 2404 days ago
I didn't mean to make it sound incidental, although I do see your point. I just wanted to chime in on how important a labeled dataset is for a successful ML project.

I think the point is that labeling itself is very difficult except in special, limited domains. Manually constructed labels, like feature engineering, are not robust and do not advance the field in general.
That makes sense. I'm coming from the angle of applied ML, where solutions need to solve a business problem rather than advance the field of ML. In consulting, many problems can't be solved well without a labeled dataset, and in lieu of one, less credible data scientists will claim they can solve them in an unsupervised manner.
For sure. There are counterexamples, however: fully unsupervised machine translation for resource-poor languages comes to mind, and it is increasingly finding business applications.

I think that in the future, increasingly clever unsupervised approaches will be the path to huge AI advances. We've essentially run out of labeled data for a large variety of tasks.