by paganel
3259 days ago
> The quality of the algo, and I assume the deep learning model, lies in the quality (breadth and depth) of the data, and how honest with himself the person chooses to be when modeling it.

I've only dabbled with machine learning here and there for the past 10 years or so, but if there's one thing I've learned so far, it's that the data behind your ML code (and the way it is structured) is responsible for almost all of the success or failure of any given ML algorithm. I have a younger colleague at work whom I've started tutoring, and he seems really interested in doing ML work (maybe because of all the recent hype). I've tried to emphasize to him several times that ML algorithms come and go and that he should focus a lot of his time on the data itself (where does he intend to collect it from? how is it structured? is it reliable? is it "enough"? etc.), but it looks like my data-related advice falls on deaf ears every time; he's only interested in me pointing him to the latest cool ML algorithm. I guess he'll live and learn, so to speak.

Data is indeed a necessary condition, but certainly not a sufficient one. You need a good marriage of feature engineering and data to get a good success rate. Learning curves [0] are a good way to tell whether your ML algorithm needs more data or better feature engineering.
[0] http://mlwiki.org/index.php/Learning_Curves
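
To make the learning-curve point concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the synthetic data (y = x^2 plus noise), the one-feature linear model, the helper names (`fit_line`, `learning_curve`, `diagnose`), and the `target_error` and `gap_ratio` thresholds. The idea is to train on growing subsets and compare training error with validation error: a large gap suggests more data may help, while two errors that are close but both too high suggest the features (or the model) are the problem.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y  # degenerate case: predict the mean
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return a, mean_y - a * mean_x

def mse(model, xs, ys):
    """Mean squared error of the fitted line on (xs, ys)."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def learning_curve(train, val, sizes):
    """Return (size, train_error, val_error) for growing training subsets."""
    val_x, val_y = zip(*val)
    rows = []
    for n in sizes:
        xs, ys = zip(*train[:n])
        model = fit_line(xs, ys)
        rows.append((n, mse(model, xs, ys), mse(model, val_x, val_y)))
    return rows

def diagnose(rows, target_error, gap_ratio=0.5):
    """Rough heuristic on the last point of the curve (thresholds are assumptions)."""
    _, train_err, val_err = rows[-1]
    if val_err <= target_error:
        return "good enough"
    if val_err - train_err > gap_ratio * val_err:
        return "more data may help"  # big train/validation gap
    return "better features or model may help"  # both errors high and close

# Synthetic data: the target depends on x squared, plus a little noise.
rng = random.Random(0)
def sample(n):
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, x * x + rng.gauss(0.0, 0.05)) for x in xs]

train, val = sample(200), sample(200)
sizes = [10, 25, 50, 100, 200]

# Raw feature x: a straight line cannot fit a parabola, however much data it sees.
raw_rows = learning_curve(train, val, sizes)

# Engineered feature x^2: same data, same model, different input representation.
square = lambda data: [(x * x, y) for x, y in data]
eng_rows = learning_curve(square(train), square(val), sizes)

for name, rows in (("raw x", raw_rows), ("x squared", eng_rows)):
    print(name, diagnose(rows, target_error=0.01))
    for n, train_err, val_err in rows:
        print(f"  n={n:3d}  train={train_err:.4f}  val={val_err:.4f}")
```

With the raw feature, both errors should flatten out at a similar, too-high level, which is the case where collecting more rows would not help. Feeding the same rows through the squared feature is what brings the validation error down, which is the "marriage" of data and feature engineering in miniature.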