by litzer
3236 days ago
As somebody who has recently started learning more about ML, a lot of an ML engineer's work does seem automatable (not research or pushing boundaries, just applying ML to some product need). For example, choosing hyperparameters, evaluating which features to collect, etc. seem like things that can be automated with very little human input. His slide on "learning to learn" has the goal of removing the ML expert from the equation. Can somebody who's more of an expert in the field comment on how plausible that is? Specifically, in the near future, will we only need ML people who do research, because the application side becomes trivial once automated?
ML works very well in bounded/closed domains like image and sound recognition. Open domains are much more challenging.
Building predictive models from data in specialized domains often requires insight, which machines cannot provide. For instance, let's say you collect a bunch of data and are trying to predict sales. You need to apply domain knowledge, experience and intuition to know which variables are causal and which are merely correlated. If you just throw all the variables into the mix and build a model from that, you will end up with a model that overfits badly.
There are automated "variable selection" techniques that can help prevent overfitting, but they are generally imperfect because machines can only detect correlation, not causation. Also, many regression/classification techniques are easily fooled by noise and highly nonlinear relationships. A few years ago we did some work comparing a predictive model built from a ton of sensor data (with automated variable selection) against a parsimonious one built only on select data that we knew accounted for 80% of the effect. The latter model was far superior. Noise/non-causal variables often don't just "wash out", even with very good variable selection algorithms.
It takes domain knowledge to figure out what variables matter and what variables don't.
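To make the overfitting point concrete, here is a rough sketch on synthetic data (not the sensor data mentioned above). Everything in it is an assumption for illustration: the two "causal" columns, the 50 pure-noise columns, the sample sizes, and the crude correlation-based screening that stands in for an automated variable selection step. It fits ordinary least squares three ways and compares error on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, n_noise=50):
    # Hypothetical setup: columns 0 and 1 drive the target,
    # the remaining n_noise columns are unrelated to it.
    X = rng.normal(size=(n, 2 + n_noise))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n)
    return X, y

def fit_ols(X, y):
    # Ordinary least squares with an intercept column.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mse(coef, X, y):
    # Mean squared prediction error of a fitted model on (X, y).
    A = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((A @ coef - y) ** 2))

X_train, y_train = make_data(60)
X_test, y_test = make_data(1000)

# "Throw everything in" model: all 52 columns, only 60 training rows.
full = fit_ols(X_train, y_train)
mse_full = mse(full, X_test, y_test)

# Parsimonious model: only the columns domain knowledge says matter.
causal = [0, 1]
small = fit_ols(X_train[:, causal], y_train)
mse_small = mse(small, X_test[:, causal], y_test)

# Crude automated selection: keep the 10 columns most correlated with y.
corr = np.abs([np.corrcoef(X_train[:, j], y_train)[0, 1]
               for j in range(X_train.shape[1])])
picked = np.argsort(corr)[-10:]
screened = fit_ols(X_train[:, picked], y_train)
mse_screened = mse(screened, X_test[:, picked], y_test)

print(f"all variables:      test MSE = {mse_full:.2f}")
print(f"screened (top 10):  test MSE = {mse_screened:.2f}")
print(f"causal only:        test MSE = {mse_small:.2f}")
```

With this few training rows the all-variables fit should do far worse out of sample than the two-variable one. The screened model is there to show that selecting by correlation can still let noise columns through, though how much that hurts depends on the data and the selection method.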