Hacker News
by mark_l_watson 2672 days ago
I manage a machine learning team for a large financial services company and AutoML tools, Microsoft’s NNI included, are on our radar.

I think the `future of work` for machine learning practitioners will quickly separate into two groups: a very small, elite group that performs research, and a much larger group that uses AutoML but whose jobs also deal more with data preparation (which also gets automated) and ML devops, supporting models in production.

3 comments

This sounds like parody to me. There are so many problems in applied statistics, and neural networks are not helpful for most of them. Consider Bayesian analysis for very small data sets as an example (just the tip of the iceberg).

In financial services in particular, there are tons of time series and regression problems on small data such that a neural network (beyond perhaps some super small MLP) would be a ridiculous thing to try.

I think the breakdown of workload you described will only happen in business departments where there is a need for large scale embedding models, enhanced multi-modal search indices, computer vision and natural language applications, and maybe a handful of things that eventually productize reinforcement learning. I could also see this happening in businesses that can benefit from synthetically generated content, like stock photography, essays / news summaries / some fiction, website generators, probably more.

What I described above is a tiny drop in the ocean of applied statistics problems that businesses have to solve.

It's another example of the FAANG + Bay Area Startups world versus the other 99% of Corporate America. In the latter world, most of the "machine learning" in production is traditional stuff like Random Forest, SVM, and more recently Gradient Boosting. Hell, Marketing departments across the country are still running old school decision tree (CART and CHAID) models and logistic regression models written in SAS 20+ years ago. DL/NN is a minuscule proportion of production ML in the enterprise space.
I think there is good reason that "old" machine learning models are more popular than DNNs in the enterprise space. Most of the data is in tabular format. What is more, "old" and simple decision trees or linear models are very easy to understand and deploy, and they are fast. There is certainly a clear advantage to having even a simple decision tree implemented in the system over making decisions at random.
The main reason though is that these other methods outperform neural nets in tons of different situations. Even just from an accuracy / business success metric point of view, many problems are just better solved with other classes of models, domain-specific feature engineering, etc. It will probably remain so for many decades at least.
DNNs make good features though, especially if you have time series data or lots of text.

I agree that the final model should be a randomforest/xgboost/lightgbm for typical tabular data.

I meant that extracting an intermediate layer as a feature embedding and then sticking a classical model on top of it performs worse than curating features through domain-specific expert tuning, for a ton of diverse application domains.
Deep Learning also works on very small data sets by means of embeddings. A large model trained on large data sets can be used as a feature extraction tool when training on small data sets.
Re-using an existing model to generate embeddings doesn’t work well for auxiliary tasks with very small data. Even if you do no fine-tuning at all, you need to have big data sets in terms of the auxiliary task too.

For example, consider needing to train hundreds of unique small models every day, based on new customer inputs affecting causality effects for that day (I had to do this for ad forecasting in a past job).

Generating embeddings via pre-trained models essentially produced gibberish and performed far worse than custom feature engineering + simple logistic models.

I’ve seen this mentioned before, including a blog post by the fast.ai folks. Any idea where I can get details? If my tabular data set is small, what kind of embedding can I get out of it? Or is the idea that a larger data set is used for embeddings of categorical data?
Pre-trained embeddings are only helpful if they are trained on a different (ideally larger) dataset or even a different task, but with the same kind of input data. So you would need to find out where else something similar to the data in your tables appears. If some of the data is text, word embeddings may be applicable. Or if you're trying to analyze user activity by time and location, you might try to transfer what can be learned about the influence of holidays from publicly observable activity e.g. on Twitter (just a random idea that popped into my head, no guarantee that it can actually work).

Of course if all you have are numbers without context, there isn't a lot you can do to improve the situation.
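As a rough illustration of the "frozen embedding plus simple model" idea discussed above, here is a minimal Python sketch. Everything in it is made up for illustration: the `PRETRAINED` table stands in for vectors that would really be learned on a large external corpus, the two-dimensional vectors and the four labelled examples are toy values, and the nearest-centroid classifier is just one possible simple model, not a method anyone in the thread described.

```python
import math

# Hypothetical stand-in for vectors learned elsewhere on a large corpus.
# Real pre-trained embeddings would have far more dimensions.
PRETRAINED = {
    "refund":   [0.9, 0.1],
    "charge":   [0.8, 0.2],
    "invoice":  [0.7, 0.3],
    "login":    [0.1, 0.9],
    "password": [0.2, 0.8],
    "reset":    [0.3, 0.7],
}

def embed(text):
    """Average the frozen vectors of known words; unknown words are skipped."""
    vecs = [PRETRAINED[w] for w in text.lower().split() if w in PRETRAINED]
    if not vecs:
        return [0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def centroids(labelled):
    """Fit the simple model: one mean embedding per class."""
    sums, counts = {}, {}
    for text, label in labelled:
        v = embed(text)
        acc = sums.get(label, [0.0] * len(v))
        sums[label] = [a + b for a, b in zip(acc, v)]
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}

def classify(text, cents):
    """Assign the class whose centroid is closest to the text's embedding."""
    v = embed(text)
    return min(cents, key=lambda lab: math.dist(v, cents[lab]))

# A deliberately tiny labelled set for the downstream task (toy values).
small_train = [
    ("refund charge", "billing"),
    ("invoice refund", "billing"),
    ("password reset", "account"),
    ("login password", "account"),
]
cents = centroids(small_train)
prediction = classify("reset my login", cents)
print(prediction)
```

With these toy values, `prediction` comes out as "account". The sketch only works because the hypothetical vectors already separate billing words from account words, which is the caveat raised above: the embedding has to come from data that resembles yours.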

I think this is mainly a thing for perception (images and sounds). Tabular data would have to match up with the training dataset, and "most" interesting tabular models are the sorts of things guarded like piles of gold by the businesses that build them...
The parent did not specifically talk about NNs. As I understand it AutoML could apply to all statistical endeavours that involve estimation (classical or bayesian).
> “AutoML could apply to all statistical endeavours that involve estimation”

Yes, this is the part that sounds like parody to me. At least, as a working statistician, I can tell you that the concept of AutoML could not apply to the vast majority of things I work on.

Could you give an example? I have a hard time understanding what you could mean, as Algorithm Configuration & Selection is such a general framework. If you are solely talking about the current state of the art, I would agree that techniques from AutoML do not have the generality and autonomy of an expert human.
For example, look into Chapter 5 on logistic regression from the Gelman & Hill book on hierarchical models & regression.

It walks through an example with arsenic data in wells and a problem of estimating how distance, education and some other factors relate to a person’s willingness to travel to a clean well for water.

The analysis involves deciding how to standardize the input features, how to rescale so that regression coefficients are interpretable in meaningful human units, how to interpret statistics of the fitted model to decide whether adding a feature helps or hurts (since this cannot be deduced from raw accuracy metrics alone), how to interpret deviance residual plots for outlier analysis, etc.
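
To make the rescaling point concrete, here is a minimal Python sketch on synthetic data (not the arsenic data from the book). It fits a one-predictor logistic regression by Newton-Raphson twice, once with distance in meters and once in units of 100 meters. The names `fit_logistic`, `dist_m`, `dist_100m` and the coefficients used to simulate the data are assumptions chosen for illustration.

```python
import math
import random

def fit_logistic(xs, ys, iters=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson (one predictor)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        # Accumulate the gradient (g) and the negative Hessian (h)
        # of the log-likelihood at the current estimate.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system h * step = g in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Synthetic data: the generating coefficients 0.6 and -0.006 are made up.
random.seed(0)
dist_m = [random.uniform(0, 300) for _ in range(2000)]
ys = [
    1 if random.random() < 1.0 / (1.0 + math.exp(-(0.6 - 0.006 * d))) else 0
    for d in dist_m
]
dist_100m = [d / 100.0 for d in dist_m]

b0_m, b1_m = fit_logistic(dist_m, ys)          # slope per meter
b0_100, b1_100 = fit_logistic(dist_100m, ys)   # slope per 100 meters
print(b1_m, b1_100)
```

The two fits describe the same model: the intercept is unchanged and the slope is multiplied by 100, so the choice of units only affects how readable the coefficient is, not how well the model fits.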

None of those things has anything to do with changing the architecture of the model, except possibly including or excluding features. In that example there were no hyperparameters to tune, and hyperparameter tuning on raw accuracy outputs would not make sense for the inference problem anyway: the goal was not optimizing prediction but understanding the impact of features that have semantic meaning in the context of possible policy choices that could be adopted.

By way of contrast, applying an automated subset selection algorithm to automatically choose the features would be a naive idea with likely bad results in that case, and setting up an optimization framework that would optimize over possible transformations or standardizations of the inputs seems equally dubious compared with expert, context-aware human judgment.

And this is a very trivial example. If you modify a problem like this to address causal inference goals, or add some type of cost optimization on top of it, it becomes more and more complex, but exactly in a way that a tool like AutoML can’t help with.

In other words, making an AutoML that can truly apply to all types of estimation or inference problems is no easier than solving strong AI computer vision and natural language problems entirely, since you need contextual reasoning and creative proposals for inventing features and sleuthing the goodness of fit of a certain model architecture in light of the human-level inference goal you’re trying to reach.

So you never tune hyperparameters or try different models to see which works better?
I do plenty of that, and AutoML could help with a small fraction of that.
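
For reference, the part of this that is easy to automate looks roughly like the following: a minimal Python sketch of picking a hyperparameter by cross-validated error. The k-nearest-neighbour regressor, the synthetic sine data and the candidate values of `k` are all assumptions for illustration, and this is not the API of any particular AutoML tool.

```python
import math
import random

def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points (1-D x)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def cv_mse(data, k, folds=5):
    """Mean squared error of k-NN under simple k-fold cross-validation."""
    total = 0.0
    for f in range(folds):
        # Every folds-th point starting at f is held out; the rest is training data.
        test = data[f::folds]
        train = [p for i, p in enumerate(data) if i % folds != f]
        total += sum((knn_predict(train, x, k) - y) ** 2 for x, y in test)
    return total / len(data)

def select_k(data, candidates):
    """Score every candidate and return the one with the lowest CV error."""
    scores = {k: cv_mse(data, k) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic data: a noisy sine curve (made up for this sketch).
random.seed(1)
data = []
for _ in range(200):
    x = random.uniform(0, 10)
    data.append((x, math.sin(x) + random.gauss(0, 0.3)))

# k=160 uses every training point in a fold, i.e. it predicts the training mean.
candidates = [1, 5, 15, 50, 160]
best_k, scores = select_k(data, candidates)
print(best_k, scores)
```

The loop optimizes a single predictive score, which is exactly the kind of search that can be handed to a tool. It has nothing to say about which features mean something or what the fitted model is for.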
The problem is "Applied Statistics" became "Machine Learning" which became "AI" which became "Deep Learning".

Throw away all the BS and, yes, it's obvious.

Google, Facebook & MS have already automated even research, i.e. automated selection of a loss function, network architecture, individualized network topology etc. Amazon is not there yet. The rest of industry is still in the "stone age", just "considering" using something like AutoML for basic hyperparameter tuning.
If you automate it, is it still research? Research implies some sort of hypothesis testing, yes?

I suppose OP means there will be two groups: people who use AutoML and people who try to make AutoML better.

There should be at least 3 groups, because making AutoML better != making ML better.
Why? The concept of AutoML does include the design of novel algorithms.
What do you mean? I thought AutoML was a tool to do neural architecture search, and hyperparameter tuning.
The field of automatic machine learning (abbreviated as AutoML) concerns all endeavours to automate the process of machine learning. To provide a sense of what could constitute AutoML, let me post a list from the "Call for Papers" of the International Workshop on Automatic Machine Learning (ICML 2018) [1]:

    * Model selection, hyper-parameter optimization, and model search
    * Neural architecture search
    * Meta learning and transfer learning
    * Automatic feature extraction / construction
    * Demonstrations (demos) of working AutoML systems
    * Automatic generation of workflows / workflow reuse
    * Automatic problem "ingestion" (from raw data and miscellaneous formats)
    * Automatic feature transformation to match algorithm requirements
    * Automatic detection and handling of skewed data and/or missing values
    * Automatic acquisition of new data (active learning, experimental design)
    * Automatic report writing (providing insight on automatic data analysis)
    * Automatic selection of evaluation metrics / validation procedures
    * Automatic selection of algorithms under time/space/power constraints
    * Automatic prediction post-processing and calibration
    * Automatic leakage detection
    * Automatic inference and differentiation
    * User interfaces and human-in-the-loop approaches for AutoML
[1] https://sites.google.com/site/automl2018icml/
Hasn't this always been the case? Actually fitting a model has always been a pretty small part of an applied statistician's job. The real work is everything before and after that point.