
by deuslovult 2234 days ago

I'm an ML engineer, and I agree with you: deep learning is by far the most common approach for new problems in informatics.

Imo deep learning is so popular because it "works". For a classification problem, if you try both a linear baseline and a deep learning model, and you do a reasonable job of hyperparameter tuning and experimental design, the deep model will likely outperform the simpler one. This holds true across many problem spaces.

I think the issue is that modern DL frameworks make it a little too easy to get pretty good performance on new problems. Other techniques generally require more background knowledge to make reasonable modeling assumptions, and still frequently perform worse than a naively applied DL approach.

I think DL will remain, in practice and education, a very popular tool. But it is essential to learn traditional statistical inference and other background to appropriately contextualize DL models so it isn't just some form of black magic.


mattkrause 2234 days ago

A lot of those comparisons strike me as shaky.

It's easy to beat a naive logistic regression model with a good neural network, but the gap often closes once you start trying to tune the logistic model too. (And it's not like the neural networks aren't tuned either: architecture search, data augmentation, etc.)

Recent review on medical data: https://www.sciencedirect.com/science/article/abs/pii/S08954...


deuslovult 2234 days ago

Logistic regression is exactly a NN with no hidden layers and a sigmoid activation function. A feedforward NN with additional layers is strictly more expressive than logistic regression.
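
To make the equivalence concrete, here is a minimal sketch in plain Python. The function names (`logistic_regression`, `feedforward`) and the weights are made up for illustration and don't come from any framework. A generic feedforward pass with a single sigmoid output layer and no hidden layers computes the same number as the logistic regression formula.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_regression(x, w, b):
    # Classic formulation: p = sigmoid(w . x + b)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def feedforward(x, layers):
    # layers: list of (weight_matrix, bias_vector, activation) tuples,
    # applied in order. Each row of the weight matrix is one unit.
    a = x
    for W, bias, act in layers:
        a = [act(sum(wij * aj for wij, aj in zip(row, a)) + bi)
             for row, bi in zip(W, bias)]
    return a

# Hypothetical weights and input, chosen arbitrarily.
w, b = [0.5, -1.2, 2.0], 0.3
x = [1.0, 0.5, -0.25]

p_lr = logistic_regression(x, w, b)
# "Network" with zero hidden layers: one output unit with a sigmoid.
p_nn = feedforward(x, [([w], [b], sigmoid)])[0]
```

Adding hidden layers to the `layers` list is what buys the extra expressiveness; with only the output layer, the two models are the same function.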


mattkrause 2234 days ago

Yes! The million dollar question is how much of that expressivity is actually required.

In many papers, the "baseline" logistic regression model is very stripped down: y~logit(.) but the neural network has had its expressiveness optimized in various ways. People aren't comparing against a 3-layer feedforward network; there's augmentation and pre-training, architecture search and special learning schemes.

My point is that if you want to claim that a problem needs the expressivity that (only) a neural network provides, you ought to be devoting a great deal of effort to the logistic regression model too. Make it a steelman, rather than a strawman, if you will.
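
As a toy illustration of what that effort can look like (my own example, not taken from the review linked above): on XOR-style data, a bare logistic regression is stuck because the classes aren't linearly separable, but the same model with one hand-added interaction feature separates them. The helper names, learning rate, and step count below are assumptions picked for this sketch.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=1.0, steps=5000):
    # Full-batch gradient descent on mean log loss.
    # The last weight is the bias (a constant 1.0 is appended to each row).
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            row = list(xi) + [1.0]
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row))) - yi
            for j, xj in enumerate(row):
                grad[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def accuracy(w, X, y):
    hits = 0
    for xi, yi in zip(X, y):
        row = list(xi) + [1.0]
        pred = 1 if sum(wj * xj for wj, xj in zip(w, row)) > 0 else 0
        hits += (pred == yi)
    return hits / len(X)

# XOR: not linearly separable in the raw features.
X_raw = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]

# "Strawman" baseline: raw features only.
plain_acc = accuracy(fit_logistic(X_raw, y), X_raw, y)

# "Steelman" baseline: same model plus a hand-built interaction term x1*x2.
X_eng = [(x1, x2, x1 * x2) for x1, x2 in X_raw]
engineered_acc = accuracy(fit_logistic(X_eng, y), X_eng, y)
```

No linear boundary gets all four raw XOR points right, so `plain_acc` can't reach 1.0, while the model with the interaction term can fit them. Real datasets are messier, but this is the kind of cheap feature work that stripped-down baselines often skip.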
