Hacker News
by cardosof 1716 days ago
That's for structured data; for unstructured data it's more like "create a NN and stack more layers until you have your MVP".
3 comments

> "create a NN and stack more layers until you have your MVP"

I mean, that's a pretty good principled approach to a lot of ML problems.

I think you have a different definition of "principled" from most people.
I'm very curious as to what part of that process is not explained by the principles by which we understand neural networks to work.

I'm open to the possibility that I've gone this long misunderstanding the definition of "principled" in this context.

To me, taking a "principled approach" means you understand and can justify the eventual outcome of the approach, or can at least guarantee that the outcome satisfies some constraints. How would you justify the number of channels in each layer of a convolutional network? The number of self-attention heads in a transformer? The depth? Can you certify its prediction performance?

Yes, the "just add more layers" approach typically works (in a very narrow sense of the word "works"), but we don't really understand why. We likewise don't understand the failure modes of the system, and cannot engineer around them. Thus it's not really principled in my view.

Only because ML is currently more alchemy than engineering. We mix stuff until we make gold, yet we can't explain why more parameters generalize better instead of overfitting.
No, it's "load a pretrained ResNet and fine-tune on a few examples". Nobody trains from scratch today except researchers with large budgets.
It's more like: try different off-the-shelf models on a sample of the data until the performance is somewhat acceptable.
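
For what it's worth, the "try off-the-shelf models until one is acceptable" loop can be sketched in a few lines. This is only a toy illustration: the candidate "models" are hypothetical stand-in callables, the sample is made up, and `first_acceptable` and the 0.9 threshold are names and values invented here, not anything from a real library.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A "model" here is any callable mapping an input to a predicted label.
Model = Callable[[float], int]


def accuracy(model: Model, sample: Sequence[Tuple[float, int]]) -> float:
    """Fraction of (input, label) pairs the model predicts correctly."""
    correct = sum(1 for x, y in sample if model(x) == y)
    return correct / len(sample)


def first_acceptable(
    candidates: List[Tuple[str, Model]],
    sample: Sequence[Tuple[float, int]],
    threshold: float,
) -> Optional[Tuple[str, float]]:
    """Try candidates in order; return the first whose accuracy meets the threshold."""
    for name, model in candidates:
        score = accuracy(model, sample)
        if score >= threshold:
            return name, score
    return None  # nothing was good enough on this sample


# Hypothetical stand-ins for off-the-shelf models (assumption, for illustration).
candidates = [
    ("always_zero", lambda x: 0),
    ("cutoff_at_half", lambda x: int(x > 0.5)),
]

# Made-up labeled sample of (input, label) pairs.
sample = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

result = first_acceptable(candidates, sample, threshold=0.9)
print(result)
```

In practice the candidates would be real pretrained models and the sample a held-out slice of your data, but the control flow stays the same: stop at the first model that clears the bar.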

Unless you're Google, who even trains models from scratch these days? At most you do some fine-tuning.