|
My opinion is that the theory starts to make sense after you know how to use the models and have seen different models produce different results. Very few people can read about bias variance trade off and in the course of using a model, understand how to take that concept and directly apply it to the problem they are solving. In retrospect, they can look back and understand the outcomes. Also, most theory is useless in the application of ML, and only usefull in the active research of new machine learning methods and paradigms. Courses make the mistake of mixing in that useless information. The same thing is true of the million different optimizers for neural networks. Why different ones work better in different cases is something you would learn when trying to squeeze out performance on a neural network. Who here is intelligent enough to read a bunch about SGD and optimization theory (Adam etc), understand the implications, and then use different optimizers in different situations? No one. I'm much better off having a mediocre NN, googling, "How to improve my VGG image model accuracy", and then finding out that I should tweak learning rates. Then I google learning rate, read a bit, try it on my model. Rinse and repeat. |
The best approach is to learn both concurrently. Learn some theory, apply it and understand that applications including pitfalls, then learn a bit more and repeat. Incremental learning with a solid base. It's fun to hate on academia but this is how experts with deep knowledge of a domain get to where they are.