| tl;dr: The experts make it look impossible, but the masters make it look easy and actually teach you. Just ranting here, so feel free to ignore :-) This is not really aimed at the author but at the "elite" group. An elite commenter in another thread said he dropped out of the ML class because the course included gems such as "if you don't know what a derivative is, that is fine", and he thought math was important in ML. Before the ML class I could not even argue with these guys because I did not know squat about AI, and when I talked to these experts their advice was to take a year off, learn math, and only then start learning AI, which in my case was not possible. Today, after a couple of months of online classes, I am actually using ML in my daily work, and it's not magic that only the elite with deep, profound math knowledge can use. Another programmer, who works at Khan Academy, had a blog post about how he implemented ML after learning it from Prof. Andrew's class. Now that is real-world impact. I may be missing something, but can one of you experts please explain why you need deep math knowledge when the professor, who has done far more research in this field than you, does not think so? The professor keeps reassuring us in his classes that even after so many years of using it he still finds the subject difficult, but I'm guessing these experts know it all :-). In my opinion this is the same reason that, even though Wall Street is full of smart people, they do not care about the rest of the population: the attitude is "we are smart and can do what we want; you guys are dumb and deserve what you get", and if someone outside their elite group starts talking their language, they do not like it. On a similar note, when you look at the people complaining about Khan Academy, most of them are these so-called smart people.
Let me talk about my background: I have been working as a programmer for around 11 years, with no math background, though I taught myself math using Khan Academy, and before my layoff (I am now working on my own startup) I used to make 90K (in a southern state). So guys, you are not the center of the world. We are crashing into your fraternity, and you are no longer the only experts who can talk about ML. The guys at Stanford are smarter than you and know what they are doing, and FYI, they don't need you; it's the other way around. Another interesting thing is that it is mostly the current students who seem to agree with the author. If you are smart, you should probably make the effort to learn more rather than asking them to tailor the classes to what you think matters.
Also ask yourself this question: if you were the professor, which do you think would be more satisfying, teaching 40 full-time students or 20,000 who are already in the field and can make more of an impact? |
And then something began to happen the more I learned. I no longer thought, "I need naive Bayes this, decision tree that, random forest there," or whatever. I thought, "I need this concept from statistics or that idea from information theory; I just need to group and count there, and that loss function is useful here." So I could come up with something, or modify something, to fit my need. As I go along I am finding that, while before I looked for an excuse to use something fancy-sounding, now I prefer to go as simple as possible - but without having gone through the hard stage I could not appreciate where the simpler solution is better.
I also learned a great deal of differential calculus when implementing an automatic differentiator (a backpropagating neural net is basically just a special case of reverse-mode autodiff). It's fast, can work with decent-sized vectors (I tested 10^5 - 10^6 entries), and can compute gradients, Hessians and Jacobians of arbitrary functions. I also expect that I can easily extend it to work with tensors, although I haven't needed them yet. Using it I wrote a stochastic gradient descent algorithm into which I can plug arbitrary loss functions, and a whole bunch of algorithms just merge. I could also easily write, say, L-BFGS for it. Neural networks, logistic regression, linear regression, and support vector machines were basically just a matter of swapping out one line.
This flexibility is what you gain.
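To make the "swapping out one line" point concrete, here is a minimal sketch in Python. It is not the differentiator described above, just a toy scalar reverse-mode autodiff with a plain SGD loop on top. Everything in it (`Var`, `backward`, `sgd`, the three loss functions, the tiny dataset, the learning rate) is a hypothetical name or value chosen for illustration, and it handles scalars only, not large vectors, Hessians or Jacobians.

```python
import math
import random

class Var:
    """Scalar node in a computation graph (toy reverse-mode autodiff)."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent_node, local_derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))
    __rmul__ = __mul__

    def __neg__(self):
        return self * -1.0

    def __sub__(self, other):
        return self + (-other)

    def __rsub__(self, other):
        return (-self) + other

def softplus(x):
    # log(1 + e^x); its derivative is the sigmoid. Not overflow-safe for huge x.
    e = math.exp(x.value)
    return Var(math.log1p(e), ((x, e / (1.0 + e)),))

def relu(x):
    return Var(max(0.0, x.value), ((x, 1.0 if x.value > 0 else 0.0),))

def backward(out):
    """Accumulate d(out)/d(node) into node.grad for every node reachable from out."""
    order, seen = [], set()
    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)
    visit(out)
    out.grad = 1.0
    for node in reversed(order):  # each node is processed before its parents
        for parent, local in node.parents:
            parent.grad += local * node.grad

# The "one line" that changes: labels y are +1.0 or -1.0.
def squared_loss(score, y):   # linear regression on the labels
    return (score - y) * (score - y)

def logistic_loss(score, y):  # logistic regression
    return softplus(score * (-y))

def hinge_loss(score, y):     # linear SVM, no regularization term
    return relu(1.0 - score * y)

def sgd(loss, data, dim, lr=0.05, epochs=50, seed=0):
    """Plain SGD on a linear model; the last weight is the bias."""
    rng = random.Random(seed)
    w = [0.0] * (dim + 1)
    data = list(data)  # copy so the caller's list is not shuffled
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            params = [Var(v) for v in w]  # fresh graph for each sample
            score = params[-1]
            for p, xi in zip(params, x):
                score = score + p * xi
            backward(loss(score, y))
            w = [v - lr * p.grad for v, p in zip(w, params)]
    return w

# Made-up, linearly separable toy data.
data = [((2.0, 1.0), 1.0), ((1.0, 2.0), 1.0), ((1.5, 1.5), 1.0),
        ((-1.0, -2.0), -1.0), ((-2.0, -1.0), -1.0), ((-1.5, -0.5), -1.0)]

for loss in (squared_loss, logistic_loss, hinge_loss):
    print(loss.__name__, sgd(loss, data, dim=2))
```

The training loop never mentions which model it is fitting. Passing a different `loss` is the only change between the three runs, which is the kind of flexibility being described, on a much smaller scale.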
===========================
In the below, fn is an arbitrary mathematical function, such as
example of a loss function