by madenine 2395 days ago
I’m in favor of there being more and better resources to learn anything out there, but every time I see a deep learning 101 type material all I can think is “who is this for?”.

In ~July 2016 I was at a presentation by Nvidia at GW in DC. They showed off how easy it was to build out and train a model using some of their tooling (Digits maybe?). After the demo they opened it up for questions and a grad student ‘asked’: “You just did in 10 mins with 30 lines of code what I worked on for an entire semester”.

That’s been the trajectory of the tools and increasing abstraction in this space. It’s just getting easier and easier to build models that work (which is great), and it gets easier and easier to do so without knowing more than an extremely high level overview of the math behind it all.

So while this looks like a great resource - who’s it for?

For jobs/problems that need you to have a thorough understanding of the math and theory behind the networks this isn’t going to cut it.

For jobs/problems that need you to get something working math or not - this likely isn’t necessary to get started.

So it’s for people that have been getting into DL but also haven’t bothered or needed to look up the math concepts?

12 comments

While this page looks nice, there is certainly a proliferation of beginner articles in this area. I think it's driven by demand, the same way gyms get tons of new members in January. As someone who sees all the hype and salaries involving ML, you dream of doing it too, so you look for articles to start out. That's the demand. After reading it, you don't get it and have no patience so you raise demand for titles like "ML for humans", "ML made easy" or "Gentle intro to ML for the rest of us". Or ditch articles and watch Siraj ramble about making money with an AI startup today! I'd wager that only a small percentage of readers actually work their way through to advanced topics.

On the supply side (while TFA looks legit) people who are a few lessons ahead want to increase their visibility, start a blog/brand, make their CV stand out by showing community engagement and writing from a position of authority. This is mostly seen on Medium.

How to avoid the trap of being an eternal beginner? Accept that it will take time, be clear on your goals, and try gathering a group of peers and expert guidance. Reddit and forums can be crap for this as you, the beginner, will gravitate towards the self-proclaimed experts who may be full of shit and just play social games well, creating a blind-leading-the-blind situation and cargo culting around terms that nobody really understands. There is value in universities: they lay out a path, give guidance and let you work/learn together with peers. Ok, enough with this rant.

> So while this looks like a great resource - who’s it for?

I'll give you an analogy: electricity. Who needs to know complex numbers and differential equations to understand electricity? The technician, the civil engineer, the scientist or the research engineer?

A technician who just wires a house doesn't need math. They just read the wiring instructions and follow standard practices. In this analogy, Nvidia is boasting about the tools it builds for 'ML technicians'.

You need to know math if you are building new architectures and applying complex models for something nontrivial. It's not going to work the first time and you need to know what's going on. Even if you are the 'civil engineer' in this analogy you should be able to read the math and understand it even if you don't do the math yourself. You won't be able to do literature research and learn new stuff if you can't read math fluently.

If you are a programmer who is given ML tools to implement something someone else designed and understands, or you just use existing models, you don't need this. Your career might benefit from knowing it but you can manage without.

I believe the OP's point was that the math described in the article is too simple and not enough to do any serious research. Anyone who attempts to do NN research already knows this material (and a lot more). This tutorial could be useful to someone who wanted to implement simple backprop from scratch, but all DL libraries already do it automatically. Someone who just wants to learn a bit about NNs to classify images or generate text does not need to know this, and someone who wants to make a breakthrough in NN theory already knows it. So yes it's not very clear who is the target audience here. I'm guessing it's for a bright highschooler who just learned calculus and who is interested in how NNs work. For such students I'd recommend reading http://neuralnetworksanddeeplearning.com instead.
But someone who wants to contribute to the research doesn't just have this knowledge pop into their mind out of nowhere. They're going to learn it from somewhere, and what's wrong with one more resource to help out with that?
I don't think ML and NNs are at the point yet where you don't need to understand the math.
> So while this looks like a great resource - Who's it for?

Undergraduates, or graduate students who didn't happen to take the right prerequisites. Most STEM degrees require vector calculus, but few require matrix calculus. A physics undergrad might see matrix calculus if they studied general relativity, and a math undergrad might if they were interested in optimization or differential geometry. A statistics major might have seen it when working with multivariate distributions and regression. But it would be easy to miss.

Nevertheless, matrix calculus, which is not in fact a large subject, but only some new notation and a handful of theorems, is the key to understanding back-propagation. It's not the only way to approach it - you could just keep track of all those subscripts and indices - but it's one of the best. The differential form[1] is particularly good to learn because it maps almost 1-1 onto the error terms in a gradient descent implementation.

[1]: https://en.wikipedia.org/wiki/Matrix_calculus#Identities_in_...
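To make the point about the differential form concrete, here is a minimal sketch in plain Python. It is a toy example of my own, not something taken from the article. It assumes a single linear layer with squared-error loss L = 0.5 * ||W x - y||^2. Taking differentials gives dL = delta^T (dW) x with delta = W x - y, so the gradient with respect to W is the outer product of the error term delta and the input x. All function names are hypothetical, and the finite-difference check is only there to confirm the formula.

```python
def matvec(W, x):
    """Matrix-vector product for W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def loss(W, x, y):
    """Squared-error loss L = 0.5 * ||W x - y||^2 (assumed for this sketch)."""
    residual = [p - t for p, t in zip(matvec(W, x), y)]
    return 0.5 * sum(r * r for r in residual)


def grad_W(W, x, y):
    """Analytic gradient read off the differential dL = delta^T (dW) x."""
    delta = [p - t for p, t in zip(matvec(W, x), y)]  # the error term
    return [[d * v for v in x] for d in delta]        # outer product delta x^T


def numeric_grad_W(W, x, y, eps=1e-6):
    """Central finite differences, one entry of W at a time."""
    G = []
    for i in range(len(W)):
        row = []
        for j in range(len(W[0])):
            W_plus = [r[:] for r in W]
            W_minus = [r[:] for r in W]
            W_plus[i][j] += eps
            W_minus[i][j] -= eps
            row.append((loss(W_plus, x, y) - loss(W_minus, x, y)) / (2 * eps))
        G.append(row)
    return G


W = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]
x = [1.0, 2.0, -1.0]
y = [0.0, 1.0]

analytic = grad_W(W, x, y)
numeric = numeric_grad_W(W, x, y)
max_diff = max(abs(a - n)
               for ra, rn in zip(analytic, numeric)
               for a, n in zip(ra, rn))
print("largest difference between analytic and numeric gradient:", max_diff)
```

The `delta` computed in `grad_W` plays the same role as the error term you would carry backwards in a hand-written backprop loop.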

> So it’s for people that have been getting into DL but also haven’t bothered or needed to look up the math concepts?

Everyone has to start somewhere. The usual pedagogical technique is to teach a subject twice: once at an "undergraduate" level, omitting the technical details of proofs, with the goal of providing a big picture intuitive understanding of the subject and some practical symbol pushing ability; then again at the "graduate" level, with more formal definitions and detailed proofs. Your own education presumably used this structure, no? Even if you've already graduated, this "two pass" approach to learning new material is still a good idea. Few of us are von Neumann, able to dive immediately into the deepest depths of theory in a new field: we can all benefit from taking the time to develop some good intuitions first.

This is where all textbooks come from - a lecturer presents the material the way that seems clearest to them. They prepare notes to keep everything straight in their own head. Sometimes they find that their presentation resonates with students and is superior to what's currently available, so they start to develop their notes into something publishable. Most such projects get abandoned before too long, but many end up in some form on the internet, and a few go on to be developed into standard texts. As long as you can find even one new way to explain things that helps students, the exercise is not in vain.

> Most STEM degrees require vector calculus, but few require matrix calculus

Gradients, Jacobians, etc. are typically covered in a multivariable calculus class along with vector calculus (line integrals, Green's theorem, Stokes' theorem, etc.). This is required for engineering and physics degrees.

This is a very unempathetic, almost anti-educational, comment.

Pick any journey to any destination. This article occurs at many points along them.

Need to have a thorough understanding of math? Then this is a starting point.

Don't? Then this is an endpoint.

I'd guess it's for people in the second camp (practical) who are just trying to satisfy their intellectual curiosity or want an intro to the math behind it all.

I agree with your assessment that it doesn't have much practical use on its own, nor is it an efficient means to any particular end.

Well one of the authors (Jeremy Howard) strongly advocates for learning by writing. I think it's a good mission: there is always value in finding better ways to convey information. If you already know this material then it's easily ignored. Other people may be excited about something you already know, and that's fine. Unfortunately that makes it pop up in your news feed and sorry I have no solution for that =).

Edit: Just noticed the other author is the creator of ANTLR, which I recall using in school to write our own languages. Cool!

Wouldn't be a proper DL post without some good ol' gatekeeping.
Honestly, I was trying to convey the opposite - the gates are wide open and it’s never been easier to drive through.
I don't even know what course you'd learn matrix calculus in, but it was a necessity for my upper level ML courses. This website would have been a godsend, and would have spared TAs many hours figuring out what knowledge we were missing. We got by with the Wikipedia page...
High level computer science courses straddle several disciplines and you end up with weird stuff like computer vision being the purview of the electrical engineering department. EEs tend to know some of this stuff because a lot of matrix algebra comes up in control/optimisation theory.

In physics we did matrix calculus primarily for electromagnetism and fluid dynamics. Maxwell's equations are the first time most students see the div/curl operators, and they're also used in e.g. Navier-Stokes. But even though we were taught it, I don't think we really bothered to remember what a "Jacobian" is.

A lot of this stuff also comes up in physical rendering.

Does this not count as looking up the math concepts?
It matters, in certain contexts.

For example: a large number of clustering methods boil down to matrix factorization, with variations in the constraints. If you have both domain understanding and a general understanding of what kind of output these variations are likely to result in, you can often narrow down the list of methods you need to try.
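As a rough illustration of the matrix factorization view, here is a minimal sketch in plain Python, using k-means as the example (my choice, not the only one). It assumes the standard reading of k-means as X ≈ Z C, where Z is an n x k one-hot assignment matrix and C is a k x d matrix of centroids; the k-means objective is then the squared Frobenius norm of X - Z C. Relaxing the one-hot constraint on Z to, say, nonnegativity gives a different member of the same family. The function names and the toy data are hypothetical.

```python
def assign(X, C):
    """One-hot assignment matrix Z (n x k): each row selects the nearest centroid."""
    Z = []
    for x in X:
        d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in C]
        nearest = d2.index(min(d2))
        Z.append([1.0 if j == nearest else 0.0 for j in range(len(C))])
    return Z


def update(X, Z, k):
    """Centroid matrix C (k x d): mean of the points assigned to each cluster."""
    d = len(X[0])
    C = []
    for j in range(k):
        members = [x for x, z in zip(X, Z) if z[j] == 1.0]
        # Assumption for this sketch: an empty cluster is parked at the origin.
        C.append([sum(col) / len(members) for col in zip(*members)]
                 if members else [0.0] * d)
    return C


def matmul(A, B):
    """Plain matrix product for lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def frobenius_sq(A, B):
    """Squared Frobenius norm of A - B."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def kmeans(X, C, iters=10):
    """Lloyd's algorithm: alternate between updating Z and updating C."""
    for _ in range(iters):
        Z = assign(X, C)
        C = update(X, Z, len(C))
    return assign(X, C), C


X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
     [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
Z, C = kmeans(X, [[0.0, 0.0], [10.0, 10.0]])

# The k-means objective is the reconstruction error of the factorization Z C.
print("reconstruction error ||X - ZC||^2:", frobenius_sq(X, matmul(Z, C)))
```

Seen this way, choosing a clustering method is partly a question of which constraints on Z and C match what you know about the domain.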

Also, interviews.

My guess is it's for people who have gotten something working in the past and are now looking to go a little deeper.
> So it’s for people that have been getting into DL but also haven’t bothered or needed to look up the math concepts?

Yes.