by terrabytes 3193 days ago
Spot on. I struggled with the mainstream deep learning/machine learning MOOCs. I felt like they were too math heavy. However, I'm struggling with how to learn deep learning. I get polarized advice on it. Some argue that you need a degree or certificates from established MOOCs; others keep recommending that I do Kaggle challenges.

Has anyone managed to land a decent deep learning job without formal CS/machine learning training? How did you approach it?


   I felt like they were too math heavy. However, I'm struggling with how to learn deep learning.

These two statements are in tension. You will never really understand machine learning without learning a fair bit of the math.

I do think a lot could be done to improve how the material is presented, and I certainly don't think much of credentialism.

Honestly, in your shoes I would look for a position where you can learn from people internally, rather than try to qualify yourself first. Even if you do a bunch of online learning and toy problems, you are going to flail about if you don't have a strong mentor in your first position.

What related/supportive skills can you bring to a group that is doing ML?

edit: I should add that you don't really have to understand much these days to integrate (some) ML into a system, but you aren't going to get very far into modeling or understanding issues without some background. You can only get so far with black boxes.

Thanks for your reply. I do agree with you, in general, and have been trying to get myself involved in more ML projects at my current work.

I have around 8 years of professional software experience (C++/C#) and have fiddled around with some rudimentary machine learning for work, like linear regression, k-means clustering, etc. I have a decent idea of how/why they work, but have fallen flat on my face when learning the theory behind more complicated algorithms, e.g. Hessians from Andrew Ng's class. In my experience, many classes tend to focus on a ground-up approach. With higher-level frameworks like Keras, how necessary is this?

>With higher level frameworks like Keras, how necessary is this?

I would wager that you've heard this line before, but it all depends on the particulars of what you are trying to do. If you want to develop a first-principles understanding of what's going on, it's probably important. It will be less important if you just need to see the empirical performance of an established method on your new dataset.

>but have fallen flat on my face when learning the theory behind more complicated algorithms, e.g. Hessians from Andrew Ng's class

Reading between the lines, maybe this is a question about Newton's method? One of the general strategies shared between software development and "mathematical" (for lack of a better word) science and engineering is to reduce a complex problem to a known use case. If you've got a grasp on linear regression, take a look at Newton's method in this case. You may be pleasantly surprised to see that the Hessian is constant. This might make it easier to make the connection to relevant topics such as the convergence rate of the method and the connection to the uncertainty in the fit.
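
To make the Newton's method point concrete, here is a minimal sketch in plain Python. It assumes a one-feature least-squares fit y ≈ w*x + b with loss 0.5 * sum of squared residuals; the function name and the toy data are made up for illustration and are not from any course or library.

```python
def newton_step_linear_regression(xs, ys, w0=0.0, b0=0.0):
    """Take one Newton step for the fit y ~ w*x + b, starting at (w0, b0).

    Loss (an assumption of this sketch): L(w, b) = 0.5 * sum((w*x + b - y)**2)
    """
    n = len(xs)

    # Residuals at the starting point.
    residuals = [w0 * x + b0 - y for x, y in zip(xs, ys)]

    # Gradient of the loss with respect to (w, b).
    g_w = sum(r * x for r, x in zip(residuals, xs))
    g_b = sum(residuals)

    # Hessian entries. Note that they depend only on xs, not on (w, b):
    # this is the "constant Hessian" of linear regression.
    h_ww = sum(x * x for x in xs)
    h_wb = sum(xs)
    h_bb = float(n)

    # Solve H * delta = g with the closed-form inverse of a 2x2 matrix.
    det = h_ww * h_bb - h_wb * h_wb
    d_w = (h_bb * g_w - h_wb * g_b) / det
    d_b = (-h_wb * g_w + h_ww * g_b) / det

    # Newton update: new parameters = old parameters - H^{-1} * gradient.
    return w0 - d_w, b0 - d_b


# Toy data generated from y = 2x + 1 (hypothetical example values).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

from_origin = newton_step_linear_regression(xs, ys, 0.0, 0.0)
from_far_away = newton_step_linear_regression(xs, ys, 10.0, -5.0)
print(from_origin, from_far_away)
```

Because the loss is exactly quadratic, a single step lands on the least-squares solution from either starting point (here w = 2, b = 1, up to floating-point rounding). That is the convergence-rate connection. For the uncertainty connection: under the usual ordinary-least-squares assumption of independent noise with equal variance, the inverse of this same Hessian, scaled by the noise variance, is the covariance of the fitted parameters.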

This is something I've also struggled with. I find it hard to read deep learning papers because I have to translate each piece of math notation, so I struggle to get the bigger picture. I'm fond of the bottom-up approach, e.g. I started by mastering C and wrote my own libraries. But for deep learning I lean towards the opposite, starting with high-level libraries. When I want to understand the theory I search for simple Python code that I can implement from scratch. This way I can understand the logic without having to understand all the math behind it. I've mostly focused on doing Kaggle-type problems and used MOOCs when I get stuck. I've had little interest from larger companies, but I've managed to get a few offers from startups. Startups often have a couple of people with PhD-level knowledge but are also looking for programmers who can code the models.

"Machine Learning Engineer" is a title we're going to see more and more of (and we're already seeing a lot).

It's one thing to know the math and theory to design, train, and tune the algorithm your company needs. But implementing it into production, at scale? That's not the same person.

Ideally, you have Person/Team A, who designs but knows enough about implementation to keep that in mind during their process, and Person/Team B who implements it into the software but knows enough about the design to make it work.

Truly ideally, you have a person or team who can actually do both properly. However, very few people can. And if you have one, you may not be able to justify their time on all aspects.

So the compromise is usually as you describe, but you bear the cost of translation issues no matter how you do this. It's worth remembering that it is a compromise.

I think systems like TensorFlow are implicitly a recognition of this, allowing lower impedance between the groups.

There's a difference between people who can implement models and those who can create them -- startups could use people who do the former, and many don't actually need the latter.