by martingoodson 2366 days ago
>Also, I will throw in my conspiracy theory that most ML researchers and such push the theory/deep stats requirement as a form of gatekeeping.

Learning the fundamentals of a field is supposed to be gatekeeping. It's what stops you from making stupid mistakes. The field of ML is littered with horrible errors made by people who don't know the fundamentals.

Please don't follow this terrible advice.

Doesn't it depend on what you're trying to do?

I think there's a huge difference between doing research and learning enough to scrape something together for a hobby project. The deep maths can come later.

I don't need to study compiler theory to use GCC.

Your analogy is wrong; you are comparing apples to oranges. ML is very different from other "normal" computation systems.

* Non-ML: Input + {Rules} = Output

* ML: Input + Output = {Rules}

where "{Rules}" = Infinite set of possible "Programs" each of which is a trace through a very large state space of variables.
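To make the contrast concrete, here is a minimal sketch in Python. The temperature-conversion task and the names `fahrenheit_rule` and `fit_line` are my own illustration, not anything from a particular library, and ordinary least squares is just one simple way to derive a rule from examples.

```python
# Non-ML: Input + {Rules} = Output. A human writes the rule by hand.
def fahrenheit_rule(celsius):
    return celsius * 9 / 5 + 32


# ML: Input + Output = {Rules}. The rule is derived from examples.
# Here the "program" is a line, encoded as two numbers (slope, intercept),
# fitted with ordinary least squares.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: inputs paired with known outputs.
inputs = [0.0, 10.0, 20.0, 30.0, 40.0]
outputs = [fahrenheit_rule(c) for c in inputs]

# The learned rule is nothing but these two numbers.
slope, intercept = fit_line(inputs, outputs)
```

With noise-free data the fit recovers the hand-written rule. With real, noisy data it only approximates it, which is where the question of correctness comes in.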

In the first case, we humans use all our ingenuity to write the program and tweak it to get the right results. We already know the difficulties involved in writing "correct" programs but have mastered it to some extent.

In the second case, you cannot do that. Your "Programs" are derived by the system and encoded in numbers. How in the world do you even know that your encodings are correct? This is why you need the techniques of mathematics to transform (e.g. linear algebra) and constrain (e.g. inferential statistics/probability) the output "Rules", so you can have some measure of confidence in them. This is the fundamental challenge inherent in ML.
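As one small example of what "some measure of confidence" can look like, here is a sketch that puts an interval around accuracy measured on held-out data. The function name `accuracy_interval` and the counts are hypothetical, and the normal-approximation interval is my choice for brevity; it is only one of several options and behaves poorly for small samples or accuracies near 0 or 1.

```python
import math


def accuracy_interval(correct, total, z=1.96):
    """Normal-approximation interval for accuracy on held-out data.

    z=1.96 corresponds to roughly 95% coverage under the approximation.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] since accuracy cannot fall outside that range.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical result: 90 correct predictions out of 100 held-out examples.
low, high = accuracy_interval(90, 100)
```

The point estimate alone says "90% accurate"; the interval shows that 100 test examples leave quite a lot of room around that number.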

> How in the world do you even know that your encodings are correct?

Easy: you know that they aren't and never will be entirely correct for complex enough ML problems, just like humans. How to handle those errors is not an ML topic, though. You just have to ensure, via old-fashioned system design, that the system you build doesn't depend on any ML model always outputting correct results.
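A rough sketch of that kind of system design, assuming the model returns a label together with a confidence score. Everything here (`guarded_predict`, `toy_model`, `send_to_review`, the 0.9 threshold) is a made-up illustration, not an API from any library.

```python
def guarded_predict(model, features, fallback, min_confidence=0.9):
    """Use the model's answer only when it is confident; otherwise fall back."""
    try:
        label, confidence = model(features)
    except Exception:
        # A crashing model must not take the rest of the system down.
        return fallback(features)
    if confidence < min_confidence:
        return fallback(features)
    return label


# Hypothetical stand-in for a trained classifier.
def toy_model(features):
    score = features["score"]
    label = "spam" if score > 0.5 else "ham"
    confidence = abs(score - 0.5) * 2
    return label, confidence


# Hypothetical non-ML path, e.g. a queue for human review.
def send_to_review(features):
    return "needs_review"


confident = guarded_predict(toy_model, {"score": 0.99}, send_to_review)
uncertain = guarded_predict(toy_model, {"score": 0.6}, send_to_review)
broken_input = guarded_predict(toy_model, {}, send_to_review)
```

The wrapper never assumes the model is right: low-confidence answers and outright failures both take the non-ML path.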

You can say that about any field, discipline or skill.

At the same time, there is a difference between someone who is just starting to learn it and someone who wants to apply it in a large production system with social implications (be it advertising, medicine, or anything else). Hobby projects, or even small startups, rarely fall into that category.

Moreover, even a profound knowledge of mathematics does not give any edge in ethics, or even in awareness of the problems with real data (noise, bias, malicious use, social reception, etc.).