by achompas 3031 days ago
> The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

It is absolutely true that you do not need a graduate degree to apply AI/ML to vanilla problems.

It is also absolutely true, in my experience, that you need a graduate-level education or years of hands-on experience to troubleshoot cases where AI/ML fails on a deceptively-simple problem, or to tweak an AI/ML algorithm (or develop a new one) so it can solve a novel problem.

That said, I think these MOOCs are good enough to get someone to a place where they can create nice /r/dataisbeautiful-style visualizations, or pair with a senior-level DS to deliver something.

(Edited to add folks who have worked on problems for years and add a final note.)


> It is also absolutely true, in my experience, that you need a graduate-level education or years of hands-on experience to troubleshoot cases where AI/ML fails on a deceptively-simple problem, or to tweak an AI/ML algorithm (or develop a new one) so it can solve a novel problem.

How much of that is critical domain specific knowledge and how much of that is just general engineering debugging/problem solving experience though? Certainly the person who does have the masters/PhD and a few years of applying that to real-world ML problems will have the edge but an experienced developer who's got a knack for maths (though no direct ML experience) may be able to get up to speed quicker than you think. Part of that will be experience with knowing how and when to ask the right questions when you get stuck.

> How much of that is critical domain specific knowledge and how much of that is just general engineering debugging/problem solving experience though?

It's both, right? You pick up problem-solving techniques as a researcher or engineer; as the former, those techniques lean towards scientific problems. Your average engineer doesn't need to know about contrasting.

Again: it's possible to learn the necessary math in your spare time! I agree!! However, it's far easier to do it in a graduate program as a full-time job for 2-5+ years.

The knack for maths is the important bit.
The math necessary for ML/AI (statistics/vector calculus) is mostly taught at undergrad level though isn't it? So most engineers should already have it covered.
I can't help but think in 3-5 years how quaint our tools of the day will seem.
I think about this constantly.

Not to sound like I walked uphill in both directions back in my day or something, but I remember building models in numpy without pandas. It was tedious -- and that's just a nice API wrapping ndarrays!

> Not to sound like I walked uphill in both directions back in my day

Local minima?

Most likely, gradient descent with momentum.
Oh boy, that and perturbation.
I’m not so sure.

You can make an argument that current tools haven’t really surpassed a Lisp Machine, or a Smalltalk environment, for developer productivity.

Not really. I see things like leftpad and npm fails and CEOs mailing private keys they've stored from customers. I see the same lessons we have to re-learn year after year.
What's an example of a problem that needs that troubleshooting? (Curious)
Honestly? The exact problem I'm dealing with at work right now.

We're trying to re-write our recommender for artist music stations at iHeartRadio (aka "I'll listen to Drake or Kendrick Lamar's station at the gym today"). Just today, I tried adding negative sampling to the matrix I'm factorizing, hoping it encourages spread in the embeddings learned for artists in certain types of genres.

I have a MS, but not a lot of research experience. It would have taken me a while to find this solution on my own. However, the moment I described this problem to my manager - a PhD graduate with several years of research and industry experience - he immediately suggested negative sampling.

What I learned during my MS helped me grok the math immediately. We're adding noise to the training set and penalizing vector lengths to avoid overfitting. Easy! Identifying a solution worth exploring? Not easy, at least without a degree or significant experience.

(There's also the chance I should know this, in which case I have some reading to do. ¯\_(ツ)_/¯)
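
For anyone following along, the two ingredients mentioned above (sampled negatives as "noise" in the training set, plus a penalty on vector lengths) can be sketched in a few lines. This is only a toy illustration, not the recommender described in the comment: the function names, the logistic loss, the uniform negative sampler, and every hyperparameter are assumptions picked for the example.

```python
import math
import random


def score(U, V, u, i):
    """Dot product of a user vector and an item vector."""
    return sum(a * b for a, b in zip(U[u], V[i]))


def train_mf_negative_sampling(positives, n_users, n_items, dim=8, n_neg=2,
                               lr=0.05, l2=0.01, epochs=300, seed=0):
    """Toy implicit-feedback factorization (all names are hypothetical).

    Each observed (user, item) pair is a positive; for each one we also draw
    n_neg random items and treat the unobserved ones as negatives.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    seen = set(positives)

    def sgd_step(u, i, label):
        # Clamp the score so math.exp cannot overflow.
        s = max(-30.0, min(30.0, score(U, V, u, i)))
        err = 1.0 / (1.0 + math.exp(-s)) - label  # d(log loss)/d(score)
        for k in range(dim):
            uk, vk = U[u][k], V[i][k]
            # The l2 terms are the penalty on vector lengths.
            U[u][k] -= lr * (err * vk + l2 * uk)
            V[i][k] -= lr * (err * uk + l2 * vk)

    for _epoch in range(epochs):
        for u, i in positives:
            sgd_step(u, i, 1.0)
            for _draw in range(n_neg):
                j = rng.randrange(n_items)
                if (u, j) not in seen:  # skip items the user actually played
                    sgd_step(u, j, 0.0)
    return U, V


# Two made-up listener groups: users 0-1 play items 0-1, users 2-3 play items 2-3.
positives = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3), (3, 2), (3, 3)]
U, V = train_mf_negative_sampling(positives, n_users=4, n_items=4)
```

On this toy data, `score(U, V, 0, 0)` should come out higher than `score(U, V, 0, 2)`, since the sampled negatives push each group's vectors away from the other group's items. A real system would more likely sample negatives by popularity than uniformly.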

Aren't you sort of glossing over the fact that he is in a high-up machine learning position at a company that specializes in recommender systems? Doesn't that by itself increase the likelihood that he deeply understands implicit and explicit matrix factorization?

I am a good ways through my masters (second CS degree, first specializing in ML), and the more I learn, the more I realize that on any given topic, there is no guarantee the PhD in the room has the most expertise. Machine learning is a broad field that contains many subfields, methodologies, and applications. It is a bit like computer systems or software engineering: nobody knows it all; people who are experts have intimate knowledge of a specific subset of the field. Of course, you can move around over time, but it takes years to build up expertise in even two or three subfields of machine learning.

Side note: sounds like we do similar work. I work at Vevo, also do a lot of matrix factorization to learn latent factors of items such as artists, videos, etc.

> Aren't you sort of glossing over the fact that he is in a high-up machine learning position at a company that specializes in recommender systems? Doesn't that by itself increase the likelihood that he deeply understands implicit and explicit matrix factorization?

Sure thing, but someone in that position needs years of experience in recommender systems, as well as working with researchers.

Folks are hanging on to the PhD part of my claim, instead of the "PhD or experience" part. The fact is, a PhD + prior industry work means the person has close to a decade of relevant background, grad degree or not. They will unstick a co-worker far faster than an experienced backend developer with, say, a year of Keras experience.

> Side note: sounds like we do similar work. I work at Vevo, also do a lot of matrix factorization to learn latent factors of items such as artists, videos, etc.

Seems like it! Email me if you'd like to chat some more offline (it's in my profile).

That has little to do with a PhD, it's the kind of thing you get with experience leading to a deeper understanding.

3D programming started as a field where only PhDs had any deep understanding of what was going on, simply because they had experience when nobody else did. You see this pattern repeated frequently, in any complex domain.

Yeah, I expected this reply.

The PhD is sufficient but not necessary here, right? A PhD researcher's job description is basically "learn necessary math, become a domain expert, and publish papers advancing that domain." It's difficult (but possible) to gain the same experience in industry if you don't have a graduate degree. Which company would pay you to work through Bishop or Goodfellow for a few months? Even a principal DS doesn't get that deal, much less a junior/associate.

Also remember: my comment addressed non-vanilla cases. In your example, this is the difference between a researcher advancing 3D programming and someone using Unity or Unreal.

(Also, sorry for all the edits. Done now!)

I would say PHD is sufficient to advance the field. That's no small thing, but only really overlaps at the start when just about anything advances the field and you need a broad focus.

Machine learning for sorting peas at high speed is a very well-trodden area at this point, with a lot of industry-specific domain knowledge. I expect self-driving cars, for example, to reach a similar state in ~10-25 years.

The risk with a PHD is you miss the specific wave. But, if you want to stay on the bleeding edge it's probably well worth it.

> I would say PHD is sufficient to advance the field.

Yep! We’ve now made our way back to my initial point in response to OP. :)

You can spend many months working through papers and books without a company paying you for that. That's something that I continually do and have always done, in my own time (and many different fields). Sufficient and not necessary indeed.
It's definitely easier to do when it's your primary job.
ali rahimi alludes to the problem of google engineers simply needing to tweak models that were previously tuned by google researchers who do have well-developed intuition [0]. because the intuitions in explicit form are at best heuristic and not necessarily even consistent, signing up to improve a model without them might result in spending indefinite time and compute resources without guarantee of positive results. which is a terrible perf-theoretic strategy...

[0] http://www.argmin.net/2018/01/25/optics/

Model divergence, nonsense predictions. The whole black art of ML (specifically neural nets) is coaxing them into working.

If you take some sophisticated deep neural net and try to train it on a binary classification where tails occurs 99% of the time - unless you specifically take measures to correct for this bias - the net will just learn to predict tails.
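
To make the 99% tails case concrete, here is a small sketch with made-up data. It shows why plain accuracy hides the problem and computes one common corrective measure, inverse-frequency class weights. The helper names and the 990/10 split are assumptions for illustration, and reweighting is only one of several possible fixes (resampling is another).

```python
from collections import Counter


def majority_baseline_metrics(labels, majority):
    """Accuracy and minority-class recall of a model that always predicts `majority`."""
    preds = [majority] * len(labels)
    accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
    minority_total = sum(y != majority for y in labels)
    minority_hits = sum(p == y for p, y in zip(preds, labels) if y != majority)
    recall = minority_hits / minority_total if minority_total else 0.0
    return accuracy, recall


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


# Hypothetical dataset: 99% tails, 1% heads.
labels = ["tails"] * 990 + ["heads"] * 10
accuracy, recall = majority_baseline_metrics(labels, "tails")
weights = balanced_class_weights(labels)
```

Here the always-tails model scores 99% accuracy while its recall on heads is zero, which is the degenerate solution described above. The weights come out at 50.0 for heads and roughly 0.505 for tails, so in a weighted loss each heads example counts about as much as 99 tails examples.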

Fairness and fighting adversarial examples come to mind.