by minimaxir 3032 days ago
Looking through the topics covered, the standard AI-course caveats (https://news.ycombinator.com/item?id=16247629) apply.

Yes, AI/ML MOOCs teach the corresponding tools well, and the creation of new tools like Keras makes the field much more accessible. The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

However, contrary to the thought pieces that tend to pop up, taking and passing a crash course doesn't mean you'll be an expert in the field (and this applies for most MOOCs, honestly). They're very good for learning an overview of the technology, but nothing beats applying the tools on a real-world, noisy dataset, and solving the inevitable little problems that crop up during the process.

The Keras documentation (https://keras.io) and examples (https://github.com/keras-team/keras/tree/master/examples) are honestly much better teachers of AI/ML than any MOOC, in my opinion.

(Of course, Keras is now a part of TensorFlow, so there's a neat Google vertical integration with this crash course!)

16 comments

> The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

It is absolutely true that you do not need a graduate degree to apply AI/ML to vanilla problems.

It is also absolutely true, in my experience, that you need a graduate-level education or years of hands-on experience to troubleshoot cases where AI/ML fails on a deceptively-simple problem, or to tweak an AI/ML algorithm (or develop a new one) so it can solve a novel problem.

That said, I think these MOOCs are good enough to get someone to a place where they can create nice /r/dataisbeautiful-style visualizations, or pair with a senior-level DS to deliver something.

(Edited to add folks who have worked on problems for years and add a final note.)

> It is also absolutely true, in my experience, that you need a graduate-level education or years of hands-on experience to troubleshoot cases where AI/ML fails on a deceptively-simple problem, or to tweak an AI/ML algorithm (or develop a new one) so it can solve a novel problem.

How much of that is critical domain specific knowledge and how much of that is just general engineering debugging/problem solving experience though? Certainly the person who does have the masters/PhD and a few years of applying that to real-world ML problems will have the edge but an experienced developer who's got a knack for maths (though no direct ML experience) may be able to get up to speed quicker than you think. Part of that will be experience with knowing how and when to ask the right questions when you get stuck.

> How much of that is critical domain specific knowledge and how much of that is just general engineering debugging/problem solving experience though?

It's both, right? You pick up problem-solving techniques as a researcher or engineer; as the former, those techniques lean towards scientific problems. Your average engineer doesn't need to know about contrasting.

Again: it's possible to learn the necessary math in your spare time! I agree!! However, it's far easier to do it in a graduate program as a full-time job for 2-5+ years.

The knack for maths is the important bit.
The math necessary for ML/AI (statistics/vector calculus) is mostly taught at undergrad level though isn't it? So most engineers should already have it covered.
I can't help but think in 3-5 years how quaint our tools of the day will seem.
I think about this constantly.

Not to sound like I walked uphill in both directions back in my day or something, but I remember building models in numpy without pandas. It was tedious -- and that's just a nice API wrapping ndarrays!

> Not to sound like I walked uphill in both directions back in my day

Local minima?

Most likely, gradient descent with momentum.
Oh boy, that and perturbation.
I’m not so sure.

You can make an argument current tools haven’t really surpassed a Lisp Machine for developer productivity, or a SmallTalk environment.

Not really. I see things like leftpad and npm fails and CEOs mailing private keys they've stored from customers. I see the same lessons we have to re-learn year after year.
What's an example of a problem that needs that troubleshooting? (Curious)
Honestly? The exact problem I'm dealing with at work right now.

We're trying to re-write our recommender for artist music stations at iHeartRadio (aka "I'll listen to Drake or Kendrick Lamar's station at the gym today"). Just today, I tried adding negative sampling to the matrix I'm factorizing, hoping it encourages spread in the embeddings learned for artists in certain types of genres.

I have a MS, but not a lot of research experience. It would have taken me a while to find this solution on my own. However, the moment I described this problem to my manager - a PhD graduate with several years of research and industry experience - he immediately suggested negative sampling.

What I learned during my MS helped me grok the math immediately. We're adding noise to the training set and penalizing vector lengths to avoid overfitting. Easy! Identifying a solution worth exploring? Not easy, at least without a degree or significant experience.

(There's also the chance I should know this, in which case I have some reading to do. ¯\_(ツ)_/¯)
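
For readers who want to see the idea in code, here is a minimal sketch of negative sampling on an implicit-feedback matrix factorization. It is not the recommender described above: the logistic loss, the toy play data, and every name and hyperparameter (`train_mf`, `n_neg`, `reg`, and so on) are assumptions made for illustration. Sampled unobserved (user, item) pairs act as the "noise" added to the training set, and the `reg` term is the penalty on vector lengths.

```python
import math
import random


def train_mf(positives, n_users, n_items, dim=8, epochs=200,
             lr=0.05, reg=0.01, n_neg=2, seed=0):
    """Logistic matrix factorization trained by SGD with sampled negatives."""
    rng = random.Random(seed)
    users = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    items = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    observed = set(positives)

    def step(u, i, label):
        pu, qi = users[u], items[i]
        dot = sum(a * b for a, b in zip(pu, qi))
        # Gradient of the log-likelihood with respect to the dot product.
        err = label - 1.0 / (1.0 + math.exp(-dot))
        for k in range(dim):
            pu_k, qi_k = pu[k], qi[k]
            # The reg term shrinks vector lengths to limit overfitting.
            pu[k] += lr * (err * qi_k - reg * pu_k)
            qi[k] += lr * (err * pu_k - reg * qi_k)

    for _ in range(epochs):
        for u, i in positives:
            step(u, i, 1.0)  # observed interaction: push the score up
            for _ in range(n_neg):
                j = rng.randrange(n_items)
                if (u, j) not in observed:
                    step(u, j, 0.0)  # sampled negative: push the score down
    return users, items


def score(users, items, u, i):
    """Dot product of a user vector and an item vector."""
    return sum(a * b for a, b in zip(users[u], items[i]))


# Hypothetical toy data: users 0-1 play items 0-1, users 2-3 play items 2-3.
plays = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3), (3, 2), (3, 3)]
users, items = train_mf(plays, n_users=4, n_items=4)
print(score(users, items, 0, 0), score(users, items, 0, 2))
```

Without the sampled negatives, nothing in the loss pushes unrelated pairs apart, so every score can drift upward together. The negatives are what encourage spread between the two groups.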

Aren't you sort of glossing over the fact that he is in a high-up machine learning position at a company that specializes in recommender systems? Doesn't that by itself increase the likelihood that he deeply understands implicit and explicit matrix factorization?

I am a good ways through my masters (second CS degree, first specializing in ML), and the more I learn, the more I realize that on any given topic, there is no guarantee the PhD in the room has the most expertise. Machine learning is a broad field that contains many subfields, methodologies, and many applications. It is a bit like computer systems or software engineering: nobody knows it all, people who are experts have intimate knowledge of a specific subset of the field. Of course, you can move around over time, but it takes years to build up expertise in even two or three subfields of machine learning.

Side note: sounds like we do similar work. I work at Vevo, also do a lot of matrix factorization to learn latent factors of items such as artists, videos, etc.

> Aren't you sort of glossing over the fact that he is in a high-up machine learning position at a company that specializes in recommender systems? Doesn't that by itself increase the likelihood that he deeply understands implicit and explicit matrix factorization?

Sure thing, but someone in that position needs years of experience in recommender systems, as well as working with researchers.

Folks are hanging on to the PhD part of my claim, instead of the "PhD or experience" part. The fact is, a PhD + prior industry work means the person is close to a decade of relevant background, grad degree or not. They will unstick a co-worker far faster than an experienced backend developer with, say, a year of Keras experience.

> Side note: sounds like we do similar work. I work at Vevo, also do a lot of matrix factorization to learn latent factors of items such as artists, videos, etc.

Seems like it! Email me if you'd like to chat some more offline (it's in my profile).

That has little to do with a PhD, it's the kind of thing you get with experience leading to a deeper understanding.

3D programming started as a field where only PhDs had any deep understanding of what was going on simply because they had experience when nobody else did. You see this pattern repeated frequently, in any complex domain.

Yeah, I expected this reply.

The PhD is sufficient but not necessary here, right? A PhD researcher's job description is basically "learn necessary math, become a domain expert, and publish papers advancing that domain." It's difficult (but possible) to gain the same experience in industry if you don't have a graduate degree. Which company would pay you to work through Bishop or Goodfellow for a few months? Even a principal DS doesn't get that deal, much less a junior/associate.

Also remember: my comment addressed non-vanilla cases. In your example, this is the difference between a researcher advancing 3D programming and someone using Unity or Unreal.

(Also, sorry for all the edits. Done now!)

I would say a PhD is sufficient to advance the field. That's no small thing, but only really overlaps at the start when just about anything advances the field and you need a broad focus.

Machine learning for sorting peas at high speed is a very well trodden area at this point with a lot of industry specific domain knowledge. I expect self driving cars for example to reach a similar state in ~10-25 years.

The risk with a PhD is you miss the specific wave. But, if you want to stay on the bleeding edge it's probably well worth it.

You can spend many months working through papers and books without a company paying you for that. That's something that I continually do and have always done, in my own time (and many different fields). Sufficient and not necessary indeed.
ali rahimi alludes to the problem of google engineers simply needing to tweak models that were previously tuned by google researchers who do have well-developed intuition [0]. because the intuitions in explicit form are at best heuristic and not necessarily even consistent, signing up to improve a model without them might result in spending indefinite time and compute resources without guarantee of positive results. which is a terrible perf-theoretic strategy...

[0] http://www.argmin.net/2018/01/25/optics/

Model divergence, nonsense predictions. The whole black art of ML (specifically neural nets) is coaxing them into working.

If you take some sophisticated deep neural net and try to train it on a binary classification where tails occurs 99% of the time - unless you specifically take measures to correct for this bias - the net will just learn to predict tails.
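
A minimal sketch of why this happens and of one common countermeasure, assuming a hypothetical 99/1 split of tails and heads. The helper names are made up for illustration. A "model" that always predicts tails scores 99% accuracy while never finding a single heads, and inverse-frequency class weights are one simple way to make the rare class count for more in the loss.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` examples that were predicted as such."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in relevant) / len(relevant)


def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    classes = sorted(set(y))
    return {c: len(y) / (len(classes) * y.count(c)) for c in classes}


# Hypothetical dataset: tails occurs 99% of the time.
labels = ["tails"] * 9900 + ["heads"] * 100
always_tails = ["tails"] * len(labels)

print(accuracy(labels, always_tails))         # high accuracy...
print(recall(labels, always_tails, "heads"))  # ...but no heads ever found
print(balanced_class_weights(labels))
```

With these weights a missed heads costs about 100 times as much as a missed tails, so "always predict tails" stops being the cheapest solution. Resampling the data or changing the evaluation metric are other options.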

Fairness and fighting adversarial examples come to mind.
Unless you work for a company obviously known for their ML the "expertise" out there right now is brutal. People are building recommendation engines without knowing the very, very, very basics like Jaccard indexes, ROC Curves, or topic drift. I've even had to explain type two error to someone working on one of these before.

I agree with your general thrust, and you're right, messy data is often 95% of the problem, but even going through just the Google courses will put people in the top 15% in most cities.

I took a machine learning graduate-level course from Andrew Ng himself, and I don't recall learning about Jaccard indexes or topic drift. Maybe your sense of what counts as "very, very, very basic" is skewed toward your own experience. There's a phenomenon known to psychologists where people tend to think that the stuff that they know is very easy and basic, so they conclude that anybody who doesn't know what they know must be uneducated. But then it turns out that the person you think is uneducated knows about a bunch of surprising stuff that you don't. I can't remember the term for this phenomenon, but I often remember it whenever I find myself beginning to judge another person's expertise. This phenomenon is also super relevant to the failings of most technical interviews, in my opinion.
There's a bit of snobbiness in different areas of tech, although there also are in different areas of academia and research. At the end of the day, the most successful people are the ones who wouldn't dismiss a DS who didn't know "Jaccard index" or "the Halting Problem".
Are you referring to the Curse of Knowledge?

https://en.wikipedia.org/wiki/Curse_of_knowledge

What you are describing also sounds a little like the Dunning-Kruger effect:

https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect

Oh, wish I knew the name of that phenomenon as well.
You probably can't communicate effectively. If you are describing "Type two error" of course you will get eyes glossing over. A huge problem with research fields is their terse banal labels. Confusion matrix anyone?
Or you can just say "false negative", and every CS major will understand you.

I find people in Math and CS have often very different names for the same type of concepts and they could easily understand each other if they stuck to the more common terms.

In this case, saying: TYPE 2 ERROR, makes you look like you are trying too hard.

It's also extremely confusing because very few people remember type 1 vs 2 but false positive/negative has intuitive understanding.
type ii error is statistics, not mathematics. there is no equivalent concept in CS because type ii error relates specifically to statistical inference and hypothesis testing.

that said, if you are just pointing to a box in a confusion matrix and saying "TYPE II ERROR," you are probably trying too hard.

Eh, but if you've taken a machine learning course, you should have seen the notion of false positive/false negative when you cover any kind of classification technique.
but they're not actually equivalent, in spite of tables like this [0]. type ii error is a false negative result in the context of a test, where you have to understand which hypothesis is which and exactly what you are accepting or rejecting (hypotheses are not always as simple as hotdog/not-hotdog); if your listener doesn't know what statistical tests mean or wasn't following the setup, they have to stop you and ask.

[0] https://en.wikipedia.org/wiki/Type_I_and_type_II_errors#Tabl...

Granted, Type II error and confusion matrices are covered in more basic statistical classes, and are indeed important for hypothesis testing.
I think the point the parent might have been making is that many people (or maybe just me) know "type II error" by the far more self-explanatory name of "false negative".
What's a "type two" error?

I had to google it. It's a false negative.

A "Type 1" error, is a false positive.
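
A small sketch of the mapping, assuming a binary classifier where the null hypothesis is "this example is negative". That assumption is what lets a false positive be read as a Type I error and a false negative as a Type II error. The labels and predictions are made-up illustration data.

```python
def confusion_counts(y_true, y_pred):
    """Count the four cells of a binary confusion matrix (1 = positive)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred and truth:
            counts["TP"] += 1
        elif pred and not truth:
            counts["FP"] += 1  # Type I error: "positive" called on a negative
        elif truth:
            counts["FN"] += 1  # Type II error: a real positive was missed
        else:
            counts["TN"] += 1
    return counts


# Hypothetical ground truth and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(confusion_counts(y_true, y_pred))
```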

Is this like how people overuse the term "orthogonal"?

"Type I" and "Type II" errors are some of the stupidest and most obfuscatory academic terminology ever invented, and (as an academic) I absolutely refuse to make the effort to learn which way round they go. Just call the bloody things what they are: false positives and false negatives. (Getting seriously OT now, but Kahneman does something annoyingly similar with his talk of "System 1" and "System 2" in Thinking Fast and Slow).
As a computer scientist/programmer, there are three numbers, 0, 1, and infinity. If you're going to index your errors by the natural numbers, and you've got Type 1 Errors and Type 2 Errors, my next question is what a Type 3 Error is, and you know what my next question after that is.

Otherwise, please take this wisdom from programmers, who deal with this sort of thing all the time, and use an enumeration, in this case, {False Positive, False Negative} will do just fine.

When you are designing a hypothesis test, the terms positive and negative are not so clear. For example, you can test whether the mean weight of bags is greater than 5.0 kg or smaller than 5.0 kg; the two tests are different, and sometimes you can accept both greater and smaller than 5.0 kg. The philosophy of hypothesis testing is not as clear as a standard pregnancy test. In other words, in some cases the H0 hypothesis is symmetric (>= versus <=) and it is not clear what a positive result should be; you have to state clearly what the H0 hypothesis is. In a pregnancy test everyone agrees that H0 is that you are not pregnant. That is, in my HO, the semantic difference between a Type 1 error and a false positive.
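
To make the bag example concrete, here is a rough sketch using a one-sided z-test. The sample weights and the assumption of a known standard deviation (`sigma`) are invented for illustration. The same data gives opposite answers depending on which alternative is tested, so "positive" only has meaning once H0 is written down.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sided_z_test(sample, mu0, sigma, alternative):
    """p-value for H0: mean == mu0 against a one-sided alternative.

    `alternative` is "greater" or "less"; `sigma` is assumed known.
    """
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    if alternative == "greater":
        return 1.0 - NormalDist().cdf(z)
    if alternative == "less":
        return NormalDist().cdf(z)
    raise ValueError("alternative must be 'greater' or 'less'")


# Hypothetical bag weights in kg.
weights = [5.1, 5.3, 4.9, 5.2, 5.4, 5.0, 5.2, 5.3]

p_greater = one_sided_z_test(weights, mu0=5.0, sigma=0.2, alternative="greater")
p_less = one_sided_z_test(weights, mu0=5.0, sigma=0.2, alternative="less")
print(p_greater, p_less)
```

Rejecting H0 in the "greater" test and rejecting it in the "less" test are different events. A Type I error is a wrong rejection in whichever test was actually set up.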
What are some synonyms to Kahneman's System 1 and System 2 then? Because Type 1 and 2 errors seem to be completely equivalent to false positives and negatives. I think Kahneman motivates his decision to introduce the terms System 1 and 2 quite well in his book, and I don't know of any direct counterparts.
Jonathan Haidt proposed a similar system in his book "The Happiness Hypothesis". He called it the automatic and controlled sides. The automatic side/system 1 is also what's being described in the book "The Inner Game of Tennis". I would summarize the two sides as the reflexive and the deliberate sides.
If the title of his book is justified, maybe the fast and slow systems?
Ahaha, I agree so much with the Kahneman jibe. How could someone so smart pick names so f*cking dumb?!
So, to put this in human terms.

A false positive or false negative, can be like a pregnancy test.

A false positive, can be where the pregnancy test shows your wife is pregnant, but she is not. And the baby never arrives. Phew, dodged a bullet!

A false negative, can be where the pregnancy test shows your wife is not pregnant, but she really is. And 9 months later, a baby accidentally pops out. Oh crap!

Of course false-positives can also be bad. To use your example "you then spend $10,000 prepping for the baby, but it never comes, what a waste!"
> Jaccard indexes

Funny you mention Jaccard; I was looking up if IoU (Intersection over Union) has any other name known to ML people when I was preparing my self-driving car presentation (IoU is used in semantic segmentation), and found out it is called Jaccard index as well. To my surprise, all ML experts I know knew about IoU but nobody about Jaccard. I guess it might depend on which university you attended?

i think the term is more prominent on the NLP side, via information retrieval (IR) and clustering. i first saw it in IR, and you'd see it in stanford's CS 124 or CS 224N, for example. if the parent is talking about people who are working on a system that has text-understanding component, i can understand their surprise.
For anyone else who wondered what the Jaccard index is, it's also referred to as Intersection over Union.

...and if you haven't come across that either, see https://en.wikipedia.org/wiki/Jaccard_index for details.
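
As a quick illustration, a minimal sketch of the Jaccard index on plain Python sets. The artist names are made up, and returning 1.0 for two empty sets is a convention chosen here.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention for two empty sets (an assumption here)
    return len(a & b) / len(union)


# Hypothetical listening histories for two users.
user_a = {"drake", "kendrick", "sza"}
user_b = {"kendrick", "sza", "frank"}
print(jaccard_index(user_a, user_b))  # 2 shared out of 4 distinct -> 0.5
```

For segmentation the same function applies if each mask is represented as a set of pixel coordinates, which is where the IoU name comes from.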

Was the project the person was working on having any success? If so, you might be unfairly ignoring their positive contributions and focusing only on this one negative sign that you observed.
My ML model is 99.7% sure you're a gatekeeper. Might be a type 1 error, though.
What's topic drift?
It seems to be a specialized term referring to the change in focus of blogs [0] and online communities [1] over time. This strikes me as a very specialized concept, rather than a generally-important term in machine learning as a whole.

Edit: To add my perspective, with years of industry experience and graduate-level machine learning coursework, I have never before encountered this term.

[0]: https://link.springer.com/chapter/10.1007/978-3-319-16354-3_...

[1]: http://catb.org/jargon/html/T/topic-drift.html

Also seems closely related to the concept of stationarity in time series analysis [https://en.wikipedia.org/wiki/Stationary_process]
maybe he meant concept drift? https://en.wikipedia.org/wiki/Concept_drift
I did, yes.
We have to separate AI researcher and implementation engineer. These types of crash courses help get you to the point where you can reasonably work under PhD level people and write code to test, scale, and deploy their ideas.

For many current applications of ML this is acceptable because you're just stealing an idea from a paper or stealing ImageNet to recognize your problem. For anything else you really need to pay up and fight with Google for a real expert.

Exactly. Somewhat akin to graphics programming. There are much smaller groups that work on actually building 3D graphics engines, however, many developers take those engines and use them to build successful applications and games.
So we need to wait for Unity of ML? With Asset Store selling models and datasets.
Will we reach a state where ML is as accessible for implementers as SQL databases? I still remember the time when databases were only for experts.
Yes hopefully. Take a look at BayesDB (and the underlying crosscat algorithm) and probabilistic programming.
>you can reasonably work under PhD level people and write code to test, scale, and deploy their ideas.

Which phd, though? All PhDs are not equal (see politics vs computer vision). Also, PhDs are hardly the holy grail of demonstrating capability, accuracy or intellect, especially given the reproducibility crisis, phds as a measure of any of those things should be used carefully.

They are talking about PhDs in Machine Learning of course
They really don't. There is a link to the Wikipedia page for matrix multiplication. If these are the people you want to hire, you might as well start outsourcing or generating random numbers
Gate keeping is only obsolete when it ceases to have impact. The reality right now is that ML is extremely hard to enter even for a very knowledgeable and deeply experienced but non-credentialed (by degree) person.

It will be interesting to see how the situation evolves but my own observations are that people trying to enter the space might be better off getting a quickie masters if they can afford the time or cost than to try and bootstrap it.

Even people getting a quickie masters is hit/miss in my experience. At the end of the day, successful machine learning engineers require a whole suite of different skills, both technical, communicative, and even life skills that don't really exist for software devs. Not all those can be taught in 3 months, 2 years or even 6 years.
> both technical, communicative, and even life skills that don't really exist for software devs

Not a fan of this "data scientist is a unicorn" style of thinking. The best people in any profession (especially software engineering) also use these skills in their day-to-day work.

Data science isn't yet as stratified as software engineering, so there's less room for those without those "unicorn" skills. 10 years ago, there was no room at all. 10 years from now, there will probably be plenty of undergrads hired as junior data scientists.
Life skills? Communicative skills? What?
IMO essentially ML experts don't work in a bubble and may interface with potentially anyone at a company; C-level, engineering, product, marketing, ops, etc etc. What other tech-employee needs that flexibility? So, I grouped communication / life skills into being able to understand, read, interpret and ultimately provide value to potentially any team. Just having the technical skills will only get you so far.
Isn't this part of what most software engineering degrees teach though? Particularly surrounding project planning and requirements gathering?
I might be an outlier, but I interface with all of those on a daily basis in my role as a software engineer
IMO software engineering experts (leaders) need to do the same thing.
I agree with this comment. My experience has been that people don't really look at your resume unless you have machine learning experience on your resume or one of the stats type majors
> "you can't use AI/ML unless you have a PhD/5 years research experience"

This has not been true for a few years. But the fact that you can use it doesn't mean you understand what is happening and why it works in development but not in production. Everybody can copy a Jupyter notebook and train a TensorFlow model on ImageNet. Now go to a new domain with very little information, like 3D models, and create a new network to be trained on that dataset. How many people who can train on ImageNet can do the latter? Even within deep learning, experts in image classification fail in reinforcement learning domains and need a couple of years to be completely productive.

I fully agree with you that after a MOOC you've barely scratched the surface and until you're implementing them yourself then you're not going to jump into a ML job.

However personally I view the rest the opposite way round. Getting through a course on Deep Learning takes months [0]. Then reading through Keras code once you understand the appropriate NNs is easy.

For example it takes a while of going through Neural Networks to understand ResNets. But if you understand ResNets then looking through Keras code that creates a ResNet [1] is easy.

If I want to build a NN of any sort in Keras I can just Google for it. However there's no simple Googling you can do to teach yourself NN in an easy to follow structured way.

[0]: https://www.deeplearning.ai/

[1]: https://github.com/Hyperparticle/one-pixel-attack-keras/blob...

Understanding NNs is easy. Understanding, collecting, and cleaning up data is the hard part.

Also, DL != ML.

Paraphrasing "The Tao of Network Protocols": If all you see is DL, you see nothing.

There are a tremendous number of people outside of programming who spend much or all of their work time collecting, cleaning up, and understanding data. Think teachers, accountants, traders - essentially everyone who spends a lot of time in spreadsheets.
The parent was referring to Keras which is a NN API hence why I responded talking about NN.
Isn't this meant to be an introduction? I'm not sure who comes out of a crash course assuming they're an expert.
> taking and passing a crash course doesn't mean you'll be an expert in the field (and this applies for most MOOCs, honestly)

You're stating the painfully obvious here. I doubt anyone reading HN is under the impression that they'll be an expert after a single online course.

This is just a marketing stunt by Google to ensure their tooling is the de facto standard for AI/ML so that Google can dominate the AI/ML market the way they dominated Internet Search.

> taking and passing a crash course doesn't mean you'll be an expert in the field (and this applies for most MOOCs, honestly)

Any field that you can become an expert in with a 6-week course or less is not a field that should be paying even high 5-figure salaries. Or, conversely, any field which pays 6-figure salaries is either not accessible via an MOOC, or is massively overinflated and probably in a bubble.

The real barrier to entry to ML is statistics. Most computer science degrees require an intro to statistics class, but if you really want to understand ML and where it should and can be applied appropriately you need a much deeper understanding.

IMO, it's much easier to pick up the programming required for ML than the statistics. This was reflected in the classes I took as a double statistics/computer science major. Most of the people in my CS department's machine learning course were statistics students looking to go into data science, not computer programmers looking to get in on the ML trend.

> Yes, AI/ML MOOCs teach the corresponding tools well, and the creation of new tools like Keras makes the field much more accessible. The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

The problem is that having a hammer makes one see everything as a nail. Sure, given a suitably clean set of images, anyone who's done a couple of tutorials will be able to apply a pre-trained neural net to them to get something.

The hard part is getting an understanding of what tweaks to use when, and when to give up on a method. Otherwise, it is very easy to get carried away and waste time/resources.

For that, one needs to develop a good understanding of the landscape of ML algorithms, why each of them works and how they could break. That typically takes (intensive) experience or an understanding of the theory. Otherwise you'll be doing a brute-force search through a list of possible algorithms. As they say, "a few days in the lab might save a few hours in the library..."

Yes, things can get painful during hiring because the process is broken as it is, with additional complications due to not knowing how to vet for quality in a nascent field. But the "ML elite" are not morons and they don't mean to be obnoxious gatekeepers.

> The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

So what are they hoping to achieve with this course? I'm genuinely asking because part of me wants to take the course, but another part of me feels like what's the point if, even after many additional courses to build up a skill set, Google wouldn't hire you as an ML engineer unless you basically start your career over as a junior engineer, but in machine learning, at another company.

I have a feeling that they're trying to get people familiar with TensorFlow and thus very compatible with their cloud computing services.
I dunno.. for this level of ml scikit/numpy is way more accessible than tensorflow.
Look at it from an interview perspective. If I ask "are you interested in exploring ML", and you're enthusiastic, my next questions are : What have you done? Have you taken any courses? GitHub? Blog Posts?

If the answer is that you're waiting for a special sign that it's worth doing before making an effort, then that really tells me that your enthusiasm for doing ML is not reality-based. Doing the ML thing is a pretty different mindset from other software jobs.

No but you'll ready yourself for the (geometric) programming of the next twenty years or so.
The other day I met with someone who was visiting my city to attend a big ML conference. In the course of our discussion, it transpired this person did not know the Halting Problem. He'd "heard of" Turing machines, but nothing more than "hearing" of them.

Gatekeepers shouldn't keep gates just for gatekeeping's sake. But if so-called ML experts don't even know undergraduate computer science, that should really give you pause before you open up your wallet for them.

I could have the same reverse worldview:

"I attended a big software dev conference. Someone I met did not know about data bias. They heard of gradient boosting but nothing more than hearing of it. If so-called dev experts don't even know undergraduate statistics, that should really give you pause before you open up your wallet for them."

Maybe I'm an iconoclast, but I'd respect that person more for not trying to bullshit his way out of it.
That's a great point. It shines a positive light on the gentleman I spoke with, and a negative light on the industry as a whole (if bs is so rampant that merely admitting not knowing something makes someone shine)
Why does an ML-expert need to know the halting problem?

Considering that ML is really a CS-oriented form of statistics, why would you expect a statistician to know CS theory?

Thinking more, it's the misleading names ("machine learning", "AI") that rustle my jimmies so much.

Sure, you don't need to know the halting problem to approximately solve MNIST by fitting a million-parameter curve to a dataset.

But you're misleading people if you're claiming to have any kind of insight into how computers can be made intelligent, or how computers can "learn", when you don't even know the halting problem.

I disagree. Frankly, for a lot of people and a lot of contexts, I don't think the halting problem is particularly important. You're using understanding of it as a shibboleth for exposure to common curricula about theoretical computation. But you can even know a lot about practical computation and not know anything about the halting problem. Curious: has your knowledge of the halting problem ever actually saved you time or effort in your work? If so, how?

Turing's work on the limitations of his machine is interesting, and I'm sure people with a deep understanding of it can advance the study of computation.

I think you're just being dismissive of skillsets which aren't your own. I think you're just bothered by the fact that AI and ML are being advanced more by people with more knowledge of linear algebra and statistics than computer science. And realize that it's the arrogant among them that will dismiss you as "just a technician."

Anyone who is looking down on either "scientists" or "technicians" should get over themselves.

> Curious: has your knowledge of the halting problem ever actually saved you time or effort in your work? If so, how?

Not OP, but I work a lot with ontologies. Some ontology representation languages are undecidable, while others are not very expressive but can be manipulated in polynomial time. Had I not known that, I would still be like "crap, why does it take so long? I must have a bug somewhere, maybe I should switch to C".

> AI and ML are being advanced more by people with more knowledge of linear algebra and statistics than computer science.

Just answered OP about that, but actually, symbolic AI is pure computer science. It does not get as much publicity as ML currently, but believe me, it's everywhere: at the core of almost all package managers, like Debian's apt-get or Maven, at the core of most advanced static code analyzers, etc.
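
To make the package manager point concrete, here is a minimal sketch of dependency resolution posed as a satisfiability problem. The package names and constraints are invented for illustration, and the brute-force search is only a stand-in: it is not how apt-get or Maven are implemented, and real resolvers use far more sophisticated solvers.

```python
from itertools import product

def solve(variables, clauses):
    """Brute-force SAT: return the first satisfying assignment, or None.

    Each clause is a list of (name, wanted_value) literals; a clause holds
    when at least one of its literals matches the assignment.
    """
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[name] == wanted for name, wanted in clause)
               for clause in clauses):
            return assignment
    return None

# Hypothetical packages and constraints, made up for this example.
packages = ["app", "lib_v1", "lib_v2", "plugin"]
clauses = [
    [("app", True)],                                        # install app
    [("app", False), ("lib_v1", True), ("lib_v2", True)],   # app needs lib v1 or v2
    [("app", False), ("plugin", True)],                     # app needs plugin
    [("plugin", False), ("lib_v2", True)],                  # plugin needs lib v2
    [("lib_v1", False), ("lib_v2", False)],                 # v1 and v2 conflict
]

result = solve(packages, clauses)
print(result)
```

The search tries every combination, so it takes exponential time in the number of packages. That is the kind of cost where knowing the complexity class of your problem tells you whether to look for a bug or for a different formulation.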

This is a recent shift led by the ML trend. Traditionally (like 5 years ago), ML and AI were two different things, AI being the term for symbol manipulation: expert systems, inference engines, constraint programming, and SAT solving, for instance. These domains are typical CS stuff: inference, complexity classes, low-level representation of data, etc. You don't need that much knowledge of math/statistics to be proficient in those fields, but you'd better know what the halting problem is.

I'm working in the symbolic AI field, and sometimes use ML techniques. They are complementary. To me, ML is about induction, AI is about deduction. They don't solve the same kinds of problems and they tend to work pretty well together.

I guess "dynamic programming" must really bother you. That field was named completely arbitrarily, to secure funding.

The more you look around, the more you find science concepts are named for marketing purposes.

Heck, "data scientist" is a bit of nonsense.

Does this just come down to a semantic idea that if something isn't in pursuit of AGI, it's not really AI? That feels unfair to most of these researchers, who absolutely disagree with that.

And to consider these algorithms to not "learn" is similarly unfair. They do. They learn to solve specific problems (at least right now), but they do learn.

would you not expect your hypothesized (theoretical) ML expert to understand boosting, which is generally explained in terms of PAC learning, which draws on computational complexity?

that said, i'd also expect a phd in statistics to be able to figure out boosting without taking an undergrad course that worked up from automata. so the halting problem test, while it does capture something, may not be quite right.

Why? Most of that cruft is abstracted away, computation only gets cheaper over time (a world-class AI rig costs ~30k, a decent one 2k), and most applications of ML run on commodity hardware.
For one thing, it suggests that they are actually technicians, not the scientists they're selling themselves as.

That's fine if you want a technician (and if they're charging technician's rates).

I think when it comes to ML the CS experts with limited statistics knowledge are the technicians and the statistics experts with limited CS knowledge are the scientists, not the other way around.
And then that technician rate is X times what a technician rate would be for pure software dev... What is your point?
> But if so-called ML experts don't even know undergraduate computer science

To be fair, machine learning seems more closely related to applied mathematics (statistics/optimization) than to computer science.

How does knowing that it's impossible to predict whether an infinite loop exists in a piece of code yield an actionable piece of wisdom that this ML expert should have?

I'd suppose that most developers, formal education or not, would have encountered an infinite loop at some point in their initial work with iteration or recursion.

How does knowing that Turing proved you can't predict this bug in a piece of code change anything?

I might genuinely be missing something important here - not trying to be snarky in my questioning.

It seems like obviously infinite loops are a disastrous bug for critical code - but what does knowing the formal name of the problem and background of its discovery give you?

I could understand if you were arguing in favor of test code or static analysis.
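
For anyone who hasn't seen it, the argument behind the result being discussed fits in a few lines. This is only a sketch: `make_paradox`, `optimist`, and `pessimist` are names made up here, and the two "deciders" are deliberately trivial stand-ins for any function that claims to predict halting.

```python
def make_paradox(halts):
    """Build a program that does the opposite of what `halts` predicts for it."""
    def paradox():
        if halts(paradox):
            while True:      # predicted to halt, so loop forever instead
                pass
        return "halted"      # predicted to loop, so halt immediately
    return paradox

def optimist(program):
    """Hypothetical decider that claims every program halts."""
    return True

def pessimist(program):
    """Hypothetical decider that claims no program halts."""
    return False

# The pessimist predicts that this program never halts,
# yet calling it returns straight away, so the prediction is wrong.
program = make_paradox(pessimist)
outcome = program()
print(pessimist(program), outcome)

# make_paradox(optimist)() is left uncalled on purpose: it would loop forever,
# which contradicts the optimist's prediction in the other direction.
```

Whatever `halts` you plug in, the program built from it makes that decider wrong about at least one input, which is why no fully general loop detector can exist and why static analyzers have to settle for approximations.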

"turing machines" are cs 101?
Amended that to "undergraduate computer science"
> The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

It's the main reason why I decided to present a talk at the next PyCon Italy, as a very junior data scientist, to inspire other Python developers to learn some practical machine learning. If I could do it (and already use it for a work project), many other people can do it too (and no, I don't even have a degree in CS, just years of work experience).

>> The obsolete gatekeeping by the AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience" is one of the things I really hate about the industry.

I'm going to have to ask who exactly are those AI/ML elites who say "you can't use AI/ML unless you have a PhD/5 years research experience".

TLDR: Taking a class is fine, but nothing beats real-world practice.

Wonder where I've heard this one before. :)