Hacker News new | ask | show | jobs
by xamuel 3031 days ago
The other day I met with someone who was visiting my city to attend a big ML conference. In the course of our discussion, it transpired this person did not know the Halting Problem. He'd "heard of" Turing machines, but nothing more than "hearing" of them.

Gatekeepers shouldn't keep gates just for gatekeeping sake. But if so-called ML experts don't even know undergraduate computer science, that should really give you pause before you open up your wallet for them.

7 comments

I could have the same reverse worldview:

"I attended a big software dev conference. Someone I met did not know about data bias. They heard of gradient boosting but nothing more than hearing them. If so-called dev experts don't even know undergraduate statistics, that should really give you pause before you open up your wallet for them."

Maybe I'm an iconoclast, but I'd respect that person more for not trying to bullshit his way out of it.
That's a great point. It shines a positive light on the gentleman I spoke with, and a negative light on the industry as a whole (if bs is so rampant that merely admitting not knowing something makes someone shine)
Why does an ML-expert need to know the halting problem?

Considering that ML is really a CS-oriented form of statistics, why would you expect a statistician to know CS theory?

Thinking more, it's the misleading names ("machine learning", "AI") that rustle my jimmies so much.

Sure, you don't need to know the halting problem to approximately solve MNIST by fitting a million-parameter curve to a dataset.

But you're misleading people if you're claiming to have any kind of insight into how computers can be made intelligent, or how computers can "learn", when you don't even know the halting problem.

I disagree. Frankly, for a lot of people and a lot of contexts, I don't think the halting problem is particularly important. You're using understanding of it as a shibboleth for exposure to common curricula about theoretical computation. But you can even know a lot about practical computation and not know anything about the halting problem. Curious: has your knowledge of the halting problem ever actually saved you time or effort in your work? If so, how?

Turing's work on the limitations of his machine are interesting, and I'm sure people with a deep understanding of them can advance the study of computation.

I think you're just being dismissive of skillsets which aren't your own. I think you're just bothered by the fact that AI and ML are being advanced more by people with more knowledge of linear algebra and statistics than computer science. And realize that it's the arrogant among them that will dismiss you as "just a technician."

Anyone who is looking down on either "scientists" or "technicians" should get over themselves.

> Curious: has your knowledge of the halting problem ever actually saved you time or effort in your work? If so, how?

Not OP, but I'm working a lot with ontologies. Some ontologies representations are undecidable, while other languages are not very expressive but can be manipulated in polynomial time. Had I not known that, I would still be like "crap, why does it take so long? I must have a bug somewhere, maybe I should switch to C".

> AI and ML are being advanced more by people with more knowledge of linear algebra and statistics than computer science.

Just answered OP about that, but actually, symbolic AI is pure computer science. It does not get as much publicity as ML currently, but believe me, it's everywhere: at the core of almost all package managers, like debian's apt-get or maven, at the core of most advanced static code analyzers, etc.

This is a recent shift lead by the ML trend. Traditionally (like 5 years ago), ML and AI were two different things, AI being the term for symbol manipulation. Expert systems, inference engines, constraint programming, SAT solving for instance. These domains are typical CS stuff: inference, complexity classes, low-level representation of data, etc. You don't need that much knowledge in math/statistics to be proficient in those fields, but you rather know what the halting problem is.

I'm working in the symbolic AI field, and sometimes use ML techniques. They are complementary. To me, ML is about induction, AI is abut deduction. They don't solve the same kinds of problems and they tend to work pretty well together.

I guess "dynamic programming" must really bother you. That field was named completely arbitrarily, to secure funding.

The more you look around, the more you find science concepts are named for marketing purposes.

Heck, "data scientist" is a bit of nonsense.

Does this just come down to a semantic idea that if something isn't in pursuit of AGI, its not really AI? That feels unfair to most of these researchers who absolutely disagree with that.

And to consider these algorithms to not "learn" is similarly unfair. They do. They learn to solve specific problems (at least right now), but they do learn.

would you not expect your hypothesized (theoretical) ML expert to understand boosting, which is generally explained in terms of PAC learning, which draws on computational complexity?

that said, i'd also expect a phd in statistics to be able to figure out boosting without taking an undergrad course that worked up from automata. so the halting problem test, while it does capture something, may not be quite right.

Why? Most of that cruft is abstracted away, computation only gets cheaper over time (a world class AI rig cost ~30k, a decent one for 2k) and most applications of ML run on commodity hardware.
For one thing, it suggests that they are actually technicians, not the scientists they're selling themselves as.

That's fine if you want a technician (and if they're charging technician's rates).

I think when it comes to ML the CS experts with limited statistics knowledge are the technicians and the statistics experts with limited CS knowledge are the scientists, not the other way around.
And then that technician rate is X times what a technician rate would be for pure software dev.. what is your point?
> But if so-called ML experts don't even know undergraduate computer science

To be fair, Machine learning seems more closely related to applied mathematics - statistics/optimization than to computer science.

How does knowing that it's impossible to predict whether an infinite loop exists in a piece of code yield an actionable piece of wisdom that this ML expert should have?

I'd suppose that most developers, formal education or not, would have encountered an infinite loop at some point in their initial work with iteration or recursion.

How does knowing that Turing proved you can't predict this bug in a piece of code change anything?

I might genuinely be missing something important here - not trying to be snarky in my questioning.

It seems like obviously infinite loops are a disastrous bug for critical code - but what does knowing the formal name of the problem and background of its discovery give you?

I could understand if you were arguing in favor of test code or static analysis.

"turing machines" are cs 101?
Amended that to "undergraduate computer science"