by shafte 2761 days ago
I seem to have missed the Twitter spat that precipitated this essay, but I don't quite buy the larger argument he's making. We should judge approaches to AI based on their results, not on their conformance to a (vague, incorrect, untested) model of human cognition.

Symbolic AI fell out of favor primarily because it was not delivering results in impactful problem areas. Deep learning is currently popular because we are nowhere near the limit of what results it can produce.

Can this change? Of course! The history of deep learning itself proves as much. But if you want to genuinely influence the direction of the field, you have to lead by example and produce novel/interesting research results, not by kvetching in The New Yorker that your favorite approach is not getting enough attention.

4 comments

Symbolic AI fell out of favor because it was overhyped. It was delivering quite impressive results--just not the promised results. Neural nets fell out of favor in the 90s for exactly the same reason.

Both failures were ultimately caused by insufficient computing power. Even though Deep Learning and Convolutional NNs look like major advances today, they never could have been practical before about 2005: there just wasn't enough computing power.

If modern computer power were thrown at symbolic AI the same way it's been thrown at NNs, it is highly likely that symbolic AI would experience similarly impressive gains.

I don’t have enough knowledge to evaluate the idea that symbolic AI would scale as well as deep learning (or could I say statistical models in general?), but what has held back symbolic AI if not the difficulty of making it work? Surely someone out there has tried in the past 5 years if the reward is that great.
It's mostly because symbolic AI is not fashionable. Most AI practitioners under 40 (and VCs) seem to think AI is DL and only DL.
Google, Apple, etc. tried to make symbolic AI work for their problems but found it had intractable problems. Google fairly publicly transitioned to a deep learning approach for translation which greatly improved their results. Apple is still struggling to make the transition for Siri's question answering.
How might symbolic AI become fashionable?
> If modern computer power were thrown at symbolic AI the same way it's been thrown at NNs, it is highly likely that symbolic AI would experience similarly impressive gains.

What's the basis for this conjecture? Is there a mathematical model for symbolic manipulation that would benefit from parallel execution/GPUs the way ML applications do?

Symbolic AI needs computing power to counteract combinatorial explosion.

https://en.wikipedia.org/wiki/Combinatorial_explosion

The vulnerability of early logic-based AI to combinatorial explosion was the main argument against funding AI research put forward in the Lighthill Report, the document that shut down AI research in the UK in the 1970s and contributed to the AI winter on the other side of the Atlantic, also.

Obviously, today we have more powerful computers so combinatorial explosion is less of an issue, or anyway it's possible to go a bit further and do a bit more than it was in the '70s.
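
To put rough numbers on "a bit further", here is a minimal sketch. It assumes a uniform search tree with a made-up branching factor of 10; the function names are mine and real problems are messier than this.

```python
def search_space_size(branching_factor: int, depth: int) -> int:
    """Count the nodes in a complete search tree, root included."""
    return sum(branching_factor ** level for level in range(depth + 1))


def reachable_depth(node_budget: int, branching_factor: int) -> int:
    """Deepest level that can be fully expanded within a budget of nodes."""
    depth = 0
    while search_space_size(branching_factor, depth + 1) <= node_budget:
        depth += 1
    return depth


BRANCHING = 10  # assumed value, purely illustrative
for budget in (10**6, 10**9):
    depth = reachable_depth(budget, BRANCHING)
    print(f"budget {budget:>13,} nodes -> depth {depth}")
```

Under these assumptions a thousandfold bigger node budget buys only three more levels of exhaustive search (depth 5 becomes depth 8). More hardware helps, but it does not make the explosion go away.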

One area of symbolic AI that actually does benefit from parallel architectures (though not GPUs) is logic programming with Prolog. Prolog's execution model is basically a depth-first search, which lends itself naturally to parallelisation (one branch per search). Even more so given that data in Prolog is immutable (no mutable state, no concurrency headaches).
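
A rough sketch of that idea, in Python rather than Prolog: a depth-first search over immutable tuples, with each top-level branch handed to its own worker. N-queens is just a stand-in problem that I picked, and this is not how any particular Prolog system does it. I use threads only so the snippet runs anywhere; to get one branch per processor you would swap in processes or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def safe(placed: tuple, col: int) -> bool:
    """True if a queen in the next row at `col` attacks none of `placed`."""
    row = len(placed)
    return all(c != col and abs(c - col) != row - r
               for r, c in enumerate(placed))


def dfs(placed: tuple, n: int) -> list:
    """Plain depth-first search; `placed` is never mutated, only extended."""
    if len(placed) == n:
        return [placed]
    solutions = []
    for col in range(n):
        if safe(placed, col):
            solutions.extend(dfs(placed + (col,), n))
    return solutions


def parallel_dfs(n: int, workers: int = 4) -> list:
    """Split the search at the root: one first-row choice per task."""
    branches = [(col,) for col in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_branch = pool.map(lambda branch: dfs(branch, n), branches)
        return [solution for found in per_branch for solution in found]


print(len(parallel_dfs(6)))  # 6-queens has 4 solutions
```

Because every branch works on its own tuple, the workers need no locks and never see each other's partial solutions. That is the "no concurrency headaches" part.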

But, in general, anything people did 20 or 30 years ago with computers can be done better today. Not just symbolic AI or neural networks. I mean, even office work like printing a document is faster today and that doesn't even depend on GPUs and parallel processors.

>> (one branch per search)

Oops, sorry. Meant "one branch per processor".

My conjecture is about e.g. the Rete algorithm for rule search and the likelihood that it could be made more scalable on multicore and distributed hardware with modern functional data structures allowing easy rule updates.

I don't know whether rule search or logic unification could be mapped onto GPU or TPU operations; I suspect not, but it's worth looking into.

You realize someone has to write all those rules - and correctly - right? Are you saying that we literally didn't have the computing power to run all the rules we could actually write? I think you need to support this idea that symbolic approaches failed due to lack of computing power.
Like I say in another comment, machine learning took off in part as a way to avoid having to hand-craft rules for expert systems' rule bases (although machine learning existed as a discipline from the early days of AI). So a lot of work on machine learning in the '80s and '90s went to learning rules.

For instance (also in another comment) Decision Tree learners basically learn a set of If-Then-Else rules. They're one type of symbolic machine learning and there's more where they came from (e.g. Ross Quinlan's FOIL, for First-Order Inductive Learner, which is basically a first-order version of decision trees; Inductive Logic Programming which I study for my PhD; and many, many more). This work has dwindled, but it's still going.

So, no, you don't have to write rules by hand, any more than you need to set the weights of a neural net by hand. You can just learn them.
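
To make "learning rules" concrete, here is a toy sketch in the spirit of ID3-style decision tree learning, flattened into If-Then rules. The data set, feature names and function names are all invented for illustration; it is not Quinlan's actual code and it skips pruning, numeric features and everything else a real learner needs.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def learn_rules(rows, labels, features, conditions=()):
    """Return [(conditions, label), ...] by greedy entropy-based splitting."""
    if len(set(labels)) == 1 or not features:
        return [(conditions, Counter(labels).most_common(1)[0][0])]

    def remainder(feature):
        # Weighted entropy left over after splitting on `feature`.
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

    best = min(features, key=remainder)
    rest = [f for f in features if f != best]
    rules = []
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        rules += learn_rules(sub_rows, sub_labels, rest,
                             conditions + ((best, value),))
    return rules


def format_rule(conditions, label):
    body = " AND ".join(f"{f} = {v}" for f, v in conditions) or "TRUE"
    return f"IF {body} THEN {label}"


# Hypothetical toy data: nobody wrote the rules below by hand.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "play", "stay"]

rules = learn_rules(rows, labels, ["outlook", "windy"])
for conditions, label in rules:
    print(format_rule(conditions, label))
```

The output is a small set of human-readable rules that reproduce the training labels, which is the sense in which the rule base is learned rather than hand-crafted.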

Yeah, so why not just do it yourself?

I really hate this kind of argument, where people strongly believe that "it would work" but don't do it or invest in it themselves. It's like being in a pub and listening to people talk about politics who have never held any responsibility.

There are plenty of domains where symbolic AI has had every opportunity to use unlimited computing power, plus decades of design and experimentation, but has lost to deep learning.

A good area for examples is human games (e.g. chess, Go, Atari games). Symbolic AI has been pushed hard but has lost definitively to deep learning. Furthermore, symbolic approaches had decades of investment, compared with less than a decade for the deep learning approaches.

Another good area for examples is natural language. Marcus admits that deep learning is the only viable approach to "speech understanding" (which really means transcription). He doesn't mention translation which demands a lot more "understanding" and where deep learning excels relative to symbolic approaches, again with decades of investment on the symbolic side and much less on the deep learning side.

Except for inherently symbolic problems like theorem proving, I can't think of any AI domain where deep learning doesn't dominate or seem likely to dominate symbolic approaches.

>> Symbolic AI fell out of favor primarily because it was not delivering results in impactful problem areas.

My understanding is instead that symbolic AI was working pretty damn well for its time. Expert systems routinely outperformed experts, for sure. The AI winter that killed them was brought on by political decisions taken by people who didn't really understand the field.

Here's a good read on that historical period:

Avoiding another AI winter, editorial in IEEE Intelligent Systems.

https://www.computer.org/csdl/mags/ex/2008/02/mex2008020002....

Btw, "Intelligent Systems" is such a funny little expression. Basically, it was used by AI researchers during the 80's AI winter to be able to get funding for their work; because they wouldn't get any if they called it what it was, AI.

It's a long article, but he's advocating for a hybrid approach to deal with problems that DNNs encounter today, which he believes are fundamental to DNNs (reasoning, causality). An example of such an approach taken with success is given at the end: https://arxiv.org/pdf/1711.04574.pdf
How are you judging success here? The paper has no experiments at all, not even toy ones.
> We should judge approaches to AI based on their results

That's why the article discusses examples where currently popular approaches fail.

> not on their conformance to a (vague, incorrect, untested) model of human cognition.

What model of human cognition is "incorrect"? And is it the one presented by Marcus, or a strawman?

> But if you want to genuinely influence the direction of the field, you have to lead by example and produce novel/interesting research results

Are you claiming Marcus has produced no interesting research results?

> not by kvetching in The New Yorker that your favorite approach is not getting enough attention

Why use the term "kvetching"? I'm curious.

> What model of human cognition are you claiming is "incorrect"? And is it the one presented by Marcus, or are you strawmanning?

The model of human cognition I'm referring to is the hybrid connectionist-symbolic one that Marcus is well known for advocating (are YOU strawmanning? lol). I'm criticizing it for being more a theoretical model than one grounded in the physical realities of the brain, which of course no one really understands. Proposing a research program on that basis requires a high burden of proof.

> Are you claiming Marcus has produced no interesting research results?

Yes I am claiming that, if the benchmark for "interesting" is deep learning.

There are indeed areas where deep learning is limited, and hybrid approaches could be superior. I would argue that there is not even close to enough evidence that a hybrid approach has improved generalizable power.

> Why use the term "kvetching"? I'm curious.

Huh? I guess it's the term my mother would use.

>the physical realities of the brain, which of course no one really understands.

No-one in machine learning, yeah, because machine learners mostly don't take neuroscience classes ;-).

> The model of human cognition I'm referring to is the hybrid connectionist-symbolic one that Marcus is well known for advocating

> I'm criticizing it for being more a theoretical model than one grounded in the physical realities of the brain, which of course no one really understands

You're contradicting yourself. On the one hand, you claim Marcus' model is "incorrect". On the other, you claim there's insufficient evidence either way. Which is it?

> are YOU strawmanning? lol

Do you know what the term "strawmanning" means? What could I possibly be strawmanning since I was asking for clarification?

> Proposing a research program on that basis requires a high burden of proof.

As opposed to...?

> Yes I am claiming that, if the benchmark for "interesting" is deep learning.

"Deep learning" isn't the correct benchmark since that's what Marcus is critiquing (to some extent) in the first place.

> I would argue that there is not even close to enough evidence that a hybrid approach has improved generalizable power.

Then you'd be wrong. Here's a good place to start your research:

http://science.sciencemag.org/content/331/6022/1279

> Huh? I guess it's the term my mother would use.

Your mother taught you to describe scientific debate as "kvetching"? That's disappointing.

Can you please stop posting to HN in the flamewar style? You've been doing it a lot. We want thoughtful discussion here, not gotcha vendettas.

In particular, your comments have broken this guideline: "Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."

https://news.ycombinator.com/newsguidelines.html

> Can you please stop posting to HN in the flamewar style? You've been doing it a lot.

Can you clarify? How have I "been doing it a lot"? And how was my comment "in the flamewar style"?

> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize.

Which part of the GP's comment do you think I'm unfairly interpreting?

Here are other recent cases: https://news.ycombinator.com/item?id=18503013, https://news.ycombinator.com/item?id=18447094, https://news.ycombinator.com/item?id=18378261. That is what we're asking you not to do.

HN threads are supposed to be thoughtful conversation—not cross examinations or verbal boxing matches, let alone setups to make other people look bad.

> Are you claiming Marcus has produced no interesting research results?

Not the OP, but I'm not familiar with Marcus' contributions. What would you consider his top contributions? Are they all purely theoretical, or is there something that's already been applied in the real world?