by nutanc 2421 days ago
This is a good, balanced article that gets a lot of things right. We should take a forgiving approach when we talk about AI systems. And as the author points out, the problem is not that AI systems don't have understanding yet. The problem is with the hype which leads many to believe that we are close to building systems which can understand us.

That said, I have a small problem with the examples presented to say that already machines understand us :)

The article says, "For example, when I tell Siri “Call Carol” and it dials the correct number, you will have a hard time convincing me that Siri did not understand my request."

Let me take a shot at explaining that Siri did not "understand" your request.

Siri was waiting for a command and executed the best command that matched. Which is, make a phone call.

It did not understand what you meant because it did not take the whole environment into consideration. What if Carol was just in the other room? A human would maybe just shout "hey Carol, Thomas is asking you to come", instead of making a phone call.

If listening to a request and executing a command is understanding, then computers have been understanding us for a long time. Even without the latest advances in AI.

6 comments

> It did not understand what you meant because it did not take the whole environment into consideration.

This is the crux of the matter. These voice recognition agents are trained with the goal of accurately modelling a function that converts recorded sound to a series of words, and then acting on those words to perform the most appropriate action. They are NOT trained to model the entire world, which is an incredibly complex task that no one has been able to formulate as a problem that computers can solve, yet. Humans, on the other hand, have a machine that is extremely well-equipped to do just that - the brain. And that is exactly why humans are able to "understand" things, while we feel that machines are not, with our definition of "understand".

In the far distant future, if and when we do figure out a way to model the entire world, come up with a suitable objective function, and solve it on a computer, there's no reason why that machine should be any less capable of understanding things than the average human.

I think this is partly down to us humans defining "intelligence" as "like us".

We have a very specific set of evolved traits that define our understanding of the universe. A lot of that is social. So our "understanding" of the phrase "call Carol" includes a wide range of social cues about what that means, and your example is perfect: "call Carol" means that I want to talk to her, and that would be better done in person if possible, but that "if possible" has a more-or-less specific range of "if she's within earshot so I can yell for her", which is limited to the range of a human voice (but not the maximum range, like screaming, but just a normal yelling range). Which is less if the door is closed, or there's music playing, or Kevin is trying to nap in the other room. And not at all if we're in a library, or concert, or even a public space where yelling would draw attention. If "call Carol" has to include all of these to qualify for "understanding" then I think I know some people who fail at this test.

My go-to thought experiment on this is dolphins. Dolphins are intelligent, have language, etc. But their understanding of the world must be so different. Trying to explain to a dolphin what "tripping someone up" means is going to be tricky. They may understand the words, but they'll never understand the concept.

We swim in a sea of social cues and non-verbal communication. We can program an AI to imitate more and more of this, and be aware of more of it, but it's like teaching dolphins about long-distance running. It's never going to come naturally. And they're never going to evolve that understanding naturally (like we do as children) because it's not in their nature. We anthropomorphise our machines a lot, and we assume that they'll grow (like children) to grok all of our social cues eventually, because our only experience of similar situations is, well, children. But they're just machines, designed for a single purpose. They're never going to grok this. They're never going to be "like us" and really understand all the social ramifications of "call Carol". At some point I think we're going to have to accept this, and say that the machine understands the phrase "call Carol" enough. TFA draws the line at the machine calling Carol, and that seems reasonable.

So the next version of Siri can locate Carol's phone in the next room and will just beep her phone to tell her to see you. Of course that's still not understanding.

The classic analogue is of course the Chinese room argument: https://en.m.wikipedia.org/wiki/Chinese_room

Which is an absolutely textbook example of begging the question.

If you could make a machine pass the Turing test it might be intelligent - but no one has, and it's debatable if it's even possible, and it's even more debatable if, hype notwithstanding, the Turing test is even a good test of human-equivalent intelligence, because it ignores side channels that are fundamental to human communication, including tone of voice, posture, and facial expression.

(Yes, people communicate over email/SMS. But no one communicates over email/SMS without an implied social context that hugely limits and simplifies the content of any conversation.)

It's not the "call Carol" problem that needs to be solved. It's the "understand the entire world context well enough to know how to call Carol without being told" problem, which includes being able to research information that isn't already available, and also includes edge cases like "We went to Carol's funeral last week" and "Carol had her phone stolen yesterday" and "Carol is flying to Australia and won't be receiving messages for another 12 hours" and "Carol prefers FaceTime to WhatsApp."

And so on.

Ultimately your toy machine has to show evidence that it understands the entire world and can learn about it like a human can - which includes being able to do original research that isn't a simple literal Google search, parse humour, understand emotional responses and common cultural references, and follow standard social protocols.

That's a much harder problem than having a vaguely plausible limited text-only conversation, whether it's in Chinese, English, or Swahili.

I would call the moving goalposts a subtle sign of a win as well: if previously "unthinkable" tasks now need more qualifiers, we are getting closer. Leaving qualifiers out or adding them makes a test easier or harder. To be a smartass: anything can pass a test that only asks for the text messages of the comatose (that is, nothing), and nothing can reliably "prove" itself a god by, say, resurrecting and teleporting your dead relatives to you, which would obviously be useful but is impossible, as that isn't something text messages can do.
We can't make a machine intelligent because we're not intelligent enough to make it, nor to understand understanding.

> Siri was waiting for a command and executed the best command that matched. Which is, make a phone call.

ISTM there's no more "understanding" involved in this than when I touch the Contacts icon on my screen, then "C", "A", "R", etc until Carol's entry is displayed, and then I touch the Phone icon to initiate a call.

The fact that the interface used was sound-waves that the device recognised as matching the keyword "call" and the contact-list entry "Carol", rather than my finger touching specific areas of the screen, may be a handy feature. Of course it's a triumph of signal processing, fuzzy recognition, etc. But there's no more "understanding" involved than in the touch-screen version of the action, or in typing a command and parameter into a terminal window.
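
As a rough illustration of that point, here is a minimal Python sketch of what "matching the keyword and the contact-list entry" could look like once the audio has already been turned into text. Everything in it is an assumption made up for illustration (the CONTACTS table, the handle_utterance function, the 0.6 fuzzy-match cutoff); it is not how Siri or any real assistant is implemented.

```python
import difflib

# Hypothetical contact list; names and numbers are invented for illustration.
CONTACTS = {"Carol": "555-0101", "Kevin": "555-0102", "Thomas": "555-0103"}


def handle_utterance(text, contacts=CONTACTS):
    """Map already-transcribed text to an action tuple, or None if nothing matches."""
    words = text.strip().split()
    # Only one command is known: the keyword "call" followed by a name.
    if len(words) < 2 or words[0].lower() != "call":
        return None
    spoken_name = " ".join(words[1:]).lower()
    # Fuzzy-match the spoken name against the contact names, ignoring case.
    lowered = {name.lower(): name for name in contacts}
    matches = difflib.get_close_matches(spoken_name, list(lowered), n=1, cutoff=0.6)
    if not matches:
        return None
    name = lowered[matches[0]]
    # The "action" is just a lookup result; nothing here models where Carol is.
    return ("dial", name, contacts[name])
```

Calling handle_utterance("Call Carol") and tapping Carol's entry in a contacts screen would end in the same lookup and the same dial step; only the input channel differs.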

Your neurons don’t understand who Carol is either, they’re just automatically responding to stimulus.

But am "I" - my consciousness, my understanding - nothing more than a collection of impulses traveling around a particular network of neurons?

We don't know.

I'd suggest that anyone who purports to give a definitive answer to that is in fact making a leap of faith - in one direction or another.

> If listening to a request and executing a command is understanding, then computers have been understanding us for a long time. Even without the latest advances in AI.

I think this is a reasonable thing to say, in the limited way he has defined ‘understanding’. People forget what a titanic achievement user interfaces that allow us to communicate our intentions to a computer and receive a relevant response actually are, whether it’s using a voice or clicking a button.

But do we communicate with a mechanical slot machine when we push a coin in and then pull the lever?

> The problem is with the hype which leads many to believe that we are close to building systems which can understand us.

The problem with the hype is that we are nowhere close to building systems that understand anything.

All we've built are calculators on steroids so far.

Worse, we've built a conceptualisation of these phenomena that renders us as calculators on steroids! I don't think it's feasible to account for or describe anything more with the ideas we have so far.

Yes, and not helped by the optimistic appropriation of words like "intelligence" and "learning" to describe the main lines of research.