by TheOtherHobbes 2954 days ago
I don't think the messianic cheering is the real problem.

Here's a short and not very complete list of unsolved meta-problems:

Even when products and systems are revolutionary, there are unexpected negative consequences (e.g. FB and Cambridge Analytica)

All systems can be trolled and abused, and if they can be, they will be (e.g. fake reviews on Amazon etc.)

AI doesn't actually work all that well yet. (Neither Siri nor Alexa truly pass a conversational Turing test, which means there's a lot of guessing about whether or not any novel request will generate a useful response.)

IT systems and products of all kinds are brittle, unreliable, and often downright stupid. Users don't trust updates and feature changes, and often they're right to do so. Given that, why would AI "products" be any better or more reliable?

2 comments

> AI doesn't actually work all that well yet. (Neither Siri nor Alexa truly pass a conversational Turing test...)

When we pass the Turing test it means we've got actual AI.

But I'm not sure there'll ever be a clear line. So Duplex kind of passes it in a very narrow context. Whether the person at the other end of the line was actually fooled is a slightly different question. They could have just been humouring what they figured was a weird automated system.

But it's not like Google won't improve exponentially with this. They've now got a basic AI conversation system that they hope people will use, feeding it data from actual conversations.

So Duplex v2 will have an expanded system where they can handle ten times the number of scenarios and questions.

The more I think about it, the more impressive it seems. Most attempts at a Turing test are text only where the subject is supposed to be a 13yo immigrant boy. Here Google's jumping straight to voice conversations.

I don't think there's a clear line either. Even the Turing Test is notional - a conversation with a high schooler is going to be easier to fake than a conversation with an English professor.

I can imagine in the future there will be some kind of approximate conversational AI rating analogous to Flesch-Kincaid for text.

But I left a problem off my list, which is that we unconsciously demand AI should be better than average human performance.

If you monitor your conversations with people, you'll find there are regular misses where one person either mishears words, doesn't understand what's said, or misinterprets a subtext.

We cut human conversations a lot of slack. We're used to thinking of humans as independent agents, and there are social conventions about asking for more information and admitting - or sometimes denying - mistakes.

But there's an unconscious expectation that AI should operate at a better-than-human level before it's considered reliable.

We're more likely to think "Stupid machine!" when something isn't understood than we are with a human. So AI will have to cross the Uncanny Turing Valley before we really trust it. And because we're dealing with automated interpretations of human agency, errors will be harder to forgive.

You can already see this with driverless cars, where any accident is considered a failure. Even though statistically an AI may be much safer than the average human, it's not considered good enough unless it can deal with situations that an average human would have no hope of dealing with.

> we unconsciously demand AI should be better than average human performance.

Yeah I agree it's an unfair demand.

Especially given how much more powerful human brains are than computers, we should perhaps be having a go at humans for not trying hard enough.

Computers' wins at things like Go and chess have been downplayed because humans 'only' learned that stuff 100,000 years ago.

Personally I think that driverless cars work better as passive systems that augment humans for the moment rather than the dodgy crossover that is Autopilot. I think that car AIs can be trained to deal with extreme circumstances by running simulations of crashes millions of times over and then they're capable of taking over if the driver ever becomes unwell or hits black ice.

But this is all temporary. As soon as their vision systems match humans', they will only ever improve on what we have. This Stanford self-driving car sliding between four perfect donuts is amazing [0].

[0]: https://youtu.be/LDprUza7yT4?t=31m38s

> Neither Siri nor Alexa truly pass a conversational Turing test

Even with the constrained grammar, neither of them passes the test of reliably producing the same result for the same phrase under good conditions. If the error rate is well into the double digits for simple structured queries using a constrained vocabulary, it seems like the Turing test is still pretty far off.