Hacker News
by jerf 5344 days ago
I'd also observe that "let's create an 'AI' by piling on the special cases until we have a generally-capable tool" has been tried numerous times, and it's a known failure case. After a certain point, the piled-on rules begin negatively interacting with each other, and making effective use of them requires one of 1. true AI (thus begging the question) or 2. treating the set of rules as one of the quirkiest programming languages ever.

Many people are speculating about how wonderful Siri will be in the (near) future; I'd submit that the evidence suggests that it has pretty much come out of the gate with all the power it's going to have for the foreseeable future. Natural language querying seems to have been stuck at the same plateau for a long time, just like voice recognition technology has been.

2 comments

Maybe for domain independent stuff, but for a domain specific thing (like scheduling), I think heuristics could go a long way. Yes, it's like a programming language, and yes, it has quirks.

Just making sure the weak AI is clear about its interpretation through explicit confirmation ("Sir, I understand that you would like to launch the missiles at Russia in 15 minutes, am I correct?" "No. Siri, please book lunch with my sisters at the Russian restaurant on the 15th.") would probably make up for a lot.
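To make the confirmation idea concrete, here is a rough Python sketch of a "read it back before acting" step. Everything in it is hypothetical and made up for illustration (the `Interpretation` type, `confirm_then_run`, and the `ask`/`execute` callbacks); it is not how Siri or any real assistant works, just one way the loop could look.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Interpretation:
    # Hypothetical parsed command: an action name plus a paraphrase to read back.
    action: str
    paraphrase: str


def confirm_then_run(
    interp: Interpretation,
    ask: Callable[[str], str],
    execute: Callable[[Interpretation], None],
) -> bool:
    """Read the interpretation back and run it only on an explicit yes."""
    question = f"I understand that you would like to {interp.paraphrase}, am I correct?"
    answer = ask(question).strip().lower()
    if answer in {"yes", "y", "correct"}:
        execute(interp)
        return True
    # Anything other than a clear yes is treated as "do nothing".
    return False
```

In use, `ask` would be whatever prompts the user and `execute` whatever actually performs the action. The design choice is that a misheard command costs one extra round trip instead of a wrong action, since only an explicit yes triggers `execute`.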

Is voice recognition technology really on a plateau? I have highly accurate speaker independent speech recognition in my pocket now. I'm using it to dictate this response. It didn't require any training, and it's nearly perfect. I may be mistaken, but I believe this capability is relatively new. Even if it's not, it's so close to being perfect that there's little room for improvement, or so it seems.

And yes, I know that all of the smarts are not in my pocket, but rather in the cloud.

I bet you're enunciating clearly and that the mic is not picking up much background noise. (Note that you may be in an environment with some noise, but there are easy ways to create noise-cancelling microphones or directional microphones that are very effective. You'd have to check the recording to see how much noise is on it.) I could get the same results from Dragon's voice recognition software with careful enunciation and a bit of practice 10 years ago. It is also well-known how to get very good accuracy on a restricted dictionary. What has not been solved is improving beyond that. In situations where humans easily extract speech, so easily that we do things like casually lay music tracks over a speaker without much thought, software will still just fail miserably, as far as I know.

Was Dragon's software speaker-independent ten years ago, or did you have to train it? I looked into this recently and couldn't find any speaker-independent PC software even now; I think all of it required training. Being able to just pick up and talk without any preliminaries is still a pretty big deal.

I'm sure you're right about the other deficiencies. "Almost perfect" is very strong, after all. Still, it's really excellent, and in my experience is much better than it was just a few years ago.