by spiralganglion 5344 days ago
I can't shake the feeling that the Siri of today is like the app ecosystem of the iPhone 1.

• That Apple has a really solid plan for this feature, and we're only seeing the very beginning of it now — the phase where we are introduced to the interface, before they blow the lid off and open it up to every imaginable use-case.

• That it will be significantly improved before most people ever buy a supporting device, so the handful of customers being burned by the somewhat-lacking version 1 product are vastly outnumbered by the people who get their first taste of the mature, fully-realized vision.

4 comments

Unlike some of Apple's other features, UI, Apps, and operating software, Siri is not something that will be easy to improve. Siri represents the cumulative efforts of decades of computer science research by numerous public and private entities.

While it's easy to add more voice actions, making advances on the underlying technologies will require additional decades of hard computer science research. Apple, having no R&D division, will not likely even contribute to this.

Unless your main complaint is a lack of canned question types that it can answer, you won't likely see the fast improvements you are expecting in the next few years.

I'd also observe that "let's create an 'AI' by piling on the special cases until we have a generally-capable tool" has been tried numerous times, and it's a known failure case. After a certain point, the piled-on rules begin negatively interacting with each other, and it requires one of 1. true AI (thus begging the question) or 2. treating the set of rules as one of the quirkiest programming languages ever to make effective use of it.

Many people are speculating about how wonderful Siri will be in the (near) future; I'd submit that the evidence suggests that it has pretty much come out of the gate with all the power it's going to have for the foreseeable future. Natural language querying seems to have been stuck at the same plateau for a long time, just like voice recognition technology has been.

Maybe for domain independent stuff, but for a domain specific thing (like scheduling), I think heuristics could go a long way. Yes, it's like a programming language, and yes, it has quirks.

Just making sure the weak AI is clear about its interpretation through explicit confirmation ("Sir, I understand that you would like to launch the missiles at Russia in 15 minutes, am I correct?" "No. Siri, please book lunch with my sisters at the Russian restaurant on the 15th.") would probably make up for a lot.
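To make the confirmation idea concrete, here is a minimal sketch of a single scheduling heuristic that reads its interpretation back before acting. Everything in it is hypothetical and purely illustrative: the `Booking` type, the one regex pattern, and the `parse`, `confirmation`, and `handle` helpers are my own assumptions, not how Siri works.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Booking:
    """Hypothetical structured form of one narrow kind of request."""
    event: str
    people: str
    place: str
    day: int


# One hand-written heuristic for "book <event> with <people> at <place> on the <day>th".
# A real system would need many such rules; this is only an illustration.
PATTERN = re.compile(
    r"book (?P<event>\w+) with (?P<people>.+?) at (?P<place>.+?) "
    r"on the (?P<day>\d{1,2})(?:st|nd|rd|th)",
    re.IGNORECASE,
)


def parse(utterance: str) -> Optional[Booking]:
    """Return a Booking if the utterance matches the rule, else None."""
    m = PATTERN.search(utterance)
    if m is None:
        return None
    return Booking(m["event"], m["people"], m["place"], int(m["day"]))


def confirmation(b: Booking) -> str:
    """Read the interpretation back to the user before doing anything."""
    return (
        f"I understand that you would like to book {b.event} with {b.people} "
        f"at {b.place} on day {b.day} of the month, am I correct?"
    )


def handle(utterance: str, answer: str) -> str:
    """Act only when the rule matched and the user explicitly said yes."""
    b = parse(utterance)
    if b is None:
        return "Sorry, I did not understand that."
    if answer.strip().lower() not in {"yes", "y"}:
        return "Okay, nothing was booked."
    return f"Booked {b.event} with {b.people} at {b.place} on day {b.day}."
```

The point of the design is that a misheard request either fails to match the narrow rule or gets rejected at the confirmation step, so the quirks of the heuristic stay visible to the user instead of silently turning into actions.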

Is voice recognition technology really on a plateau? I have highly accurate speaker independent speech recognition in my pocket now. I'm using it to dictate this response. It didn't require any training, and it's nearly perfect. I may be mistaken, but I believe this capability is relatively new. Even if it's not, it's so close to being perfect that there's little room for improvement, or so it seems.

And yes, I know that all of the smarts are not in my pocket, but rather in the cloud.

I bet you're enunciating clearly and that the mic is not picking up much background noise. (Note that you may be in an environment with some noise but there are easy ways to create noise-cancelling microphones or directional microphones that are very effective. You'd have to check the recording to see how much noise is on it.) I could get the same results from Dragon's voice recognition software with careful enunciation and a bit of practice 10 years ago. It is also well-known how to get very good accuracy on a restricted dictionary. What has not been solved is improving beyond that. In situations where humans easily extract speech, so easily that we do things like casually lay music tracks over a speaker without much thought, software will still just fail miserably, as far as I know.
Was Dragon's software speaker-independent ten years ago, and did you have to train it? I looked into this recently and couldn't find any speaker-independent PC software now, and I think it all required training. Being able to just pick up and talk without any preliminaries is still a pretty big deal.

I'm sure you're right about the other deficiencies. "Almost perfect" is very strong, after all. Still, it's really excellent, and in my experience is much better than it was just a few years ago.

People often get carried away thinking about linear progression from the current state when the general problem is NP-hard.

With Siri, however, I'm not interested in it being a person, but something that can help set reminders, timers, appointments, and dictate text messages while I drive. That's huge for me. Rather than breadth, if Apple focuses on depth then the problem is more tractable because you have more context with which to reduce complexity.

Apple's R&D is applied, so it will have a product focus and get to market quicker. If they can make Siri really good at a specific number of tasks then people will understand what it is good for, rather than be disappointed, and it will improve faster.

A thing like Siri might benefit tremendously from being open source.

Everybody with an itch to scratch and time on their hands would help to incrementally improve Siri for thousands of specialist applications.

Apple touted Voice Actions in OS X 10 years ago. Why is it different/better this time?
That you don't usually have a mac in your pocket. Voice is only really good when your hands are busy and you don't have a chance to sit down and pull out a laptop.
I'm not sure it's a valid comparison.

Apple sold more iPhone 4s than all previous iPhones combined. I expect that the iPhone 4S will sell even more. Releasing Siri as a beta to such a large customer base is not the same as releasing the "somewhat-lacking version 1" iPhone, and then iterating it forward.

I really don't get this notion of a lot of Apple supporters that just because they put something out and call it AI, Apple has now solidly invented AI. Weren't they usually praised for not announcing things before they are ready? According to your theory, with Siri they would have broken the rule. The current Siri would merely be an announcement that one day they would deliver real AI, and in the meantime they would deliver the current broken version.
In 5 years it might add up to something, but I don't think people should be buying an iPhone 4S because of it right now, when they're probably only going to use it in the first week.