| I have an RSI and I've been coding by voice exclusively for about 7 years. I used a system built on top of Dragon for most of that and in the last year switched to Talon. I think there are multiple reasons: * The obvious market is dictation of natural language, but this isn't what you want for voice control. If you try to use long descriptive phrases as your command language, everything takes forever. So instead you end up making your own mini command language where all of your common actions are a single syllable, but now it's no longer English or whatever other natural language users already know. So now your product has a substantial learning curve, just like learning a new keyboard layout. * Everything other than Talon has terrible latency. Most existing speech recognition engines were not designed with the kind of latency you want for quick one-syllable commands. * In order for it to be really effective you need the cooperation of applications (this is why I've written extensive Emacs integration). Some tools like Windows Speech Recognition try to hook in at the UI layer in order to figure out what text is in dialog boxes and such, but in practice they seem to do a pretty terrible job. Windows Speech Recognition has a very hard time consistently understanding which links you are trying to get it to click on, for example. There's also a long tail of applications that just do their own custom UI rendering inside a blank canvas where no hook is possible. * Good speech recognition, even if not specifically targeting computer voice control, is a genuinely hard research problem, and standard benchmarks for accuracy are misleading. You see "95% accuracy" and you are like, wow, that's a high percentage, computers almost have this speech recognition thing solved, and then you think about it harder and you go, wait a minute, that's one mistake every 20 words! 
Maybe you are still impressed, but then you have to take into account that when the computer does the wrong thing you'll need to issue more commands in order to correct it, which are also likely to be misinterpreted. When you make a typo with a keyboard the mistakes rarely cascade; you just hit backspace. |
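
To put rough numbers on the "95% accuracy" point, here is a back-of-the-envelope sketch. It is a toy model, not a measurement of any real engine: it assumes every word or command is recognized independently with the same probability, and that each misrecognized command costs one undo plus one retry, both of which can be misrecognized in turn. The function names are made up for illustration.

```python
def expected_errors(words: int, accuracy: float) -> float:
    """Average number of misrecognized words in a phrase of the given length."""
    return words * (1.0 - accuracy)


def prob_all_correct(words: int, accuracy: float) -> float:
    """Chance that every word in the phrase is right (assumes independent errors)."""
    return accuracy ** words


def expected_utterances(accuracy: float) -> float:
    """Average utterances needed to land one intended command.

    Assumed model: a miss does the wrong thing, so you must undo it and
    then retry, and the undo and the retry each cost the same expected
    amount E as any other command:
        E = 1 + (1 - p) * 2 * E   =>   E = 1 / (2p - 1)
    The expectation only converges when p > 0.5.
    """
    if accuracy <= 0.5:
        raise ValueError("corrections never converge at or below 50% accuracy")
    return 1.0 / (2.0 * accuracy - 1.0)


if __name__ == "__main__":
    for p in (0.99, 0.95, 0.90, 0.80):
        print(
            f"accuracy={p:.2f}  "
            f"errors per 20 words={expected_errors(20, p):.2f}  "
            f"P(20 words all right)={prob_all_correct(20, p):.3f}  "
            f"utterances per command={expected_utterances(p):.3f}"
        )
```

Under these assumptions, 95% accuracy means a 20-word phrase comes out entirely right only about 36% of the time, and the cost of corrections grows quickly as accuracy drops because the corrections themselves fail. Real errors are not independent, so treat the outputs as illustration only.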