	by Nexxxeh 3741 days ago
	But presumably you could have the Pi listen for a trigger word or whistle or whatever using software running locally, and when triggered, kick over to the Alexa API?

4 comments

tmuir 3741 days ago

You could set up an IFTTT "Do" button [1] with the Maker Channel, which lets you make an arbitrary web request. Then run a server locally that receives the request and triggers the recording. Node-RED [2] would make setting up that server pretty simple.

[1] https://ifttt.com/products/do/button
[2] http://nodered.org/
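
For illustration, the "server running locally" part could look roughly like the Python sketch below, using only the standard library. The `/trigger` path, port 8080, and the `start_recording` function are assumptions made up for this example; they are not part of IFTTT or the Alexa API, and the Pi would also have to be reachable by the incoming web request.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(on_trigger):
    """Build a handler class that calls on_trigger() on POST /trigger."""

    class TriggerHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path == "/trigger":  # hypothetical path, chosen for this sketch
                on_trigger()
                self.send_response(200)
                self.end_headers()
                self.wfile.write(b"recording started\n")
            else:
                self.send_response(404)
                self.end_headers()

        def log_message(self, fmt, *args):
            pass  # keep the console quiet

    return TriggerHandler


def start_recording():
    # Hypothetical placeholder: this is where audio capture and the
    # hand-off to the voice service would go.
    print("trigger received; start capturing audio here")


if __name__ == "__main__":
    # Port 8080 is an arbitrary choice for the example.
    HTTPServer(("0.0.0.0", 8080), make_handler(start_recording)).serve_forever()
```

The web request configured on the button would then point at `http://<address-of-the-Pi>:8080/trigger`.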

IshKebab 3741 days ago

There's no good open source software for trigger-word detection.

Also, you need a microphone array to do it reliably (the Echo has 7 microphones).

jjwiseman 3741 days ago

Yes, but it won't work as well as the Echo, especially in a noisy environment.

jjwiseman 3741 days ago

To expand a little on this: The Echo has a 7-microphone array, which is crucial to speech recognition accuracy. This gives it the best far-field recognition ability of any consumer product I've seen: it stays accurate even if you're across the room with music playing. That's just the hardware, and replicating its abilities will not be easy.

On the software side, supposedly they're using Nuance for recognition. Nuance isn't cutting edge: in the tests I've done, Nuance's Word Error Rate (WER) is 10%-20% higher than Google's, but it's still much better than something like Pocketsphinx or any other open source recognizer.
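
For anyone unfamiliar with the metric, WER is usually computed as the word-level edit distance between the reference transcript and the recognizer's output, divided by the number of reference words. A minimal sketch follows; the function name and the example sentences are made up for illustration and are not from the tests mentioned above.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word and one substituted word out of five reference words.
    print(word_error_rate("turn on the kitchen lights", "turn on kitchen light"))
```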

A lot of factors go into making a speech interface a good experience for users: good recognition accuracy even with background noise, good voice activity detection (again, even with background noise), very accurate word spotting, and low latency. It's hard to do all of these well enough to make the interface usable.
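
To give a sense of why voice activity detection is hard with background noise, here is about the crudest possible approach: mark a frame as speech when its RMS energy exceeds a fixed threshold. This is only a sketch; the frame length and threshold are arbitrary assumptions, and nothing here reflects what the Echo actually does. Any steady noise louder than the threshold would be flagged as speech.

```python
import math


def frame_rms(frame):
    """Root-mean-square amplitude of one frame of PCM samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_voice(samples, frame_len=160, threshold=500.0):
    """Return one boolean per full frame: True if the frame is 'loud'.

    frame_len=160 corresponds to 10 ms at 16 kHz; both defaults are
    arbitrary values picked for this example.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_rms(frame) > threshold)
    return flags


if __name__ == "__main__":
    silence = [0] * 320
    loud = [2000, -2000] * 80  # 160 samples of a loud square wave
    print(detect_voice(silence + loud + silence))
```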

Slippery_John 3741 days ago

That's against the Alexa Voice Service ToS, though.
