by eiopa 3741 days ago
tl;dr:

It's literally a tutorial on configuring Alexa Voice Services + their sample code on Debian.

The way you interact with it is by clicking on a button in a Java app. No trigger phrase like Echo.

4 comments

But presumably you could have the Pi listen for a trigger word or whistle or whatever using software running locally, and when triggered, kick over to the Alexa API?
You could set up an IFTTT "Do" button [1] with the Maker Channel, which allows you to make an arbitrary web request. Then have a server running locally that can receive the request and trigger the recording. Node-RED [2] would make setting that server up pretty simple.

[1] https://ifttt.com/products/do/button [2] http://nodered.org/
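If you'd rather not use Node-RED, the "server that receives the request and triggers the recording" part can be sketched with nothing but the Python standard library. This is a rough sketch, not a tested setup: the `/trigger` path, port 8080 and the `record_requested` event are all made-up names, and it assumes the Pi is reachable from wherever the web request originates.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set by the HTTP handler, consumed by whatever loop does the recording.
record_requested = threading.Event()


def status_for(method, path):
    """Return the HTTP status for a request; flag a recording on a valid trigger."""
    if method == "POST" and path == "/trigger":  # hypothetical endpoint
        record_requested.set()
        return 204  # accepted, nothing to send back
    return 404


class TriggerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        self.send_response(status_for("POST", self.path))
        self.end_headers()


def make_server(host="0.0.0.0", port=8080):
    """Build the server; call serve_forever() on the result to start it."""
    return HTTPServer((host, port), TriggerHandler)
```

To use it, run `make_server().serve_forever()` in a background thread and have the recording loop block on `record_requested.wait()`, calling `record_requested.clear()` after each recording.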

There's no good open source software to do this.

Also you need a microphone array to do it reliably (the Echo has 7 microphones).

Yes, but it won't work as well as the Echo, especially in a noisy environment.
To expand a little on this: The Echo has a 7-microphone array which is crucial to speech recognition accuracy. This gives it the best far-field recognition ability of any consumer product I've seen, with the ability to stay accurate even if you're across the room, with music playing. That's just the hardware, and replicating its abilities will not be easy.

On the software side, supposedly they're using Nuance for recognition. Nuance isn't cutting edge: In the tests I've done, Nuance has a Word Error Rate (WER) that's 10%-20% higher than Google's, but it's still much better than something like Pocketsphinx or any other open source recognizer.
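For anyone who wants to run this kind of comparison themselves, WER is usually computed as the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. A minimal sketch, assuming plain whitespace tokenization (real evaluations normally normalize case and punctuation first); the function name is my own:

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring "turn on kitchen lights" against the reference "turn on the kitchen light" gives one deletion and one substitution over five reference words.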

There are a lot of factors that go into making a speech interface a good experience for users: Good recognition accuracy even with background noise, good voice activity detection (even with background noise), very accurate word spotting, low latency. It's hard to hit all these things well enough to make the interface usable.
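To give a feel for why voice activity detection is hard with background noise, here is about the simplest possible version: an energy threshold on each audio frame. It's only a sketch, assuming 16-bit little-endian mono PCM and an arbitrary threshold of 500 that I picked for illustration. Anything loud, including music, passes this check, which is exactly the weakness described above.

```python
import math
import struct


def frame_rms(frame_bytes):
    """RMS amplitude of a frame of 16-bit little-endian mono PCM."""
    count = len(frame_bytes) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame_bytes[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def is_speech(frame_bytes, threshold=500.0):
    """Crude check: treat any frame louder than the threshold as speech."""
    return frame_rms(frame_bytes) > threshold
```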

That's against the Alexa Voice Service ToS though
Does it mean that we can run it on any computer that runs Java? I read through the tutorial but couldn't find anything that was specifically tied to the Raspberry Pi.
Probably. The nice thing about the Pi is that it's cheap and has crazy low energy consumption.
The trick left for the "makers" is to add a button to the Raspberry Pi that will let you press it and have the app "listen" to your voice.
From what I understand, the Echo has a specific piece of hardware in it that is 'always listening', and once triggered via voice command the Echo actually begins to listen. So unless you have something connected to the device that could reproduce that initial voice analysis hardware, you can't have the 'always listening' feature.
Probably for power reasons, much like the most recent iPhones can be controlled by "Hey Siri". Older iPhones can always listen too, but they have to be plugged in to do it because they use their main processor and do the detection in software.

In short, always listening isn't difficult on non-battery devices; it's just a software problem.

I made an Alexa clone and use PocketSphinx to listen out for a wake word.

There's a phrase detection function you can configure to trigger audio streaming to the cloud.
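The overall shape of that setup is roughly the following. This is a sketch of the gating logic only, not PocketSphinx's actual API: `heard_wake_word` and `send_to_cloud` are hypothetical callables you would back with your local detector and your cloud client, and the fixed `max_frames` cutoff stands in for proper end-of-speech detection.

```python
from typing import Callable, Iterable, List


def gate_audio(frames: Iterable[bytes],
               heard_wake_word: Callable[[bytes], bool],
               send_to_cloud: Callable[[List[bytes]], None],
               max_frames: int = 50) -> int:
    """Keep audio local until the wake word fires, then forward what follows.

    Returns the number of utterances handed to send_to_cloud.
    """
    utterances = 0
    buffer = None  # None while idle; a list of frames while capturing
    for frame in frames:
        if buffer is None:
            # Idle: only the local detector sees this frame.
            if heard_wake_word(frame):
                buffer = []
            continue
        buffer.append(frame)
        if len(buffer) >= max_frames:
            send_to_cloud(buffer)
            utterances += 1
            buffer = None
    if buffer:
        # Audio ended mid-capture; send what we have.
        send_to_cloud(buffer)
        utterances += 1
    return utterances
```

The point of the structure is that nothing leaves the device while idle; only frames captured after a local detection are forwarded.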