| HN Mirror


by rickoooooo 1302 days ago

I played with Mycroft about two years ago. I had been using a couple of Google Home Minis for a while for the usual things (play Spotify, set timers, ask the weather, control lights around the house). They worked perfectly for that. At the time I decided to de-Google my life and take back my privacy so I went looking for something open source that would provide me more control of my data. I found Mycroft and played with it for a few months.

I was pretty excited about it. I bought a ReSpeaker 2.0, which is an embedded device that can run Linux and has a six microphone array. I designed a custom 3d-printed case to hold the ReSpeaker and a small speaker to make my own little "Jarvis" box (Iron-man reference).

My favorite part about the whole thing was the customization. I wrote a couple of skills to do some other things for me. For example, I could say "Where can I watch X?" and it would use an API to search for a TV show or movie to see where it was available on Netflix, Amazon Prime, Disney+, etc. and let me know. It's always been annoying to go Google and try to figure out where I can watch something streaming online, but limited to only the services I currently subscribe to. I wrote another skill that tied into my couchpotato instance so I could say "Download the movie X" and it would go find it and download it. If it found multiple matches, it would read off the top few matches and let me choose the correct one. I even tied those skills together so if the first skill couldn't find a movie at one of my streaming services it would ask if I wanted to download it and I could simply say "yes". I also modified the code to use a custom text-to-speech API so I could configure Mycroft to use a custom voice.
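A minimal sketch of how the two skills described above might chain together, kept free of any Mycroft or streaming API so it runs on its own. Every name here (`where_to_watch`, `MY_SERVICES`, `fake_lookup`) is hypothetical, and `lookup` is only a stand-in for whatever availability API the real skill called.

```python
# Hypothetical illustration only; not the actual skill code.
MY_SERVICES = {"Netflix", "Amazon Prime", "Disney+"}


def where_to_watch(title, lookup, my_services=MY_SERVICES):
    """Return what to say, and whether to offer a download as a fallback.

    `lookup(title)` is assumed to return a list of service names
    on which the title is currently available.
    """
    available = set(lookup(title))
    # Keep only the services the user actually subscribes to.
    mine = sorted(available & set(my_services))
    if mine:
        return {"speak": f"{title} is on {', '.join(mine)}.",
                "offer_download": False}
    # Nothing matched: hand off to the download skill with a yes/no prompt.
    return {"speak": f"{title} is not on your services. Download it?",
            "offer_download": True}


def fake_lookup(title):
    """Stand-in for a real availability API (made-up catalog)."""
    catalog = {"Movie A": ["Netflix", "Hulu"], "Movie B": ["Hulu"]}
    return catalog.get(title, [])


if __name__ == "__main__":
    print(where_to_watch("Movie A", fake_lookup)["speak"])
    print(where_to_watch("Movie B", fake_lookup)["speak"])
```

In a real skill, the `offer_download` flag would presumably drive a yes/no follow-up question before handing the title to the download skill.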

It was all really cool and I had a lot of fun playing with it. The biggest problem I ran into was the wake-word recognition. It worked mostly OK for me on the ReSpeaker from close range but I found as I moved away it went downhill. It was especially bad if I had my device playing music, which is possibly the most common thing I was using my Google Home Mini for. I had hoped that the ReSpeaker would help with this, because it had the six-microphone array and some built-in loopback hardware to try and cancel out any noise that was being generated by the ReSpeaker. So any sound output to the speakers would be looped back into the ReSpeaker and could be subtracted from the microphone's input. I found that I just couldn't get it to work well, though. I think the music was causing vibrations that were overloading the microphone array and causing it to be unable to hear me through the music. It's possible it could be improved with a better hardware design to help reduce vibration caused by the device's own speaker. Maybe it works better now, two years later. I think I had configured Mycroft to use Snowboy for wake-word recognition so I could name my Mycroft something else (Jarvis).
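To illustrate the loopback idea in general terms (subtracting the known speaker output from what the microphone hears), here is a rough sketch using a normalized LMS adaptive filter. This is a textbook technique swapped in for illustration, not a description of what the ReSpeaker hardware or its drivers actually do. The simulated echo path (a single delayed, attenuated copy of the speaker signal) is an assumption, and real rooms, vibration, and microphone clipping are far messier.

```python
import random


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively filtered copy of `ref` from `mic`.

    mic: samples picked up by the microphone (voice + speaker echo)
    ref: loopback samples of what was sent to the speaker
    Returns the residual signal after echo subtraction.
    """
    w = [0.0] * taps      # current estimate of the echo path
    buf = [0.0] * taps    # recent reference samples, newest first
    out = []
    for m, r in zip(mic, ref):
        buf = [r] + buf[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, buf))
        e = m - echo_est  # residual: ideally just the near-end voice
        norm = sum(x * x for x in buf) + eps
        # Normalized LMS update: nudge weights to shrink the residual.
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        out.append(e)
    return out


def simulate_echo(n=2000, delay=3, gain=0.6, seed=0):
    """Toy setup: the mic hears only a delayed, attenuated speaker signal."""
    rng = random.Random(seed)
    ref = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mic = [gain * ref[i - delay] if i >= delay else 0.0 for i in range(n)]
    return mic, ref


def power(samples):
    return sum(s * s for s in samples) / len(samples)


if __name__ == "__main__":
    mic, ref = simulate_echo()
    residual = nlms_echo_cancel(mic, ref)
    print("echo power before:", power(mic[1000:]))
    print("echo power after: ", power(residual[1000:]))
```

The subtraction only works while the microphone signal stays a clean, linear function of the loopback signal. If the speaker physically shakes the microphones or drives them into clipping, as suspected above, that relationship breaks down and no amount of filtering fully recovers the voice.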

One day the Mycroft installation just stopped working on my device after I hadn't touched it in a week or more and I never went back to figure out what was wrong. It's still sitting on the corner of my desk unplugged. If I could have got the wake-word recognition working reliably with music playing I think I would have used it a lot, but I wasn't able to at the time.

I just recently bought a smart watch with a built-in "Alexa" app that allows you to send voice commands to your phone which get processed through the watch's official app. I'm instead using Gadgetbridge on Android to interface with the watch. Some kind hacker updated Gadgetbridge to add very basic support for my watch's microphone, allowing you to send the raw voice data to an external application. I'm hoping I'll be able to use this to revive my Mycroft instance and I'll just send voice commands to Mycroft from my watch/phone via a custom Android app/service. In theory, I'll be wearing the watch all the time anyway and having the microphone on my person and right next to my face should hopefully help with the speech-to-text and I won't have to worry about a wake word at all. I've only just barely started working on this, though.

1 comments

adam1028 1302 days ago

I gave up on mycroft after a long wait and built my own with respeaker and picovoice. i have 2 of them with different wake words. imo it's way better and easier than snowboy. i dont understand why people give their data to amazon to set a timer :)

check their free stuff: https://picovoice.ai/pricing/


rickoooooo 1302 days ago

You are using picovoice as the assistant? Is it an entire solution for that? Or are you running a DIY Mycroft device with picovoice as the wake word detector? I'll have to check this out but I've been trying to stick with open source technologies where I can. I don't trust that a free tier will remain free forever, but it may be worth testing out.


adam1028 1302 days ago

oh no, i meant mark II, the speaker. I use their picovoice sdk. it has wake word and intent detectors - Porcupine and Rhino. https://picovoice.ai/docs/picovoice/ and https://ttsmp3.com/
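A rough sketch of the two-stage flow mentioned above: a wake word detector gates an intent detector. It deliberately does not call the Picovoice SDK. `detect_wake` and `feed_intent` are hypothetical stand-ins for the two engines, and the "frames" are plain strings instead of audio, so this only shows the control flow.

```python
def run_pipeline(frames, detect_wake, feed_intent):
    """Idle until the wake word fires, then feed frames to the intent engine.

    detect_wake(frame) -> bool             (hypothetical wake word detector)
    feed_intent(frame) -> intent or None   (hypothetical intent detector;
                                            None means "not finished yet")
    """
    listening = False
    intents = []
    for frame in frames:
        if not listening:
            # Stage 1: cheap always-on wake word check.
            if detect_wake(frame):
                listening = True
        else:
            # Stage 2: intent detection runs only after the wake word.
            intent = feed_intent(frame)
            if intent is not None:
                intents.append(intent)
                listening = False  # go back to waiting for the wake word
    return intents


def make_intent_collector(end_marker="END"):
    """Toy intent engine: joins words until it sees `end_marker`."""
    words = []

    def feed(frame):
        if frame == end_marker:
            intent = " ".join(words)
            words.clear()
            return intent
        words.append(frame)
        return None

    return feed


if __name__ == "__main__":
    frames = ["noise", "jarvis", "set", "timer", "END", "noise", "timer"]
    print(run_pipeline(frames,
                       detect_wake=lambda f: f == "jarvis",
                       feed_intent=make_intent_collector()))
```

Anything said while the pipeline is idle (the trailing "timer" here) is ignored, which is the point of gating on a wake word.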

i was going to add picovoice's speech to text with rasa https://rasa.com/ but i didnt have time, will give it a try over the holidays.

I see your point but not every open source project lives forever, like sonos killed snips.


adam1028 1302 days ago

seems rhasspy is good too https://news.ycombinator.com/item?id=22703035
