by dresaj8 3394 days ago
does anyone know of good ways to do the opposite, speech to text?
5 comments

Not really. I keep my eye on this area as I generally transcribe my podcasts. But compared to ~$1.50/minute for human transcriptions that require minimal touchup for even fairly tech-heavy topics, nothing I've seen that's purely ML/computer-based comes close to being worth my time to deal with.
Depends on what you mean by good. Chrome supports the SpeechRecognition API.

https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecog...

i'm more thinking of ways to programmatically turn long audio files into indexable text.
Julius[1] can do this. But the accuracy depends on the language model you are using, and unfortunately the free English language model (VoxForge) is not the best.

[1] http://julius.osdn.jp/en_index.php

I'm unaware of an ML-based solution, but GCE has an endpoint that _can_ do this, though it is better at short sentences.
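
Since short inputs tend to work better, one workaround is to cut a long recording into short pieces before sending each one off for recognition. Below is a minimal sketch of that idea using only Python's standard `wave` module. The function name `split_wav`, the 15-second default, and the `chunk_NNNN.wav` naming are my own assumptions for illustration, not part of any speech service's API, and it only handles uncompressed WAV input.

```python
import wave


def split_wav(src_path, chunk_seconds=15, prefix="chunk"):
    """Split a WAV file into consecutive chunks of at most chunk_seconds each.

    Returns the list of chunk file paths, in order.
    Hypothetical helper: names and defaults are illustrative only.
    """
    paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Number of audio frames that make up one chunk.
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of the source file
            out_path = f"{prefix}_{index:04d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channels, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            paths.append(out_path)
            index += 1
    return paths
```

Each chunk can then be submitted to whatever recognizer you use and the results joined in order. Note that fixed-length cuts can land in the middle of a word, so splitting on silence would probably give cleaner transcripts.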
Google Cloud has a Speech API that supports 80 languages. There is a demo: https://cloud.google.com/speech/
Lex by AWS. It's the same deep learning tech used by Alexa.
does lex actually translate speech to text for you? i was under the impression that it was for conversational bots.
Yea.. you are right.. I just assumed that would fit any use case where speech to text is needed. Which clearly is wrong.
Doesn't the Google API work for you? I thought it worked perfectly.