by elaus 1202 days ago
Thanks a lot for making this! Just last week I was trying out the transcription APIs of AWS and Google Cloud and they produced rather bad results for a German interview (wrong punctuation and capitalization, about 1 misheard word per sentence).

I didn't know OpenAI had an API for that as well, but now I was able to try it out and it's orders of magnitude better: perfect spelling and only 1 wrong word in 2 minutes of audio (an abbreviation), and even that one I was able to understand. It even filters out filler words!

You just saved me literally hours of work by showing the powers of OpenAI!

(Reading this back it sounds like an ad, but I'm in no way affiliated with any of those services. I'm just very happy.)

1 comment

Note that you can run OpenAI's Whisper locally. The model and tools are open source. It is finicky to set up if you're not a Python dev. Just wanted to let you know that it's an option, and it works literally exactly as well. You can even choose to sacrifice quality for speed of conversion. The experience of using it will just be a lot... geekier: a command-line call that produces a text format with timestamps on every line.
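For anyone curious what the local route looks like from Python rather than the command line, here is a rough sketch. It assumes the open-source `openai-whisper` package is installed (`pip install openai-whisper`) and that `ffmpeg` is on your PATH. The helper names (`format_timestamp`, `format_segments`, `transcribe_locally`) and the timestamp layout are made up for illustration and are not the tool's own output format. Smaller model names such as "tiny" or "base" trade quality for speed; larger ones such as "medium" or "large" go the other way.

```python
import sys


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def format_segments(segments) -> str:
    """Turn Whisper-style segments into one timestamped line each.

    Each segment is expected to be a dict with "start", "end" (seconds)
    and "text" keys, as found in the result of model.transcribe().
    """
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} --> {end}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_locally(audio_path: str, model_name: str = "base", language=None) -> str:
    """Transcribe a file on this machine (hypothetical helper).

    Imported lazily so the formatting helpers above work without Whisper.
    """
    import whisper  # from the openai-whisper package; assumed installed

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path, language=language)
    return format_segments(result["segments"])


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python transcribe.py interview.mp3
    print(transcribe_locally(sys.argv[1]))
```

For a German interview like the one mentioned above, passing `language="de"` skips the automatic language detection.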