by masukomi 1202 days ago
I dunno about OpenAI as a service, but on my M1 Mac I think Whisper took something on the order of 8x realtime to process with the "large" model. That is to say... 8 minutes of processing for every 1 minute of audio. It was surprisingly not fast. I assume OpenAI's servers have more GPU at their disposal to make this go faster.
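
For a sense of scale, here is a rough sketch of that arithmetic in Python. The function name and the 60-minute episode are made up for illustration; only the 8x figure comes from the comment above, and it was itself an estimate.

```python
def processing_minutes(audio_minutes: float, realtime_factor: float) -> float:
    """Estimate processing time in minutes.

    realtime_factor is processing time divided by audio duration,
    so 8.0 means 8 minutes of processing per minute of audio and
    anything below 1.0 is faster than realtime.
    """
    return audio_minutes * realtime_factor


if __name__ == "__main__":
    # Hypothetical 60-minute podcast at the roughly 8x realtime described above.
    minutes = processing_minutes(60, 8.0)
    print(f"{minutes:.0f} minutes (~{minutes / 60:.0f} hours)")
```

At that rate an hour-long episode ties up the machine for most of a working day, which is why a factor below 1.0 matters so much for local transcription.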
2 comments

Are you using whisper.cpp? You really want to be using that if you care about speed. You should be able to get better than real-time transcription on an M1.
Well that'd be why it didn't come with local transcription out of the box then. People would have called it shite!

I can edit a podcast twice as fast, never mind transcribe it! Using API calls seems like it was the best method for launch.