Hacker News
by corobo 1202 days ago
How's it handling long files? Let's say worst-case scenario: a 2-hour podcast.

What ratio are you getting (podcast length to transcription time), and does it error out memory-wise as others suggest?

1 comments

I dunno about OpenAI as a service, but on my M1 Mac I think Whisper took something on the order of 8x realtime to process with the "large" model. That is to say... 8 minutes of processing for every 1 minute of audio. It was surprisingly not fast. I assume OpenAI's servers have more GPU at their disposal to make this go faster.
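To put that ratio against the 2-hour podcast in the question, here's a back-of-the-envelope sketch. The function name is made up for illustration, and the 8x figure is just the rough recollection above, not a benchmark.

```python
def processing_minutes(audio_minutes: float, realtime_factor: float) -> float:
    """Estimate processing time, where realtime_factor is the number of
    minutes of compute needed per minute of audio (8.0 means 8x realtime)."""
    return audio_minutes * realtime_factor


podcast_minutes = 2 * 60  # the 2-hour worst case from the question
estimate = processing_minutes(podcast_minutes, 8.0)  # assumed ~8x on an M1
print(f"{estimate:.0f} minutes, or about {estimate / 60:.0f} hours")
```

So at that rate a 2-hour episode would tie up the machine for most of a day.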
Are you using whisper.cpp? You really want to be using that if you care about speed. You should be able to get better than real-time transcription on an M1.
Well that'd be why it didn't come with local transcription out of the box then. People would have called it shite!

I can edit a podcast twice as fast, never mind transcribe it! Using API calls seems like it was the best method for launch.