| HN Mirror



	by corobo 1202 days ago
	How's it handling long files? Let's say the worst-case scenario, a 2-hour podcast. What ratio are you getting (podcast length to transcription time), and does it error out memory-wise as others suggest?


masukomi 1202 days ago

I dunno about OpenAI as a service, but on my M1 Mac I think Whisper took something on the order of 8x realtime to process with the "large" model. That is to say, about 8 minutes of processing for every 1 minute of audio. It was surprisingly not fast. I assume OpenAI's servers have more GPU at their disposal to make this go faster.
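
To put that ratio against the 2-hour podcast asked about above, here is a rough back-of-the-envelope sketch. It assumes the roughly 8x figure scales linearly with audio length, which nobody in the thread has confirmed for long files, and the function name `processing_minutes` is made up for illustration.

```python
def processing_minutes(audio_minutes: float, realtime_factor: float) -> float:
    """Estimate processing time in minutes.

    realtime_factor is processing time divided by audio length, so 8.0
    means 8 minutes of work per minute of audio, and anything below 1.0
    is faster than realtime. Linear scaling is an assumption.
    """
    if audio_minutes < 0 or realtime_factor <= 0:
        raise ValueError("audio_minutes must be >= 0 and realtime_factor > 0")
    return audio_minutes * realtime_factor


# A 2-hour podcast at the ~8x figure quoted in the comment.
estimate = processing_minutes(120, 8.0)
print(f"{estimate / 60:.1f} hours")  # 16.0 hours
```

So if the ratio holds, a 2-hour episode would tie up the machine for most of a day, whereas anything faster than realtime (a factor below 1.0) would finish in under 2 hours.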


johtso 1193 days ago

Are you using whisper.cpp? You really want to be using that if you care about speed. You should be able to get better than real-time transcription on an M1.


corobo 1201 days ago

Well that'd be why it didn't come with local transcription out of the box then. People would have called it shite!

I can edit a podcast twice as fast, never mind transcribe it! Using API calls seems like it was the best method for launch.
