by meerab 334 days ago

I am building VideoToBe.com, and I have found that whisperX works the most reliably.

It is built on top of OpenAI Whisper, so speech recognition is good. The transcript tags speakers as 'SPEAKER_00', 'SPEAKER_01', etc.
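For anyone who wants to post-process those tags, here is a minimal sketch of turning tagged segments into readable lines. It assumes each segment is a dict with "speaker" and "text" keys; that shape, the sample data, and the `format_transcript` helper are illustrative assumptions, not whisperX's documented API.

```python
from itertools import groupby

# Hypothetical sample data: the keys and values are assumptions for illustration.
segments = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 2.5, "text": "Hi, thanks for joining."},
    {"speaker": "SPEAKER_00", "start": 2.5, "end": 4.0, "text": "Can you hear me?"},
    {"speaker": "SPEAKER_01", "start": 4.2, "end": 5.1, "text": "Yes, loud and clear."},
]


def format_transcript(segments, names=None):
    """Merge consecutive segments from the same speaker into one line each.

    `names` optionally maps a raw tag (e.g. 'SPEAKER_00') to a display name.
    Segments without a speaker tag are labeled 'UNKNOWN'.
    """
    names = names or {}
    lines = []
    # groupby only groups adjacent items, which is what we want for turns.
    for speaker, turn in groupby(segments, key=lambda s: s.get("speaker", "UNKNOWN")):
        text = " ".join(s["text"].strip() for s in turn)
        lines.append(f"{names.get(speaker, speaker)}: {text}")
    return "\n".join(lines)


print(format_transcript(segments, names={"SPEAKER_00": "Host"}))
```

Passing a `names` mapping is optional; without it the raw 'SPEAKER_00' style tags are kept as-is.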

Here is what the transcript may look like: