by emmanueloga_ 805 days ago
whisper.cpp [1] has a karaoke example that uses ffmpeg's drawtext filter to display rudimentary karaoke-like captions. whisper.cpp also supports diarisation. Perhaps it could be a starting point for a better script that does what you need.
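To give a rough idea of the drawtext approach, here is a minimal Python sketch that turns timed caption segments into an ffmpeg filter string. This is not whisper.cpp's own script: the segment format, the function names (sanitize, build_drawtext_filter) and the styling values are all made up for illustration. It assumes your ffmpeg build includes drawtext and can find a default font; otherwise you would need to add a fontfile option.

```python
from typing import List, Tuple

# Hypothetical segment format: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def sanitize(text: str) -> str:
    """Drop characters that would need ffmpeg escaping.

    Simplification for this sketch: instead of escaping quotes,
    backslashes, colons and percent signs, we just remove them.
    """
    return "".join(ch for ch in text if ch not in "\\':%")


def build_drawtext_filter(segments: List[Segment], fontsize: int = 36) -> str:
    """Build one drawtext filter per segment, chained with commas.

    Each filter is only active while t is inside its segment,
    using ffmpeg's timeline option enable='between(t,start,end)'.
    """
    parts = []
    for start, end, text in segments:
        parts.append(
            "drawtext=text='{}':fontcolor=white:fontsize={}:"
            "x=(w-text_w)/2:y=h-text_h-40:"
            "enable='between(t,{:.2f},{:.2f})'".format(
                sanitize(text), fontsize, start, end
            )
        )
    return ",".join(parts)


if __name__ == "__main__":
    demo = [(0.0, 1.2, "hello"), (1.2, 2.5, "world")]
    print(build_drawtext_filter(demo))
```

You would pass the printed string to ffmpeg as a video filter, e.g. ffmpeg -i input.mp4 -vf "<filter string>" output.mp4. Real karaoke highlighting needs per-word timestamps and a second colour for the active word, which this sketch leaves out.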

--

1: https://github.com/ggerganov/whisper.cpp/blob/master/README....