Is there anything like this that can run locally? Our HR dept wants to transcribe interviews but doesn't want to upload the recordings to some random website.
I use WhisperX (https://github.com/m-bain/whisperX), which combines Whisper and pyannote, but it is a pain to run locally. I run it on a 4080. This one seems to actually be trying to identify the speakers by name, and I'm not sure what they are doing for that.
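For anyone wondering how the two pieces fit together: Whisper produces timestamped text segments, pyannote produces timestamped speaker turns, and something has to merge them. Here is a rough sketch of that merge step, picking the speaker whose turn overlaps each segment the most. This is not WhisperX's actual code; `Segment`, `Turn`, and `assign_speakers` are names I made up for illustration, and the timestamps are invented sample data.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed chunk of audio (hypothetical shape, times in seconds)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A diarization turn: who was speaking during [start, end]."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals, 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each segment with the speaker whose turn overlaps it the most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg.text))
    return labeled


# Invented sample data standing in for real model output.
segments = [
    Segment(0.0, 4.0, "Tell me about your last role."),
    Segment(4.2, 9.0, "I led a small data team."),
]
turns = [
    Turn(0.0, 4.1, "SPEAKER_00"),
    Turn(4.1, 9.5, "SPEAKER_01"),
]

for speaker, text in assign_speakers(segments, turns):
    print(f"{speaker}: {text}")
```

Note that the labels you get this way are anonymous (SPEAKER_00, SPEAKER_01, ...). Diarization only tells you that two stretches of audio are the same voice, not whose voice it is, so putting real names on speakers needs some extra step on top.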