by levocardia
375 days ago
I have used whisperX with success in a variety of languages, but not with diarization. If the goal is to use the transcript for something else, you can often feed the transcript into a text LLM and say "this is an audio transcript and might have some mistakes, please correct them." I played around with transcribing in the original language vs. having Whisper translate it, and it seems to work better to transcribe in the original language, then feed the result into an LLM and have that model do the translation. At least for French, Spanish, Italian, and Norwegian. I imagine a text-based LLM could also clean up any diarization weirdness.
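Roughly, the "transcribe first, then let a text LLM correct and translate" step could be sketched like this. It is only a sketch under assumptions: the segments are assumed to be dicts with a "text" key and an optional "speaker" key, `build_cleanup_prompt` and `clean_transcript` are hypothetical helper names, and `llm` stands in for whatever text model you call. Nothing here is whisperX's own API.

```python
from typing import Callable, Iterable, Mapping, Optional


def build_cleanup_prompt(
    segments: Iterable[Mapping[str, str]],
    source_language: str,
    target_language: Optional[str] = None,
) -> str:
    """Build one prompt asking a text LLM to correct (and optionally translate) a transcript."""
    lines = []
    for seg in segments:
        text = seg.get("text", "").strip()
        if not text:
            continue  # skip empty segments
        # Assumption: diarized segments carry a "speaker" label; others do not.
        speaker = seg.get("speaker")
        lines.append(f"{speaker}: {text}" if speaker else text)

    task = (
        f"This is an audio transcript in {source_language} and might have some "
        "mistakes, please correct them."
    )
    if target_language:
        # Translation is left to the text LLM rather than the speech model.
        task += f" Then translate the corrected transcript into {target_language}."
    return task + "\n\n" + "\n".join(lines)


def clean_transcript(
    segments: Iterable[Mapping[str, str]],
    source_language: str,
    llm: Callable[[str], str],  # hypothetical: any function mapping a prompt to a reply
    target_language: Optional[str] = None,
) -> str:
    """Send the cleanup prompt to the supplied LLM callable and return its reply."""
    return llm(build_cleanup_prompt(segments, source_language, target_language))
```

You would pass your own client call as `llm`, e.g. `clean_transcript(segments, "French", my_llm, target_language="English")`, where `my_llm` is whatever wrapper you already have.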
There are two ways to parse your first sentence. Are you saying that you used whisperX and it doesn't do well with diarization? I ask because I am curious about alternative ways of doing that.