Hacker News
by java_beyb 1098 days ago
what you're looking for is called speaker diarization. almost all enterprise STT (speech-to-text) services do that, and you can find standalone libraries on GitHub too.
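
to make it concrete: diarization gives you "who spoke when" as time ranges, and you then match those against the timestamped transcript segments from your STT. a rough sketch of that matching step in plain Python, assuming you already have both lists. the `Turn` and `Segment` types and `label_segments` are made-up names for illustration, not any particular library's API:

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One diarization result: a speaker label and a time range (hypothetical type)."""
    speaker: str
    start: float
    end: float


@dataclass
class Segment:
    """One timestamped transcript segment from an STT system (hypothetical type)."""
    text: str
    start: float
    end: float


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time ranges (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns, unknown="UNKNOWN"):
    """Give each segment the speaker whose turn overlaps it the most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            ov = overlap(seg.start, seg.end, turn.start, turn.end)
            if ov > best_overlap:
                best_speaker, best_overlap = turn.speaker, ov
        labeled.append((best_speaker, seg.text))
    return labeled


if __name__ == "__main__":
    turns = [Turn("A", 0.0, 5.0), Turn("B", 5.0, 10.0)]
    segments = [Segment("hello", 0.5, 4.0), Segment("hi there", 4.5, 9.0)]
    for speaker, text in label_segments(segments, turns):
        print(f"{speaker}: {text}")
```

picking the speaker by largest overlap is just the simplest rule; segments that straddle a speaker change may need to be split at word level instead.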

fine-tuning Whisper is a nightmare. I don't know what the interviews are for, but again, most enterprise STTs offer customization, so you can add medical terminology.

Google, Amazon and Nuance have medical models, but they are either expensive or not available for personal projects.

1 comment

Thanks for that! Searching for diarization really helped me narrow down what I was looking for.