by mikeve 329 days ago
Interesting project! I've been working on something in this space myself (WaveMemo).

I must say, speaker diarization is surprisingly tricky. The most common approach seems to be pyannote, but the quality is not amazing...

1 comment

For better diarization quality than pyannote, check out Whisper-DiarizationX, which combines Whisper with ECAPA-TDNN speaker embeddings and spectral clustering.
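
For anyone wondering what the "speaker embeddings plus spectral clustering" step looks like, here is a minimal NumPy sketch. It is not the code of Whisper-DiarizationX or pyannote: the function name `spectral_cluster` is hypothetical, the embeddings are synthetic 192-dimensional vectors standing in for per-segment ECAPA-TDNN output, and the number of speakers is assumed to be known in advance.

```python
import numpy as np

def spectral_cluster(embeddings, n_speakers, n_iter=50):
    """Group per-segment speaker embeddings (one row each) into n_speakers clusters."""
    # Cosine-similarity affinity between segments, clipped to be non-negative.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    affinity = np.clip(x @ x.T, 0.0, None)
    np.fill_diagonal(affinity, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    laplacian = np.eye(len(x)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]

    # Eigenvectors of the smallest eigenvalues give the spectral embedding
    # (np.linalg.eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(laplacian)
    spec = vecs[:, :n_speakers]
    spec = spec / np.maximum(np.linalg.norm(spec, axis=1, keepdims=True), 1e-12)

    # Plain k-means on the spectral embedding, with deterministic
    # farthest-point initialisation starting from the first segment.
    centers = [spec[0]]
    for _ in range(1, n_speakers):
        dists = np.min([np.sum((spec - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(spec[np.argmax(dists)])
    centers = np.array(centers)

    def assign(c):
        return np.argmin(((spec[:, None, :] - c[None]) ** 2).sum(-1), axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for k in range(n_speakers):
            if np.any(labels == k):
                centers[k] = spec[labels == k].mean(axis=0)
    return assign(centers)

# Synthetic stand-in for real embeddings: two "speakers", five segments each.
rng = np.random.default_rng(0)
speaker_a, speaker_b = rng.normal(size=(2, 192))
segments = np.vstack([
    speaker_a + 0.1 * rng.normal(size=(5, 192)),
    speaker_b + 0.1 * rng.normal(size=(5, 192)),
])
labels = spectral_cluster(segments, n_speakers=2)
print(labels)  # the first five segments should share one label, the last five the other
```

In a real pipeline the rows would come from an embedding model run over speech segments, and the speaker count usually has to be estimated (for example from the eigenvalue gap) rather than passed in.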