| HN Mirror



	by clickety_clack 351 days ago
	Please find a way to add speaker diarization, with a way to remember the speakers. You can do it with pyannote, and get a vector embedding of each speaker that can be compared between audio samples, but that approach is a year old now, so I’m sure there are better options!
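
	To illustrate the "remember the speakers" part, here is a rough sketch of comparing speaker embeddings between audio samples. It assumes the embeddings have already been extracted by some diarization or embedding model (that step is not shown), and it uses toy 3-dimensional vectors in place of real ones. The `SpeakerRegistry` class, its method names, and the 0.7 threshold are all hypothetical choices for illustration, not part of pyannote or any other library.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


class SpeakerRegistry:
    """Hypothetical store of known speakers, keyed by name."""

    def __init__(self, threshold=0.7):
        # Assumed cutoff; a real system would tune this on labeled data.
        self.threshold = threshold
        self.speakers = {}

    def enroll(self, name, embedding):
        """Remember a speaker's embedding under a name."""
        self.speakers[name] = list(embedding)

    def identify(self, embedding):
        """Return (name, score) for the closest known speaker,
        or (None, score) if no one clears the threshold."""
        best_name, best_score = None, -1.0
        for name, known in self.speakers.items():
            score = cosine_similarity(embedding, known)
            if score > best_score:
                best_name, best_score = name, score
        if best_score >= self.threshold:
            return best_name, best_score
        return None, best_score


# Toy vectors standing in for real speaker embeddings.
registry = SpeakerRegistry(threshold=0.8)
registry.enroll("alice", [1.0, 0.0, 0.0])
registry.enroll("bob", [0.0, 1.0, 0.0])

match, score = registry.identify([0.9, 0.1, 0.0])
unknown, _ = registry.identify([0.5, 0.5, 0.7])
```

	In this sketch a new recording's embedding is matched against every stored speaker, and anything below the threshold is treated as an unknown voice that could then be enrolled under a new name.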


yujonglee 351 days ago

yeah that is on the roadmap!


williamsss 350 days ago

I’ve done something similar recently, using speaker diarization to handle situations where two or more people share a laptop on a recorded call.

Ultimately, I chose a cloud-based GPU setup, as the highest-performing diarization models required a GPU to process properly. Happy to share more if you’re going that route.


clickety_clack 350 days ago

What model did you use for diarization?
