HN Mirror



	by mpdaugherty 489 days ago
	We did a lot of work at https://www.quillmeetings.com to build a diarization & speaker recognition pipeline that works locally on Mac and Windows. Basically, we create embeddings of parts of the audio, like you might create embeddings of text for a RAG system, and cluster them (simplifying a lot of details from the "last 80%" that has taken a lot of effort to get working...). The speaker recognition can't be as perfect as listening to each stream separately, like Zoom itself can do, but it also learns your contacts over time and can recognize voices in ad-hoc in-person meetings, etc., which I've found really magical since we launched it.
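	For anyone curious what "embed the audio and cluster it" can look like, here is a rough sketch of the general idea, not the actual pipeline described above. It assumes each audio segment has already been turned into a fixed-length embedding by some speaker-embedding model (the vectors below are made-up toy values), and it uses a simple greedy cosine-similarity clustering that was picked only for illustration. The function names and the 0.8 threshold are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_segments(embeddings, threshold=0.8):
    """Greedy clustering: put each segment with the most similar
    existing speaker centroid, or start a new speaker if none is
    at least `threshold` similar. Returns one speaker label per segment."""
    centroids = []  # running mean embedding per speaker
    counts = []     # number of segments folded into each centroid
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            # No known speaker is close enough: treat as a new speaker.
            centroids.append(list(emb))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the running mean for the matched speaker.
            n = counts[best]
            centroids[best] = [
                (c * n + x) / (n + 1) for c, x in zip(centroids[best], emb)
            ]
            counts[best] = n + 1
            labels.append(best)
    return labels


# Toy 3-dimensional "embeddings" (assumed values, not real model output).
segments = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [0.95, 0.15, 0.05],
    [0.1, 0.0, 0.9],
]
labels = cluster_segments(segments)
print(labels)
```

	Keeping a centroid per speaker also hints at how recognition across meetings could work: store the centroid under a contact's name and compare new segments against it. That part is a guess at the general approach, and the hard "last 80%" (overlapping speech, noise, choosing the number of speakers) is not covered here.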

2 comments

jtswole 489 days ago

Ah yes, a locally-run, mostly-accurate speaker recognition pipeline that isn't open source. Love to see cool features locked away while the rest of us plebs make do with whatever scraps the OSS world has managed to build. But hey, at least it kind of works, so you can enjoy your slightly-wrong diarization in private.

Truly the future of meetings.


prollyjethi 489 days ago

not open source :/
