Hacker News
by arthurcolle 450 days ago
Does this work well for multi-user scenarios? As a side effect, I also wanted to tag and label people, but I'm not really used to working with audio. I just found "Speaker Verification with xvector embeddings on Voxceleb", which seems interesting and useful.

Within constraints, yes, it does, but I think there are many improvements I could still make. Speaker diarization and identification are still active research areas, and right now there isn't a good end-to-end model. So if your constraints are local-only inference or low latency, it can be hard to get amazing results with current hardware and off-the-shelf models. It's still a lot better than nothing.
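
For anyone new to the audio side, here is a rough sketch of how embedding-based speaker tagging is commonly done: each utterance is turned into a fixed-length vector (an x-vector, for example), and that vector is compared against enrolled reference vectors with cosine similarity. This is not the code used in this project. The function names, the toy 3-dimensional vectors, and the 0.7 threshold are all assumptions for illustration; real embeddings come from a pretrained model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def identify_speaker(embedding, enrolled, threshold=0.7):
    """Return (label, score) for the closest enrolled speaker.

    `enrolled` maps a label to a reference embedding. If the best score is
    below `threshold` (a hypothetical value that would need tuning), the
    label is None, meaning "unknown speaker".
    """
    best_label, best_score = None, -1.0
    for label, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Toy stand-ins for real speaker embeddings.
enrolled = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
label, score = identify_speaker([0.9, 0.1, 0.0], enrolled)
```

In a multi-user setup the hard part is usually upstream of this step: overlapping speech and short segments make the embeddings noisy, which is where diarization quality starts to matter.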