by skoocda 3639 days ago
I'm working on this at the moment, in a way that also feeds the edited transcripts back into training our ASR system so it performs better in later sessions. The difficult part is the speaker diarization, though: when multiple people talk at once, it takes some intricate signal processing to separate them.