Hacker News
by jilijeanlouis 781 days ago
Our API, Gladia, supports speaker diarization. We use a hybrid enterprise-grade ASR system for speech-to-text, with our own version of Whisper at its core, and state-of-the-art open-source models for diarization. We process large audio files, and we use a proprietary algorithm so that our users are not billed extra for duplicate audio channels, which many other providers do charge for. Hope this helps. There's a free trial if you'd like to test it, and here's our blog with more info: https://www.gladia.io/blog/gladia-speech-to-text-api-speaker...