by wahnfrieden
326 days ago
How is it not the case? It is unusable without VAD or editing. I don't understand what you're questioning. I agree their products could be better integrated "end to end". Meanwhile, there is a continuously improving field of work on detecting speech (which Whisper is incapable of). They offer official "cookbooks" with guidance on an approach they recommend: https://cookbook.openai.com/examples/whisper_processing_guid...

> At times, files with long silences at the beginning can cause Whisper to transcribe the audio incorrectly. We'll use Pydub to detect and trim the silence.

(Official OpenAI quote)
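
A rough sketch of the "detect and trim the silence" step that quote describes, for anyone who wants to see the idea in code. This is a stdlib-only illustration on raw 16-bit PCM samples; it is not Pydub's API and not the cookbook's code. The function names, the -40 dBFS threshold, and the 10 ms chunk size are all assumptions picked for the example, and a real VAD would do considerably more than an energy threshold.

```python
import math


def rms_dbfs(samples, full_scale=32768.0):
    """RMS level of a chunk in dBFS, assuming signed 16-bit samples."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")  # digital silence
    return 20.0 * math.log10(rms / full_scale)


def leading_silence_ms(samples, sample_rate, threshold_dbfs=-40.0, chunk_ms=10):
    """Milliseconds of audio at the start that stay below the threshold.

    Walks the signal in fixed-size chunks and stops at the first chunk
    whose RMS level reaches threshold_dbfs (an assumed, tunable value).
    """
    chunk_len = max(1, sample_rate * chunk_ms // 1000)
    offset = 0
    while offset < len(samples):
        chunk = samples[offset:offset + chunk_len]
        if rms_dbfs(chunk) >= threshold_dbfs:
            break
        offset += chunk_len
    return min(offset, len(samples)) * 1000 // sample_rate


def trim_leading_silence(samples, sample_rate, **kwargs):
    """Return the samples with the detected leading silence removed."""
    start_ms = leading_silence_ms(samples, sample_rate, **kwargs)
    return samples[start_ms * sample_rate // 1000:]


if __name__ == "__main__":
    rate = 16000
    # 0.5 s of digital silence followed by 1 s of a loud square wave.
    audio = [0] * 8000 + [10000, -10000] * 8000
    print(leading_silence_ms(audio, rate))          # 500
    print(len(trim_leading_silence(audio, rate)))   # 16000
```

The trimmed samples would then be what gets sent for transcription. An energy threshold like this only handles silence at the start and will misfire on noisy recordings, which is why dedicated VAD models exist.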