by jallenjia
524 days ago
I'm excited to share Kokoro TTS, an open-source text-to-speech model we've been working on. Despite its relatively small size (82M parameters), it produces natural-sounding speech and ranks first in the TTS Spaces Arena benchmark. The model is Apache 2.0 licensed and was trained on fewer than 100 hours of audio data. It supports both American and British English, with multiple voice options, natural emotional expression, and 24 kHz audio output. We've deployed a demo at kokorotts.online where you can try it out. I'd really appreciate feedback from the HN community on both the model's performance and potential applications. Tech stack: StyleTTS 2 architecture, ONNX Runtime, and Next.js for the web interface.
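
For anyone who wants to handle the 24 kHz output themselves, here is a minimal sketch of saving mono float samples as a 16-bit PCM WAV file using only the Python standard library. It does not call the model: `placeholder_tone` is a hypothetical stand-in that generates a sine wave, and the assumption that the model yields float samples in [-1.0, 1.0] is mine, not something stated above. The helper names `float_to_pcm16` and `write_wav` are also illustrative.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # 24 kHz, the output rate stated in the post


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    # Clip first so out-of-range values cannot overflow the int16 range.
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples to a 16-bit PCM WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(float_to_pcm16(samples))


def placeholder_tone(seconds=0.5, freq=440.0, sample_rate=SAMPLE_RATE):
    """Hypothetical stand-in for model output: a sine tone, not speech."""
    n = int(seconds * sample_rate)
    return [0.3 * math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    write_wav("output.wav", placeholder_tone())
```

If the real inference output arrives as an array of floats, it could be passed to `write_wav` in place of the placeholder tone; the sample rate is the only value taken from the post.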