by logicprog
313 days ago
I've been using this to try to make audiobooks out of various philosophy books I've been wanting to read, for accessibility reasons, and I ran into a critical problem: if the input text fed to Kokoro is too long, it starts skipping words at the end or in the middle, or fades out at the end. abogen chunks the text it feeds to Kokoro by sentence, so sentences of arbitrary length reach Kokoro with no length guard. This produces unusable audiobooks for me. I'm working on "vibe coding" my own Kokoro-based tkinter GUI app for personal use, for the same purpose, that uses nltk and some regex magic for better splitting.
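For anyone curious what a length guard like that might look like, here's a minimal sketch. It is not abogen's code or the app described above: it uses only stdlib regex instead of nltk, and `MAX_CHARS`, `chunk_text`, and the helper names are all made up for illustration. The 300-character cap is an arbitrary placeholder, not a documented Kokoro limit, so tune it against whatever length starts producing skipped words for you.

```python
import re

# Assumed cap on characters per chunk; a placeholder, not a documented Kokoro limit.
MAX_CHARS = 300

_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")  # naive sentence boundary
_CLAUSE_END = re.compile(r"(?<=[,;:])\s+")    # fallback boundary inside a sentence


def _pack(pieces, max_chars):
    """Greedily join pieces with spaces while staying within max_chars.

    A single piece that is already longer than max_chars (e.g. one very
    long word) is passed through unchanged.
    """
    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def _split_oversized(sentence, max_chars):
    """Break one sentence at clause punctuation, then at spaces if needed."""
    if len(sentence) <= max_chars:
        return [sentence]
    parts = []
    for clause in _CLAUSE_END.split(sentence):
        if len(clause) <= max_chars:
            parts.append(clause)
        else:
            parts.extend(_pack(clause.split(), max_chars))
    return parts


def chunk_text(text, max_chars=MAX_CHARS):
    """Split text into chunks no longer than max_chars, preferring
    sentence boundaries, then clause boundaries, then word boundaries."""
    sentences = [s for s in _SENTENCE_END.split(text.strip()) if s]
    pieces = []
    for sentence in sentences:
        pieces.extend(_split_oversized(sentence, max_chars))
    return _pack(pieces, max_chars)
```

Each chunk would then be sent to the TTS engine separately and the audio concatenated. The regex sentence splitter will trip on abbreviations like "e.g." or "Dr.", which is where something like nltk's sentence tokenizer would probably do better.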
|
https://github.com/nazdridoy/kokoro-tts
It generates a directory of audio files, along with a metadata file for ebook chapters.
You then have to use m4b-tool to stitch the audio files together into an audiobook and include the chapter metadata, but it works great:
https://github.com/sandreas/m4b-tool
I've been meaning to write a post on this workflow because it's incredibly useful.
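The stitching step could be scripted roughly like this. It's a sketch that assumes `m4b-tool` is on your PATH and only uses its basic `merge` subcommand with `--output-file`; the function names and the directory and file names are hypothetical. How the chapter metadata file from kokoro-tts gets picked up isn't covered here, so check the m4b-tool docs for that part.

```python
import shutil
import subprocess


def build_merge_command(input_dir, output_file):
    """Build the argument list for a basic m4b-tool merge.

    input_dir and output_file are whatever paths you use; nothing here
    is specific to kokoro-tts output.
    """
    return ["m4b-tool", "merge", input_dir, f"--output-file={output_file}"]


def merge_audiobook(input_dir, output_file):
    """Run the merge, failing early if m4b-tool is not installed."""
    if shutil.which("m4b-tool") is None:
        raise FileNotFoundError("m4b-tool not found on PATH")
    subprocess.run(build_merge_command(input_dir, output_file), check=True)
```

Usage would be something like `merge_audiobook("book_audio", "book.m4b")`, where `book_audio` is the directory of generated audio files.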