by gcoakes 1098 days ago
I'm excited about them making it faster to produce. I finished the most recently published audiobook in a series this weekend. The author posts unpublished chapters to a site called Royal Road. I listen to books while running and driving, so reading them visually is a non-starter. It would be nice to have that pipeline accelerated.

Now, I just want to talk about my little weekend project... I spent a couple of hours scraping Royal Road and trying to get TTS working. Eventually, I settled on:

1. `wget --recursive` filtering only the chapters
2. A Python script to strip extraneous HTML like advertisements and the headers.
3. Pipe into pandoc emitting plain text.
4. Copy it to my phone for TTS: https://f-droid.org/packages/com.danefinlay.ttsutil/
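
For anyone curious what the HTML-stripping step might look like, here is a rough sketch using only the Python standard library. It is not the actual script from this project. The container class name `chapter-content` is an assumption about the site's markup, the `ChapterExtractor` and `chapter_text` names are made up for illustration, and the sketch emits plain text directly rather than handing cleaned HTML to pandoc.

```python
from html.parser import HTMLParser

# Tags whose contents are never story text.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}
# Void elements have no closing tag, so they must not affect depth counting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ChapterExtractor(HTMLParser):
    """Collect text found inside one container element, skipping SKIP_TAGS."""

    def __init__(self, content_class="chapter-content"):  # assumed class name
        super().__init__(convert_charrefs=True)
        self.content_class = content_class
        self.depth = 0       # nesting depth inside the container (0 = outside)
        self.skip_depth = 0  # nesting depth inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            # Treat <br> as whitespace so adjacent words do not fuse together.
            if tag == "br" and self.depth and not self.skip_depth:
                self.parts.append("\n")
            return
        if self.depth == 0:
            classes = (dict(attrs).get("class") or "").split()
            if self.content_class in classes:
                self.depth = 1
            return
        self.depth += 1
        if self.skip_depth or tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        if self.skip_depth:
            self.skip_depth -= 1
        elif tag == "p":
            self.parts.append("\n\n")  # paragraph break
        self.depth -= 1

    def handle_data(self, data):
        if self.depth and not self.skip_depth:
            self.parts.append(data)


def chapter_text(html):
    """Return the chapter as plain text, one blank line between paragraphs."""
    parser = ChapterExtractor()
    parser.feed(html)
    parser.close()
    paragraphs = (" ".join(p.split()) for p in "".join(parser.parts).split("\n\n"))
    return "\n\n".join(p for p in paragraphs if p)
```

Everything outside the container (site navigation, ad blocks, footers) is dropped simply because it is never entered. The depth counting assumes reasonably well-formed HTML; pages that omit closing tags such as `</p>` would need a more forgiving parser.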

I really wanted to use all local tools, but I just couldn't get any of the Linux tools to sound as good or work as fast as Google TTS services. Also, the TTS paid services I found were just too expensive to justify (20hr book for ~$70).

I'm more than happy to additionally purchase the audiobook when it is published. I just don't want to wait.

2 comments

Just FYI, since you ended up using an external TTS anyway -

https://beta.elevenlabs.io/speech-synthesis

is vastly better, especially for fiction.

Also worth trying is: https://speechify.com/

Yeah, it isn’t so much that I want publishers to have a cheaper way of making an audiobook that avoids the (apparently minimal) cost of employing a voice actor.

I don’t want to wait for the publisher to decide they want to do an audiobook.