by follower 1761 days ago
Wow.

I'd really encourage you to invest some time into SEO and promotion of your project.

I spent a bunch of time recently looking for exactly this: TTS, offline, an Open Source licence, and with "decent"/"natural" sounding default voices.

The "best" I ended up finding was `espeak-ng` but, really, the "natural"-ness is barely comparable to what Larynx seems to produce--based on a quick listen to the demos here: https://rhasspy.github.io/larynx/#en-us

On first impressions at least, Larynx definitely seems to be a project that deserves a higher profile in this space.

Thanks for sharing the project here, I'll be interested to take a deeper look when I circle back to my side-side project that could benefit from it. :)

(BTW I didn't watch/listen to the YT video all the way through yet but if the narration is generated by Larynx (which it seemed it might be?) it's definitely worth stating that up front.)

Oh, also, I really appreciate that there are multiple options for non-male voices too, which is something that seems to be sorely lacking in similar projects.


Yes, agreed, this is great! The best I found that could generate faster than real-time without a GPU was speedyspeech (https://github.com/janvainer/speedyspeech). Unfortunately it was only trained using the LJSpeech dataset and I haven't been able to transfer it to a multi-voice model. I have been using it to build a story-telling app for my kids.
> been using it to build a story-telling app for my kids.

Oh, that's cool! :) Has some overlap with part of my interest in TTS technologies.

The existence of 50 voices for Larynx is definitely a significant part of what makes it an exciting development in this sphere of use.

So, after actually watching the demo video all the way through, it seems the video is narrated by a Larynx-produced voice "southern_english_female": https://youtu.be/hBmhDf8cl0k?t=387

While I appreciate the cinematic aspect of the confirmation/"reveal" at the conclusion of the video, there's an irony here: because the quality is so good, and most people won't get through a 7-minute video, many viewers may never realise the narration itself is generated.

So I'd encourage you to consider stating it a little more up front.

(Also, based on the release dates it seems like this project is relatively young which explains why it's not very well known at this point.)