| HN Mirror


by synesthesiam 1761 days ago

You might give Larynx a try: https://github.com/rhasspy/larynx

Demo: https://youtu.be/hBmhDf8cl0k

(I'm the author)

3 comments

follower 1761 days ago

Wow.

I'd really encourage you to invest some time into SEO and promotion of your project.

I spent a bunch of time recently looking for exactly this: TTS, offline, an Open Source licence, and with "decent"/"natural" sounding default voices.

The "best" I ended up finding was `espeak-ng` but, really, the "natural"-ness is barely comparable to what Larynx seems to produce--based on a quick listen to the demos here: https://rhasspy.github.io/larynx/#en-us

On first impressions at least, Larynx definitely seems to be a project that deserves a higher profile in this space.

Thanks for sharing the project here, I'll be interested to take a deeper look when I circle back to my side-side project that could benefit from it. :)

(BTW I didn't watch/listen to the YT video all the way through yet but if the narration is generated by Larynx (which it seemed it might be?) it's definitely worth stating that up front.)

Oh, also, really appreciate that there are multiple options for non-male voices too, which is something that seems to be sorely lacking in similar projects.


infinite8s 1761 days ago

Yes, agreed, this is great! The best I found that could generate faster than real-time without a GPU was speedyspeech (https://github.com/janvainer/speedyspeech). Unfortunately it was only trained using the LJSpeech dataset and I haven't been able to transfer it to a multi-voice model. I have been using it to build a story-telling app for my kids.


follower 1761 days ago

> been using it to build an story-telling app for my kids.

Oh, that's cool! :) Has some overlap with part of my interest in TTS technologies.

The existence of 50 voices for Larynx is definitely a significant part of what makes it an exciting development in this sphere of use.


follower 1761 days ago

So, after actually watching the demo video all the way through, it seems the video is narrated by a Larynx-produced voice "southern_english_female": https://youtu.be/hBmhDf8cl0k?t=387

While I appreciate the cinematic aspect of the confirmation/"reveal" at the conclusion of the video, most people won't get through a 7-minute video, so, ironically, because the quality is so good, it would be entirely possible to not realise the narration itself is generated.

So I'd encourage you to consider stating it a little more up front.

(Also, based on the release dates it seems like this project is relatively young which explains why it's not very well known at this point.)


JZL003 1761 days ago

That is pretty nice, one of the best collections of voices I've seen, and the best interface.

Google gives 1 million characters per month free, which I don't often go over, but this will be really useful for when I do.

I don't want to be unappreciative, it's amazing that this is possible, much less free, but when you spend hours listening to it every day, the cracking and warbling do get old. I think there are better models I've heard snippets of, but the truly amazing thing about Google's is how robust it is to very weird words.

(When I tried all the public cloud offerings, IBM's was marginally the nicest AFAICT, but it was the most expensive, with the least free quota.)


JZL003 1761 days ago

Yeah, https://cloud.ibm.com/catalog/services/text-to-speech is so smooth.


czottmann 1761 days ago

That is ace. Thanks for sharing!
