Hacker News
by viggity 291 days ago
I feel like this is a step in the right direction, but a lot of emotive text-to-speech models only change the duration and loudness of each word. The timing and pauses are better here too.

I would love to have a model that can make sense of things like stressing particular syllables or phonemes to make a point.

1 comment

this model is superb