by BugsJustFindMe 3405 days ago
Can you comment on what's happening in this sample (result???) clip? http://www.josesotelo.com/speechsynthesis/files/wav/blizzard...

Also, I notice that many of the result clips trail off in volume. Is that a processing error or intentional in how the clips are edited?

I think the model just got tired of reading text and decided to mock us :) Just kidding. The attention mechanism got stuck somehow for this sample. This does not happen very often, though. It's important to note the samples we posted were not cherry-picked: they are just the first 10 sentences from our test set.

Regarding the truncation at the end, that was a bug in our sampling code that we just fixed. We will update the samples soon!

Is there any way to artificially induce that failure? I'm an artist trying to get a handle on ML. Being able to feed speech through this to give it the flat affect of the phoneme-mode samples, or to insert attention failures at specific points, would be extremely useful for a number of projects I have in mind.
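
To make "the attention mechanism got stuck" concrete, here is a toy sketch, not the code behind the linked samples. It assumes a location-based attention in which a single Gaussian window slides over the input characters, one position per output step. The names `read_text`, `stuck_from`, and `stuck_until` are hypothetical, chosen only for this illustration. Freezing the window position for a range of steps makes the "reader" repeat the same character, which is roughly what an induced attention failure would look like.

```python
import math

def attention_weights(kappa, text_len, beta=1.0):
    """Normalised Gaussian window over character positions, centred at kappa."""
    scores = [math.exp(-beta * (kappa - u) ** 2) for u in range(text_len)]
    total = sum(scores)
    return [s / total for s in scores]

def read_text(text, n_steps, step_size=1.0, stuck_from=None, stuck_until=None):
    """Return the most-attended character at each output step.

    For steps t with stuck_from <= t < stuck_until (hypothetical parameters),
    the window position is not advanced, imitating attention that gets stuck.
    """
    kappa = 0.0  # current centre of the attention window
    attended = []
    for t in range(n_steps):
        weights = attention_weights(kappa, len(text))
        best = max(range(len(text)), key=weights.__getitem__)
        attended.append(text[best])
        frozen = stuck_from is not None and stuck_from <= t < stuck_until
        if not frozen:
            kappa += step_size  # normal behaviour: move on to the next character
    return "".join(attended)

normal = read_text("hello", n_steps=5)
stuck = read_text("hello", n_steps=5, stuck_from=2, stuck_until=5)
print(normal)
print(stuck)
```

In a real model the step size is predicted by the network at every step rather than fixed, so inducing the failure would mean overriding that prediction at sampling time. Whether the released code exposes a hook for this is not something the thread confirms.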