by cypher543 3361 days ago
Right, even a good diphone voice needs lots of data. And I noticed they trained it with the existing Google Home voice actress, from whom they must already have many, many hours of recordings. I was mostly asking about the model itself; whether you could download TensorFlow and put one together based on this paper alone.
1 comments

I see your points, but the two are related. Even if you build what you think the paper describes, it is hard to know whether you did it right, because you cannot easily replicate the result. This already happens with a lot of CV papers: people reimplement the model, but it never gets as good as what the paper demonstrated.

But you have a very good idea. Since it is Google Home, would it be possible for some people to just buy hundreds of them and endlessly ask them questions to gather training data? That would be interesting to see.