Hacker News
by zuminator 2234 days ago
They concluded that their model "is capable of generating pieces that are multiple minutes long, and with recognizable singing in natural-sounding voices." Which part of that is dishonest? I would assert that being able to make sense of the lyrics is a nice bonus but not fundamentally relevant to their conclusion: a person can appreciate singing in a foreign language, and recognize it as natural, without any knowledge of the words whatsoever. Besides, speech synthesis is basically solved in terms of intelligibility; that's not really the thrust of what they've achieved here.

And more to the point, a full 815 of the uploaded songs have no pre-written lyrics, so your premise that they are reliant on "karaoke style lyrics" is mistaken to begin with.