


	by basve 3565 days ago
	I should add that this was measured using a downsized model (just two blocks of dilated convolutions and a sampling rate of 4 kHz). DeepMind's paper does not report how many stacks were used to generate the samples, but I assume it is quite a bit more.


unlikelymordant 3565 days ago

They (DeepMind) reported via tweet that it took 90 minutes of processing to generate 1 s of speech. Hopefully this comes down in the future.


espadrine 3565 days ago

Do you have a link?

This implementation says: “A Tesla K80 needs around ~4 minutes for generating a second of audio at a sampling rate of 4000hz”, which is significantly faster.


basve 3565 days ago

The figure of 90 minutes for 1 s of audio was reported by someone from Google on Twitter, but the tweet has since been deleted. I've clarified in the readme that my measurements are for a much lighter/smaller model than DeepMind's :).
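
For anyone comparing the two figures in this thread, a rough per-sample calculation may help, since the models generate audio one sample at a time. This is only a sketch: the 4 minutes at 4000 Hz comes from the readme quote above, while the 16 kHz rate for the 90-minute figure is an assumption (the tweet's sample rate is not stated in this thread), and the function name is made up for illustration.

```python
def seconds_per_sample(wall_clock_seconds, audio_seconds, sample_rate_hz):
    """Wall-clock time spent per generated audio sample."""
    return wall_clock_seconds / (audio_seconds * sample_rate_hz)

# Readme figure quoted above: ~4 minutes per second of audio at 4000 Hz.
small_model = seconds_per_sample(4 * 60, 1, 4000)

# Tweet figure: 90 minutes per second of audio.
# ASSUMPTION: 16 kHz output; the thread does not give the sample rate.
tweet_figure = seconds_per_sample(90 * 60, 1, 16000)

print(f"small model:  {small_model:.4f} s per sample")
print(f"tweet figure: {tweet_figure:.4f} s per sample")
print(f"ratio:        {tweet_figure / small_model:.1f}x")
```

Under that assumption the per-sample gap is much smaller than the raw "90 minutes vs 4 minutes" comparison suggests, although the hardware and model size behind the tweet are unknown, so the numbers are not directly comparable.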
