	by georgehm 3575 days ago
	"After training, we can sample the network to generate synthetic utterances. At each step during sampling a value is drawn from the probability distribution computed by the network. This value is then fed back into the input and a new prediction for the next step is made. Building up samples one step at a time like this is computationally expensive, but we have found it essential for generating complex, realistic-sounding audio." So it looks like generation is a slow process.