| HN Mirror

Hacker News


	by kastnerkyle 3571 days ago
	Training is relatively fast: masking lets every position be computed in parallel, so you don't have to sample during training. Generation is the slow part, because sampling is a sequential process where each value has to be drawn before the next one can be predicted. They talk about it a bit in the earlier papers on PixelCNN and PixelRNN.
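
	To make the asymmetry concrete, here is a toy sketch in NumPy. It is not the PixelCNN architecture, just a single masked linear layer over a short binary sequence, and every name in it (`make_mask`, `logits_all`, `nll_parallel`, `sample_sequential`) is made up for illustration. The point is only that the training loss needs one masked forward pass for all positions, while sampling needs one forward pass per position.

```python
import numpy as np

def make_mask(T):
    # mask[i, j] = 1 only when i < j, so output j sees inputs strictly before j.
    return np.triu(np.ones((T, T)), k=1)

def logits_all(x, W, b, mask):
    # A single masked matmul yields the logits for every position at once.
    return x @ (W * mask) + b

def nll_parallel(x, W, b, mask):
    # Training-style loss: the true sequence x is fed in, nothing is sampled.
    z = logits_all(x, W, b, mask)
    # Bernoulli negative log-likelihood, summed over positions.
    return np.sum(np.logaddexp(0.0, z) - x * z)

def sample_sequential(W, b, mask, rng):
    # Generation: position t can only be drawn after positions < t are known.
    T = b.shape[0]
    x = np.zeros(T)
    passes = 0
    for t in range(T):
        z = logits_all(x, W, b, mask)  # full forward pass, only z[t] is used
        passes += 1
        p = 1.0 / (1.0 + np.exp(-z[t]))
        x[t] = float(rng.random() < p)
    return x, passes

rng = np.random.default_rng(0)
T = 8
W = rng.normal(size=(T, T))
b = rng.normal(size=T)
mask = make_mask(T)

x, passes = sample_sequential(W, b, mask, rng)
loss = nll_parallel(x, W, b, mask)
print("forward passes to sample:", passes)
print("forward passes to score :", 1)
```

	Sampling here takes `T` forward passes against one for the loss. For an image, `T` is the number of pixels (times channels), which is why generation gets slow.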