| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by sillysaurusx 1171 days ago
	According to the AlphaZero paper (https://arxiv.org/pdf/1712.01815.pdf) they ran game logic on TPUs: > Training proceeded for 700,000 steps (mini-batches of size 4,096) starting from randomly initialised parameters, using 5,000 first-generation TPUs to generate self-play games and 64 second-generation TPUs to train the neural networks. Further details of the training procedure are provided in the Methods.