Does anybody know how long it would take to train an AlphaZero-style Go model using a single GPU? In [1] they claim that it took 13 hours until the model was able to beat the original AlphaGo version, but they don't state what hardware they used.
From the ELF OpenGo paper [1], which describes an open implementation of AlphaGo Zero developed by Facebook AI:
"First, we train a superhuman model for ELF OpenGo. After running our AlphaZero-style training software on 2,000 GPUs for 9 days, our 20-block model has achieved super-human performance that is arguably comparable to the 20-block models described in Silver et al. (2017) and Silver et al. (2018)."
I agree with the quoted numbers. As I mentioned in another comment, you have to keep in mind that AlphaZero is an extremely sample-inefficient learning technique, even for simple problems. However, it has two major strengths: 1) it is pretty generic and 2) it can leverage huge amounts of computing power.
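To put the quoted figures in single-GPU terms, here is a back-of-envelope sketch. It assumes perfectly linear scaling and the same kind of GPU as in the paper's cluster, neither of which the paper guarantees; the function name is made up for illustration.

```python
def single_gpu_estimate(num_gpus, days):
    """Convert a cluster training run into GPU-days and single-GPU years.

    Assumption: training time scales linearly with GPU count,
    which real distributed self-play setups only approximate.
    """
    gpu_days = num_gpus * days
    years = gpu_days / 365.25
    return gpu_days, years


# Figures quoted from the ELF OpenGo paper: 2,000 GPUs for 9 days.
gpu_days, years = single_gpu_estimate(2000, 9)
print(f"{gpu_days:,.0f} GPU-days, roughly {years:.0f} years on one GPU")
```

Under those assumptions that is 18,000 GPU-days, or about 49 years on one GPU, so a faithful single-GPU reproduction at that scale is not realistic.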
[1]: https://arxiv.org/pdf/1902.04522.pdf