I did a writeup about it here: https://notes.jasonljin.com/projects/2018/05/20/Training-Alp...
Code: https://github.com/likeaj6/alphazero-hex