|
|
|
|
|
by ogrisel
4212 days ago
|
|
Actually: > Faster evaluation function: I didn’t use the GPU for playing, only for training. So indeed it could probably be optimized. But first I would profile as it can be the case that the bottleneck is Matrix Matrix multiplication in numpy which already delegates computation to an optimized third party library (e.g. OpenBLAS). Maybe it's worth using the GPU at play time as well. |
|
The difficulty there is that sending data to and from the GPU is slow. You need to avoid data transfers, which might mean trying to do everything on the GPU. But the SIMD cores on the GPU are likely to perform poorly due to all the branch statements in chess code.