by glinscott 1978 days ago
If anyone wants to experiment with training these nets, it's a great way to get exposed to a nice mix of chess and machine learning.

There are currently two trainers: the original one, which runs on CPU (https://github.com/nodchip/Stockfish), and a PyTorch one, which runs on GPU (https://github.com/glinscott/nnue-pytorch).

The SF Discord is where all of the discussion/development is happening: https://discord.gg/KGfhSJd.

Right now there is a lot of experimentation with the network architecture. The current leading approach is a much larger net that takes in attack information per square (e.g. is this piece attacked by more pieces than it's defended by?). That network is a little slower, but the additional information seems to be enough to make it stronger than the current architecture.
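
To make the "attacked by more pieces than it's defended by" idea concrete, here is a rough sketch of how such a per-square input could be encoded. It is not the actual feature set from either trainer: the function name, the 0-63 square indexing, and the precomputed attacker/defender counts are all assumptions for illustration.

```python
def attack_plane(occupied, attackers, defenders):
    """Return 64 binary inputs, one per square.

    A square is set to 1 when it holds a piece that is attacked by more
    pieces than defend it. `attackers` and `defenders` map a square index
    (0-63, hypothetical layout) to a precomputed count.
    """
    plane = [0] * 64
    for sq in occupied:
        if attackers.get(sq, 0) > defenders.get(sq, 0):
            plane[sq] = 1
    return plane


# Made-up position: three occupied squares with hand-written counts.
occupied = {12, 27, 36}
attackers = {27: 2, 36: 1}
defenders = {12: 3, 27: 1, 36: 1}

plane = attack_plane(occupied, attackers, defenders)
print([sq for sq, bit in enumerate(plane) if bit])  # squares flagged as under-defended
```

In a real engine the counts would come from the move generator's attack tables, and the extra inputs are part of why such a net is slower to evaluate.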

Btw, the original Shogi developers really did something amazing. The nodchip trainer is all custom code, and trains extremely strong nets. There are all sorts of subtle tricks embedded in there as well that led to stronger nets. Not to mention, getting the quantization (float32 -> int16/int8) working gracefully is a huge challenge.
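
For anyone wondering why the quantization step is tricky, here is a minimal sketch of fixed-scale quantization with clamping. This is not the nodchip trainer's scheme: the scale of 64 and the helper names are assumptions picked only to show the trade-off between precision and range.

```python
INT8_RANGE = (-128, 127)
INT16_RANGE = (-32768, 32767)


def quantize(weights, scale, lo, hi):
    """Multiply float weights by `scale`, round, and clamp to [lo, hi]."""
    return [max(lo, min(hi, round(w * scale))) for w in weights]


def dequantize(values, scale):
    """Map quantized integers back to floats to inspect the error."""
    return [v / scale for v in values]


weights = [0.5, -0.25, 1.5, -3.0]
SCALE = 64  # hypothetical: a larger scale keeps more precision but saturates sooner

q8 = quantize(weights, SCALE, *INT8_RANGE)
q16 = quantize(weights, SCALE, *INT16_RANGE)

print(q8)                     # the last weight saturates at the int8 limit
print(q16)                    # int16 has room for the same weight
print(dequantize(q8, SCALE))  # the saturated weight comes back as -2.0, not -3.0
```

The net has to be trained so that its weights and activations stay inside these ranges, otherwise the integer version silently diverges from the float one.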

2 comments

Just wanted to say thanks for many years of fantastic work on both Stockfish and Leela. The computer chess community owes you a huge debt of gratitude!
Interesting how, before A0, the prevailing view was that "search matters the most", with crazy low branching factors to get deeper. It seems that humans were just better at search heuristics than at evaluation ones.