by thomasahle 4212 days ago
One interesting line of research, I think, is using one- or two-layer networks to 'simulate' more complex evaluation functions. If you could train such a network to get within 10% error of Stockfish's evaluation, then you might be able to distil that network into a faster evaluator and plug it back into Stockfish for an even stronger engine. As you say, one hard problem is probably finding genuinely interesting positions to sample for the training.
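
To make the idea concrete, here is a rough sketch of that kind of distillation, under heavy assumptions: `teacher_eval` is a made-up stand-in for the expensive evaluator (it is not Stockfish), positions are just three random features, and the student is a single hidden layer of 8 tanh units trained with plain SGD to match the teacher's outputs. All names and sizes are hypothetical.

```python
import math
import random

def teacher_eval(x):
    # Hypothetical stand-in for an expensive evaluation function (NOT Stockfish).
    return math.tanh(1.5 * x[0] - 0.8 * x[1] + 0.5 * x[0] * x[2])

def init_student(n_in, n_hidden, rng):
    # One hidden layer with small random weights and zero biases.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return {"w1": w1, "b1": [0.0] * n_hidden, "w2": w2, "b2": 0.0}

def forward(net, x):
    # Returns the hidden activations and the scalar output.
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["w1"], net["b1"])]
    y = sum(w * hj for w, hj in zip(net["w2"], h)) + net["b2"]
    return h, y

def sgd_step(net, x, target, lr):
    # One gradient step on the squared error 0.5 * (y - target)^2.
    h, y = forward(net, x)
    err = y - target
    for j, hj in enumerate(h):
        grad_pre = err * net["w2"][j] * (1.0 - hj * hj)  # uses w2 before its update
        net["w2"][j] -= lr * err * hj
        net["b1"][j] -= lr * grad_pre
        for i, xi in enumerate(x):
            net["w1"][j][i] -= lr * grad_pre * xi
    net["b2"] -= lr * err

def mean_sq_error(net, samples):
    # Average squared gap between student and teacher on the samples.
    return sum((forward(net, x)[1] - teacher_eval(x)) ** 2 for x in samples) / len(samples)

def distil(n_samples=200, epochs=50, lr=0.05, seed=0):
    rng = random.Random(seed)
    samples = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n_samples)]
    net = init_student(3, 8, rng)
    before = mean_sq_error(net, samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x in samples:
            sgd_step(net, x, teacher_eval(x), lr)  # the teacher's output is the label
    return net, before, mean_sq_error(net, samples)

net, before, after = distil()
print(f"student MSE vs teacher: {before:.4f} -> {after:.4f}")
```

The only "labels" here are the teacher's own outputs, which is the point: no game results are needed, just positions worth evaluating.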

Anyhow, it's fun to see how engines like these battle it out. It may also be that your approach can yield a more 'fun to play' engine for us mortals.

1 comments

I think that's a pretty useful approach. It's kind of similar to Hinton's latest work on model compression: http://www.ttic.edu/dl/dark14.pdf

The problem with deep models is that once you have more than one hidden layer, you need a big matrix multiplication to get from one layer to the next. If your hidden layers are a few thousand units each, that's still pretty slow. Doing things in minibatches or on the GPU speeds it up significantly, but I'm guessing it's still orders of magnitude slower than whatever Stockfish uses.
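
A back-of-the-envelope count shows where the cost goes. The layer sizes below are purely hypothetical (768 inputs as 64 squares x 12 piece types, two hidden layers of 2048 units, one output), and the function only counts multiply-adds for dense layers; it ignores biases and activations.

```python
def dense_multiply_adds(layer_sizes):
    """Multiply-adds for one forward pass through fully connected layers."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical network shape, chosen only for illustration.
sizes = [768, 2048, 2048, 1]
total = dense_multiply_adds(sizes)
hidden_to_hidden = sizes[1] * sizes[2]

print(total)             # 5769216
print(hidden_to_hidden)  # 4194304
```

With these made-up sizes, the hidden-to-hidden multiplication alone is about 73% of the work per position.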

Sure, the second layer would have to be very sparse. That makes sense since most multi-piece 'chunks' are not really that interesting.
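
As a minimal sketch of what a sparse second layer could look like (the storage format and names are my assumption, not anything an existing engine does): keep only the nonzero weights per output unit, so the cost scales with the number of nonzeros rather than with the full matrix size.

```python
def sparse_layer(rows, biases, inputs):
    """Pre-activation outputs; rows[j] is a list of (input_index, weight) pairs."""
    return [b + sum(w * inputs[i] for i, w in row) for row, b in zip(rows, biases)]

def nonzero_count(rows):
    """Number of multiply-adds the sparse layer actually performs."""
    return sum(len(row) for row in rows)

# Toy example: 4 inputs, 3 output units, only 3 of the 12 possible weights kept.
rows = [[(0, 0.5), (3, -1.0)], [], [(1, 2.0)]]
biases = [0.0, 0.1, -0.5]
out = sparse_layer(rows, biases, [1.0, 2.0, 3.0, 4.0])

print(out)                  # [-3.5, 0.1, 3.5]
print(nonzero_count(rows))  # 3
```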