HN Mirror


by spawkfish 3802 days ago

Thanks!

Right now I have just about as close to an "extreme type C" engine as possible. There is a neural network that maps from a board position directly to a ranked list of moves it wants to make, and a little bit of logic on top of that to reject illegal moves.
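As a rough illustration of that "ranked list plus legality filter" idea (a minimal sketch, not the engine's actual code): `pick_move`, the `is_legal` callback, and the move strings below are all hypothetical names chosen for the example.

```python
from typing import Callable, Optional, Sequence


def pick_move(ranked_moves: Sequence[str],
              is_legal: Callable[[str], bool]) -> Optional[str]:
    """Return the highest-ranked move that passes the legality check."""
    for move in ranked_moves:
        if is_legal(move):
            return move
    return None  # the network proposed no legal move at all


# Hypothetical network output, best-first. The move strings are illustrative.
ranked = ["e1g1", "e2e4", "g1f3"]
# Hypothetical legal-move set for the position (castling is not available).
legal = {"e2e4", "g1f3"}

best = pick_move(ranked, legal.__contains__)
```

Here the top-ranked move is rejected and the second choice is played, so the network never needs to learn legality perfectly.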

The recognition of wins/draws/losses is actually done in a layer around the engine, which doesn't understand these things yet. One of the "tricks" it is vulnerable to right now is being forced into a draw by repetition, because it doesn't know that is a thing to avoid.
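One way the outer layer could steer away from repetition, sketched under assumptions: the position keys, the `next_position_key` callback, and `avoid_repetition` itself are hypothetical, and a real key would have to capture everything that defines a repeated position (side to move, castling rights, and so on).

```python
from collections import Counter
from typing import Callable, List, Sequence


def avoid_repetition(ranked_moves: Sequence[str],
                     next_position_key: Callable[[str], str],
                     seen: Counter,
                     limit: int = 3) -> List[str]:
    """Keep the network's order, but push moves that would reach a
    position for the `limit`-th time to the back of the list."""
    safe, repeating = [], []
    for move in ranked_moves:
        key = next_position_key(move)
        if seen[key] + 1 >= limit:
            repeating.append(move)
        else:
            safe.append(move)
    return safe + repeating


# Hypothetical bookkeeping: "pos_A" has already occurred twice in this game.
seen = Counter({"pos_A": 2})
# Hypothetical mapping from a candidate move to the position it leads to.
outcomes = {"Qh5": "pos_A", "Nf3": "pos_B"}

reordered = avoid_repetition(["Qh5", "Nf3"], outcomes.__getitem__, seen)
```

The repeating move is demoted rather than removed, so it can still be played if it is the only legal option.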

Adding a more traditional search (augmented by the network, of course) is on my todo list, and I think it will improve the playing strength a lot. I am pleasantly surprised though, at how well it plays without any of that.


zardo 3802 days ago

It would be interesting to see how strength and "human like play" scale with the depth of the search.

There's also a really interesting possibility in training policy networks with different attributes by using games from players with certain styles of play.


spawkfish 3802 days ago

Training different policies in different styles is a really interesting idea. You could then have a gating process that first chooses the "style" of move to make and then uses the style-specific network to select a move.

I think getting data for this could be difficult though. I wonder how easy it would be to automatically categorize a game record by "style"?
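To make the gating idea concrete, here is a toy sketch. Everything in it is assumed for illustration: the style names, the stub policies, and the `gate` function stand in for trained networks.

```python
from typing import Callable, Dict, List, Tuple

Board = str  # placeholder for a real board representation
Policy = Callable[[Board], List[str]]  # returns moves ranked best-first


def gated_move(board: Board,
               gate: Callable[[Board], str],
               policies: Dict[str, Policy]) -> Tuple[str, str]:
    """First choose a style, then let that style's policy choose the move."""
    style = gate(board)
    ranked = policies[style](board)
    return style, ranked[0]


# Stub policies standing in for style-specific networks (hypothetical).
policies: Dict[str, Policy] = {
    "aggressive": lambda board: ["d1h5", "g1f3"],
    "positional": lambda board: ["g1f3", "d1h5"],
}


def gate(board: Board) -> str:
    # Stub gate; a real one would be learned and depend on the position.
    return "positional"


style, move = gated_move("startpos", gate, policies)
```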


zardo 3802 days ago

Or, rather than multiple policies, one policy that takes a player vector as an input along with the board position. Players that you predict will make the same move from a given board have their vectors adjusted toward each other and away from a random sample of other player vectors.

If it works, you would be able to perform player vector math à la word2vec. (No idea if it will work.)
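A very simplified sketch of that update rule, assuming plain lists as vectors and a fixed step size. This is a geometric pull/push, not the actual word2vec gradient, and `update`, `sq_dist`, and the player names are made up for the example. In practice the `negatives` would be drawn at random from the other players.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def sq_dist(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def update(vectors: Dict[str, Vector], a: str, b: str,
           negatives: Sequence[str], lr: float = 0.1) -> None:
    """Pull players a and b toward each other, then push both away
    from each player listed in `negatives`."""
    va, vb = vectors[a], vectors[b]
    new_a = [x + lr * (y - x) for x, y in zip(va, vb)]
    new_b = [y + lr * (x - y) for x, y in zip(va, vb)]
    for name in negatives:
        vn = vectors[name]
        new_a = [x - lr * (z - x) for x, z in zip(new_a, vn)]
        new_b = [y - lr * (z - y) for y, z in zip(new_b, vn)]
    vectors[a], vectors[b] = new_a, new_b


# Hypothetical players: alice and bob were predicted to make the same move.
vectors = {"alice": [0.0, 0.0], "bob": [1.0, 0.0], "carol": [0.0, 5.0]}
before_ab = sq_dist(vectors["alice"], vectors["bob"])
before_ac = sq_dist(vectors["alice"], vectors["carol"])

update(vectors, "alice", "bob", negatives=["carol"])

after_ab = sq_dist(vectors["alice"], vectors["bob"])
after_ac = sq_dist(vectors["alice"], vectors["carol"])
```

After one step, alice and bob end up closer together and alice ends up farther from carol.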


momerath 3802 days ago

I don't know a lot about chess, but I would try picking several prolific players with what seem to you to be different styles, and training a classifier to identify the player, as an experiment in viability.
