Hacker News new | ask | show | jobs
by infogulch 1454 days ago
What is the most compact representation for a complete chess game state? Can it be as low as 64 bits?
6 comments

It cannot(at least not efficiently). There's 64 squares which could contain any one of 6 pieces or nothing. So you would need multiple u64s. In addition you need to store things like castling rights since they depend on previous moves.

Engines these days tend to use bitboards, which is just keeping multiple u64s each corresponding to a board with all the white/black knights, kings, pawns, etc. This is then manipulated using various bitwise operations.

Edit: It's also worth noting that compactness is not the most important thing for board representation. Stockfish for instance only stores one copy of the position per thread, so the effect of any overhead there is negligible. It's much more important that querying and mutating the board state is fast.

I don't agree with the reasoning in either of the other two answers you've been given, but Shannon (1950) gave the number of chess positions as somewhere around 10^43, so 150+ bits. The position accounts for almost all of the state.

You could do better with variable-length encodings and perhaps represent all "interesting" "plausible" positions in 64 bits, but as others said, compactness is really not a top priority in representations for chess engines, it's much more important to be able to query and update the representation.

Shannon's estimate was based on very primitive methods; by generating random positions and using fairly advanced methods to see whether they are legal or not (ie., can you construct a proof game for it, or prove that it could never happen), you will get much closer. A group of people have been working on this, and their current best estimate is (4.822 +- 0.028) * 10^44, or a bit over 148 bits. (Amazingly enough, Shannon wasn't all that far off on this account! His estimated number of legal games seems much more dodgy, though.)

http://talkchess.com/forum3/viewtopic.php?f=7&t=77685&sid=e3...

Practically speaking, https://github.com/tromp/ChessPositionRanking gives a number between 0 and approx. 8.7 * 10^45 for any legal position, so it's only a couple of bits away from optimality.

> a bit over 148 bits

That would be 149 bits!

Did you double check the math?
Let me check again:

148 bits + 1 bit = 149 bits

> The position accounts for almost all of the state.

I would argue that a complete game state depends on previous game states for the sake of repitition stalemates. The current state must include a set of previous positions that can possibly reoccur.

Having nerded out on this question in the past, I found that Wikipedia had one of the better discussions, with a Huffman encoding scheme that seemed like it should be near optimal. Someone has since deleted it but it can be found in the history

https://en.m.wikipedia.org/wiki/Special:Diff/916225714

Heres a few naive options to get a ballpark idea:

A) fixed list of piece coordinates: 32 pieces × (3 bits for row + 3 bits for column + 1 bit for whether it's alive) = 224 bits

B) list of live pieces: (3 bit row + 3 bit column + 1 bit for color + 3 bits for which piece) × up to 32 pieces = up to 320 bits

C) 8x8 array of the board: 64 squares × (1 bit for color + 3 bits for which piece) = 256 bits

Then there's the gamestate that isn't directly visible on the board:

- whose turn it is: 1 bit.

- castling rights: A needs 2 bits. B and C can pack it in with the 3 bits for pieces. there's 6 pieces so there's room for two more, one of which is a rook that's eligible to castle with.

- en passant: A needs a flag for each pawn so 16 bits. B and C can also pack in with the other spare piece being a pawn that just advanced two spaces.

- promotion: B and C get it free. A needs another 3 bits per pawn to indicate if and what it's promoted to (so 48 bits max)

- 50 moves without capture/pawn move optional stalemate: 6 bit counter from 0 to 50.

- 75 moves without capture/pawn move forced stalemate: make that 7 bits

- repitition stalemates: don't see a way around needing to know the previous board states. you need a copy of everything except the 50 move counter for previous turns. you can prune states that are impossible to return to.

A: 301 + (294 × #repeatable_states) bits

B: 328 + (321 × #repeatable_states) bits

C: 264 + (257 × #repeatable_states) bits

I'm sure there's savings to be made further compressing any of these with variable length encodings though (see links in other replies). I especially like the idea of encoding information as illegal states.

edit: originally said 4 bits for row and column but it's 3.

This is pretty tremendous: https://codegolf.stackexchange.com/questions/19397/smallest-...

Chess computing is one area where there’s a lot of attention to hard details as well as the algorithms, lots of gems like this in the genre

Since the board itself is 64 squares and there's many pieces that can move to any square on the board it can't be that low.