by leet 2495 days ago
Yep. That is what a Deep Q-Network does.

Yeah, that's a good example. Take AlphaGo, for instance: without the deep network, the Q-learner's table would be massive, and that's just for a board game. Imagine trying to do that for systems like moving human body parts. Every single body configuration would be a state, and then you have every possible action from each state.
The table becomes unwieldy on much simpler tasks than that.

Consider a 3x3 board where each cell holds 3 bits of information, so each cell can be in 2^3 = 8 states. The whole board then has (2^3)^9 = 2^27 (about 134 million) different states.

Then multiply that by the number of actions per state. Suppose 9, since you can only change one tile at a time. Multiply again by 4 bytes per entry, assuming a float instead of a double, and you get about 4.8 GB of memory for whatever this simple problem is.
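
For anyone who wants to check that arithmetic, here is a quick sketch in Python. The constant and function names are made up for illustration; the numbers (3 bits per cell, 9 cells, 9 actions, 4-byte floats) are the ones assumed above.

```python
# Back-of-the-envelope size of a tabular Q-function for the 3x3 example.
# All names here are hypothetical; the figures come from the comment above.
BITS_PER_CELL = 3        # each cell can be in 2**3 = 8 states
CELLS = 3 * 3            # 3x3 board
ACTIONS_PER_STATE = 9    # assumption: change one tile at a time
BYTES_PER_VALUE = 4      # 32-bit float per Q-value


def count_states(bits_per_cell: int, cells: int) -> int:
    """Number of distinct board configurations."""
    return (2 ** bits_per_cell) ** cells


def q_table_bytes(states: int, actions: int, bytes_per_value: int) -> int:
    """Memory for one Q-value per (state, action) pair."""
    return states * actions * bytes_per_value


states = count_states(BITS_PER_CELL, CELLS)
total_bytes = q_table_bytes(states, ACTIONS_PER_STATE, BYTES_PER_VALUE)

print(f"states: {states}")                    # 2**27
print(f"table size: {total_bytes / 1e9:.2f} GB")
```

Note that 4.8 GB here is in decimal gigabytes; in binary units the same table is 4.5 GiB.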
You mathed wrong. A 3x3 board with 8 states per cell is 72 total states: 9x8, not 8^9.

Edit: I just reconsidered. Unless you mean that the state is the combination of all the cells, in which case you are right.

Yes, the state is the entirety of the board's configuration.
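
To make the 8^9 versus 9x8 distinction concrete, here is a rough sketch of one way to turn a whole board into a single row index for a Q-table. Reading the nine cells as digits of a base-8 number is just one possible encoding, and the function names are hypothetical.

```python
# One possible encoding (an assumption, not the only way): treat the nine
# cell values as digits of a base-8 number, so every distinct board
# configuration gets its own index in range(8**9).
STATES_PER_CELL = 8
CELLS = 9


def board_to_state(board):
    """Map a list of 9 cell values (each 0..7) to a single state index."""
    if len(board) != CELLS:
        raise ValueError("expected exactly 9 cells")
    index = 0
    for cell in board:
        if not 0 <= cell < STATES_PER_CELL:
            raise ValueError("cell value out of range")
        index = index * STATES_PER_CELL + cell
    return index


def state_to_board(index):
    """Inverse of board_to_state: recover the 9 cell values."""
    cells = []
    for _ in range(CELLS):
        index, cell = divmod(index, STATES_PER_CELL)
        cells.append(cell)
    return cells[::-1]


largest = board_to_state([7] * CELLS)
print(largest + 1 == STATES_PER_CELL ** CELLS)  # True: 8**9 states, not 72
```

The 72 figure counts (cell, value) pairs, which is what you would get if each cell were its own independent state. The table needs one row per full configuration, which is where the exponent comes from.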