by lukehack
1916 days ago
The stack is basic. I develop on an old Lenovo laptop and test for a few dozen frames (you can learn a lot without a CUDA GPU) before pushing to a desktop with a cheap NVIDIA card. It uses PyTorch and PyBoy, and the model is just a couple of Conv2d expansions and compressions feeding a Linear layer that outputs the predicted reward for certain keypresses (basically). Training is based on deep Q-learning. I look at a PyTorch tutorial [1] when I get stuck, but I try to fumble through it myself as much as possible before looking. I have an idea to make the Q-training propagation variable, scaled by the amplitude of the reward, so that bigger rewards propagate more, but I haven't gotten there yet. Here is a great video on reinforcement learning [2].

[1] https://pytorch.org/tutorials/intermediate/mario_rl_tutorial...
[2] https://www.youtube.com/watch?v=93M1l_nrhpQ&t=3381
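For anyone who wants something concrete, here is a rough sketch of the kind of model and update described above. It is not the actual code from this project. The frame shape, layer sizes, number of actions, discount factor, and the names `QNet` and `td_step` are all assumptions made for illustration, and it leaves out the replay buffer and target network that a fuller deep Q-learning setup would normally have.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: none of these numbers come from the comment above.
N_ACTIONS = 8                 # e.g. up, down, left, right, A, B, start, select
FRAME_SHAPE = (1, 144, 160)   # one grayscale frame as (channels, height, width)
GAMMA = 0.99                  # discount factor (assumed)


class QNet(nn.Module):
    """Two Conv2d layers, then a Linear layer giving one Q-value per keypress."""

    def __init__(self, n_actions: int = N_ACTIONS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(FRAME_SHAPE[0], 16, kernel_size=8, stride=4),  # expand: 1 -> 16 channels
            nn.ReLU(),
            nn.Conv2d(16, 8, kernel_size=4, stride=2),               # compress: 16 -> 8 channels
            nn.ReLU(),
            nn.Flatten(),
        )
        # Infer the flattened feature size by running a dummy frame through the conv stack.
        with torch.no_grad():
            n_flat = self.features(torch.zeros(1, *FRAME_SHAPE)).shape[1]
        self.head = nn.Linear(n_flat, n_actions)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(frames))


def td_step(model, optimizer, state, action, reward, next_state, done):
    """One Q-learning update on a batch of transitions (no target network)."""
    # Q-value the model currently assigns to the action that was actually taken.
    q_pred = model(state).gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target: reward plus discounted best Q-value of the next state,
        # with the bootstrap term zeroed out on terminal transitions.
        q_next = model(next_state).max(dim=1).values
        target = reward + GAMMA * q_next * (1.0 - done)
    loss = nn.functional.smooth_l1_loss(q_pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Smoke test on random data; runs on a CPU-only machine.
model = QNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
batch = 4
state = torch.rand(batch, *FRAME_SHAPE)
next_state = torch.rand(batch, *FRAME_SHAPE)
action = torch.randint(0, N_ACTIONS, (batch,))
reward = torch.rand(batch)
done = torch.zeros(batch)
loss = td_step(model, optimizer, state, action, reward, next_state, done)
```

Running this for a handful of random batches is enough to catch shape mistakes on a laptop before moving to a GPU. In a real setup the frames would presumably come from the emulator rather than `torch.rand`.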