by ldoughty
30 days ago
Really cool! But right as it was nearing 4,000, it seems to have corrupted itself and no longer got any scores above 0. Not sure if that's a code bug or a neural net issue. Stats at the time:
avg500: -4.6 (last 500 episodes)
peak: 3959.3 (best window)
roll/s: 20.68 (20-step avg)
progress: 4388 (562749 episodes)
|
But at around 4K avg score you should see it solve the env almost every time.
Just a demo :) optimized for speed over stability.
Reward structure: step -1, dot +100, win +1000, so ~4k is the max theoretical score on 6x6.
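
For anyone who wants to sanity-check the ~4k figure, here is a rough sketch of the arithmetic in Python. The three reward values are the ones quoted above. The dot count and step count in the example are made-up numbers for illustration, not taken from the demo, and `episode_return` is a hypothetical helper, not the demo's actual code.

```python
# Reward values as quoted in the comment above.
STEP_REWARD = -1
DOT_REWARD = 100
WIN_REWARD = 1000


def episode_return(dots_eaten: int, steps: int, won: bool) -> int:
    """Total reward for one episode under the quoted reward structure."""
    win_bonus = WIN_REWARD if won else 0
    return dots_eaten * DOT_REWARD + steps * STEP_REWARD + win_bonus


# Hypothetical winning episode: 30 dots and 40 steps are assumed values.
print(episode_return(dots_eaten=30, steps=40, won=True))  # 3960

# Hypothetical failed episode: no dots eaten, only step penalties accrue.
print(episode_return(dots_eaten=0, steps=5, won=False))  # -5
```

Under those assumed numbers a clean win lands just under 4k, while an episode that never reaches a dot only collects step penalties, which would give a slightly negative average like the one reported above.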