by yvdriess 2911 days ago

This is entirely speculation, but that is typically what you end up doing when looking for the cause of an RL/NN model's behavior :)

If you consider the action space, removing the 'do-nothing' action could provide a learning benefit. Consider a set of models that happen to set the do-nothing weights to zero, but manage to achieve a similar effect using quick left/right movement. Perhaps these models train slightly better, meaning that in the same number of training iterations they reach a better score than the models that do consider the do-nothing action.
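To make that concrete, here is a toy sketch of the idea, not anything taken from the actual model. The three-action set, the logit values, and the helper names (`mask_action`, `net_displacement`) are all made up for illustration. It shows that a policy whose do-nothing weight is effectively switched off puts zero probability on that action, and that alternating left/right still nets out to standing still.

```python
import math

# Hypothetical action set; the real game's action space is larger.
ACTIONS = ["noop", "left", "right"]


def softmax(logits):
    """Turn raw action scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_action(logits, index):
    """Return a copy of the logits with one action made unselectable."""
    masked = list(logits)
    masked[index] = float("-inf")  # exp(-inf) == 0.0, so its probability is 0
    return masked


def net_displacement(actions):
    """Sum the horizontal effect of a sequence of actions (assumed +/-1 per step)."""
    step = {"noop": 0, "left": -1, "right": 1}
    return sum(step[a] for a in actions)


# Made-up logits for illustration only.
logits = [0.2, 1.0, 1.0]
probs = softmax(mask_action(logits, ACTIONS.index("noop")))

# Jittering left/right approximates doing nothing without using "noop".
jitter = ["left", "right"] * 4

print(dict(zip(ACTIONS, probs)))
print(net_displacement(jitter))
```

Whether a smaller effective action space actually trains faster here is the speculative part; the sketch only shows that jitter can stand in for the missing do-nothing action.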

Checking the video, you do see the person waiting from time to time. Perhaps this is an artifact of its demonstration learning episodes?

I do not see their Dota2 bot do this jitter movement. (Interestingly, the official/vanilla Dota2 bot does have this jitter!) This is likely because there is a benefit in being economical in your movements in that game: turning takes time. I postulate that an OpenAI bot for League of Legends, where turning is instant and free, would exhibit the jitter movements ;)

edit: Alternatively, inspired by the 'fly-by-wire' sibling comment: maybe spamming the emulator with left/right actions does provide a slight benefit. It wouldn't be the first time an AI finds video game exploits[1].

[1] https://arstechnica.com/gaming/2013/04/this-ai-solves-super-...