| HN Mirror

by rebeccaskinner 439 days ago

> While it's possible to bake in this particular inductive bias (repetitive actions might be useful), they decided not to (it's just not that interesting).

What's interesting to me about this is that the problem seems really aligned with the research they are doing. From what I can tell, they built a system where the agent has a simplified "mental" model of the game world and uses it to predict actions that will lead to better rewards.

I don't think what's missing here is teaching the model that it should just try to do things a lot until they succeed. Instead, what I think is missing is the context that it's playing a game, and what that means.

For example, any human player who sits down to play Minecraft is likely to hold down the button to mine something. Younger children might also hold the jump button down and jump around aimlessly, but older children and adults probably wouldn't. Why? I suspect it's because people with experience in video games have set expectations for how game designers communicate the gameplay experience. We understand that clicking on things to interact with them is a common mode of interaction, and we expect that games have upgrade mechanics that will let us work faster or interact with higher-level items. It's not that we repeat any action arbitrarily to see whether it pays off, but rather that we're speaking a language of games, modeling the mind of the game designers, and anticipating what they expect from us.

I would think that expanding the model of the world to include this notion of the language of games might be a better approach to overcoming the limitation than just hard-coding the model to try things over and over again to see if there's a payoff.