| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by qwertywert_ 1840 days ago

Yea. For animals, reproduction and just surviving is the reward function?

It talks a lot about having a rich enough environment for learning which makes sense, if a computer lives only in a Go board it can only learn go playing itself.

How do you simulate a rich enough environment purely in software (or do you sense input from the "real" environment) and what reward do we define in this complex environment.. It seems to ask those 2 questions in the discussion but kind of glosses over them imo.

1 comments

visarga 1840 days ago

You put many agents in the same environment, agents are both actors and the environment.

link