by throwaway314155 471 days ago
Awesome! Why do you think the reward for reading signs helped? I'm assuming the model doesn't gain the ability to read and understand English just from RL, so what purpose does it serve, other than maybe wasting ticks on signs that ultimately don't need to be read?

It's silly, but signs were a way to incentivize the agent to explore deeper into the Safari Zone, among other areas.