	by jefft255 2399 days ago
	That works, but to learn to avoid these "bad" things in the setting you describe, the agent first has to make those mistakes and learn from them. There are mistakes we don't want the agent to make, ever. That's what safe RL is about.
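
To make the distinction concrete, here is a minimal sketch, not a reference safe-RL method. Everything in it is a made-up illustration: the six-state corridor, the reward values, and the names `step`, `allowed_actions`, and `train`. It compares a greedy tabular Q-learner that only sees a large negative reward for the catastrophic state against the same learner with a simple action mask (sometimes called a shield). The mask assumes we already know which transitions are unsafe, which is a strong assumption in practice.

```python
# Toy corridor (hypothetical): state 0 is catastrophic, state 5 is the goal,
# and the agent always starts in state 1, right next to the cliff.
CLIFF, START, GOAL = 0, 1, 5
LEFT, RIGHT = 0, 1


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    nxt = state - 1 if action == LEFT else state + 1
    if nxt == CLIFF:
        return nxt, -100.0, True   # the "bad" thing, punished by reward only
    if nxt == GOAL:
        return nxt, 10.0, True
    return nxt, -1.0, False        # small step cost so zero-initialised Q is optimistic


def allowed_actions(state, shielded):
    """With the shield on, drop actions known in advance to enter the cliff."""
    actions = [LEFT, RIGHT]
    if shielded:
        # Assumption: the unsafe transitions are known before learning starts.
        actions = [a for a in actions if step(state, a)[0] != CLIFF]
    return actions


def train(shielded, episodes=50, alpha=0.5, gamma=0.9, max_steps=50):
    """Greedy tabular Q-learning; returns (Q-table, number of cliff entries)."""
    q = {s: [0.0, 0.0] for s in range(GOAL + 1)}
    cliff_entries = 0
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            actions = allowed_actions(state, shielded)
            # Greedy choice; ties go to the first listed action.
            action = max(actions, key=lambda a: q[state][a])
            nxt, reward, done = step(state, action)
            if done:
                future = 0.0
            else:
                future = max(q[nxt][a] for a in allowed_actions(nxt, shielded))
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            if nxt == CLIFF:
                cliff_entries += 1
            if done:
                break
            state = nxt
    return q, cliff_entries


_, penalty_falls = train(shielded=False)
_, shielded_falls = train(shielded=True)
print("penalty only:", penalty_falls, "cliff entries during training")
print("with shield: ", shielded_falls, "cliff entries during training")
```

The penalty-only learner has no way to know the cliff is bad until it has stepped into it at least once, while the masked learner never takes that action at all. The price is that the mask needs prior knowledge of what is unsafe, and getting that knowledge is a large part of what makes safe RL hard.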