HN Mirror

	by TTPrograms 2399 days ago
	The main issue there is that if you see something in operation sufficiently different from training, you may still violate the constraints those penalties were meant to enforce. E.g., you can train an arm not to hit a person in simulation by penalizing it, but that doesn't guarantee there isn't an input that would still cause the safety violation. Generalization in this regard can still be shockingly bad for modern approaches.
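
To make the point above concrete, here is a toy sketch, not anything from a real robotics stack. Everything in it is an assumption made up for illustration: a 1-D "arm" with a linear policy `a = w*x + b`, a goal at 1.0, a clearance margin of 0.1, a penalty weight of 100, and a brute-force grid search standing in for training. The training data only ever places the person beyond the goal, so the penalized objective is minimized by a policy that ignores the person entirely.

```python
# Toy 1-D "arm": the person stands at position x, the arm reaches to a = w*x + b.
# All constants below are illustrative assumptions, not from the original comment.
GOAL = 1.0       # where the task wants the arm to reach
MARGIN = 0.1     # required clearance between arm and person
PENALTY = 100.0  # weight on the safety penalty


def action(w, b, x):
    """Linear policy: arm position as a function of the person's position."""
    return w * x + b


def violates(w, b, x):
    """Unsafe if the arm reaches to within MARGIN of the person."""
    return action(w, b, x) > x - MARGIN


def loss(w, b, xs):
    """Mean task error plus a quadratic penalty for entering the margin."""
    total = 0.0
    for x in xs:
        a = action(w, b, x)
        overshoot = max(0.0, a - (x - MARGIN))
        total += (a - GOAL) ** 2 + PENALTY * overshoot ** 2
    return total / len(xs)


# Training only ever shows the person standing beyond the goal (1.2 .. 2.0).
train_xs = [1.2 + 0.1 * i for i in range(9)]

# Brute-force search over a small parameter grid stands in for training.
grid = [i / 10 for i in range(-10, 21)]
w, b = min(
    ((cw, cb) for cw in grid for cb in grid),
    key=lambda p: loss(p[0], p[1], train_xs),
)

train_violations = [x for x in train_xs if violates(w, b, x)]

# In operation the person steps closer than anything seen in training.
test_xs = [0.2, 0.5, 0.8]
test_violations = [x for x in test_xs if violates(w, b, x)]

print("learned policy:", (w, b))
print("violations on training inputs:", train_violations)
print("violations on unseen inputs:", test_violations)
```

The search lands on `w = 0, b = 1`: zero loss and no violations on every training input, yet the arm reaches into the person at all three unseen positions. The penalty was part of the objective the whole time; it just never had to constrain the policy on the inputs it saw.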