by srikanthsrnvs 2396 days ago
The reinforcement learning ones are SO common. When I worked on autonomous tech a while ago, I remember constantly tweaking my reward function because the agent kept learning that avoiding obstacles was rewarding enough in itself, so it just learned to go in circles forever.
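
For anyone who hasn't run into this, here is a toy sketch of why circling can win. It is not my actual setup: the per-step avoidance bonus, the goal reward, the horizon, and the discount factor are all made-up numbers, and the two "policies" are just hard-coded reward sequences. It only compares the discounted return of an agent that circles safely for the whole episode against one that reaches the goal and ends the episode.

```python
# Toy illustration of reward hacking: all constants below are hypothetical.
GAMMA = 0.99          # assumed discount factor
HORIZON = 200         # assumed max episode length in steps
STEPS_TO_GOAL = 20    # assumed steps a goal-seeking agent needs
GOAL_REWARD = 10.0    # assumed one-time reward for reaching the goal


def discounted_return(rewards, gamma=GAMMA):
    """Sum of gamma**t * r_t over the episode."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def circling_rewards(avoid_bonus):
    """Agent circles in open space: never collides, never finishes."""
    return [avoid_bonus] * HORIZON


def goal_seeking_rewards(avoid_bonus):
    """Agent drives to the goal without colliding; episode ends on arrival."""
    return [avoid_bonus] * (STEPS_TO_GOAL - 1) + [avoid_bonus + GOAL_REWARD]


def compare(avoid_bonus):
    """Return (circling return, goal-seeking return) for a given bonus."""
    circle = discounted_return(circling_rewards(avoid_bonus))
    goal = discounted_return(goal_seeking_rewards(avoid_bonus))
    return circle, goal


if __name__ == "__main__":
    for bonus in (1.0, 0.0):
        circle, goal = compare(bonus)
        winner = "circling" if circle > goal else "goal-seeking"
        print(f"avoid_bonus={bonus}: circle={circle:.2f} goal={goal:.2f} -> {winner}")
```

With these made-up numbers, a per-step bonus for "not hitting anything" makes the long, pointless episode worth more than finishing early, and removing the bonus flips the ordering. A real reward function has far more moving parts, so treat this as intuition only.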