| Similar story of unexpected AI outcomes... As part of my PhD research, I created a simplified Pac-Man style game where the agent would simply try to stay alive as long as possible whilst being chased by the 3 ghosts. The agent was unmotivated and understood nothing about the goal, but was optimising to maximise its observable control over the world (avoiding death is a natural outcome of this). I spent some time trying to debug a behaviour where the agent would simply move left and right at the start of each run, waiting for the ghosts to close in. At the last minute it would run away, but always with a ghost in the cell right behind it. Eventually, I realised this was an outcome of what it was optimising for. When ghosts reached a crossroads in the world they would go left or right randomly (if both routes were the same distance from catching the agent). This randomness reduced the agent's control over the world, so was undesirable. Bringing a ghost in close made that ghost's behaviour completely predictable. |
The cars' "fitness" function rewarded cars for driving along the course and punished them for crashing into walls. But evidently this function punished crashes a little too severely: the most successful cars would just drive in tight circles and never make progress on the course. But they were sure to avoid walls. :)
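
To make the failure mode concrete, here is a minimal sketch of how a fitness function like this can end up preferring circles. The original function isn't given, so everything here is an assumption for illustration: the `Run` record, the `fitness` formula (progress minus a flat crash penalty), and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Outcome of one car's attempt (hypothetical record, not from the original)."""
    progress: float  # distance travelled along the course
    crashed: bool    # whether the car hit a wall


def fitness(run: Run, crash_penalty: float) -> float:
    # Assumed form: reward progress, subtract a flat penalty for a crash.
    return run.progress - (crash_penalty if run.crashed else 0.0)


# Two illustrative cars with made-up numbers.
explorer = Run(progress=10.0, crashed=True)   # makes headway, then hits a wall
circler = Run(progress=0.0, crashed=False)    # drives in tight circles, never crashes


def best(crash_penalty: float) -> str:
    """Return the name of the car with the highest fitness."""
    runs = {"explorer": explorer, "circler": circler}
    return max(runs, key=lambda name: fitness(runs[name], crash_penalty))
```

With these made-up values, `best(5.0)` picks the explorer, while `best(50.0)` picks the circler: once the penalty outweighs any progress a car is likely to make before crashing, doing nothing safely scores highest.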