Even worse, if simulations are used, you now have two problems: formulating correct incentives, and preventing the agent from exploiting flaws in the simulation.
Isn’t this true about all systems, not just “AI”? The definition of a software bug is an unintended behavior. In a large system, myriad intents overlap and combine in unexpected ways. You might imagine a complex enough system where the confidence that a modification doesn’t introduce an unintended behavior is near zero.
While obviously I've got the advantage of hindsight here, it seems like it should not have taken three days of analysis to see why the wolves were committing suicide. It seems obvious once the point system is explained. Perhaps some rubber-duck debugging might have helped in this case.
I think the point is more about highlighting the fact that AI doesn't share our base assumptions. We wouldn't think to put a huge penalty on dying because humans generally think that death is bad.
We don't receive a penalty for dying. The difference between suicidal humans and suicidal AIs is that suicidal AIs keep respawning, i.e. they are effectively immortal.
Genetic algorithms make a great comparison. In essence, any genome whose wolf commits suicide doesn't make it to the next generation. That is the equivalent of an enormous score penalty, and entirely analogous to how it works for actual life.
Genetic algorithms are based on the same reward/cost function setup. They could easily arrive at the same conclusion because suicide might be the dominant strategy.
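The two comments above can both be right, depending on how selection is done. Here is a minimal sketch, not anyone's actual setup: a toy genetic algorithm where the only gene is the second at which a wolf crashes, and no sheep are reachable. The 20-second episode and -1 per second come from this thread; `CRASH_COST`, the population size, the mutation rule and every function name are assumptions made up for illustration.

```python
import random

EPISODE_SECONDS = 20  # episode length quoted in this thread
TIME_COST = -1        # per-second cost quoted in this thread
CRASH_COST = -10      # assumed boulder cost (hypothetical value)


def score(crash_at):
    """Score of a wolf that finds no sheep and crashes at `crash_at` seconds.

    crash_at == EPISODE_SECONDS means the wolf never crashes.
    """
    total = TIME_COST * crash_at
    if crash_at < EPISODE_SECONDS:
        total += CRASH_COST
    return total


def select_by_score(population):
    """Reward-style selection: keep the top quarter by score."""
    ranked = sorted(population, key=score, reverse=True)
    return ranked[: len(population) // 4]


def select_by_survival(population):
    """Life-style selection: only wolves that never crashed reproduce."""
    return [gene for gene in population if gene >= EPISODE_SECONDS]


def mutate(gene, rng):
    """Shift the crash time by at most one second, clamped to the episode."""
    return max(0, min(EPISODE_SECONDS, gene + rng.choice((-1, 0, 1))))


def evolve(select, generations=30, size=42, seed=0):
    rng = random.Random(seed)
    # Start with every possible crash time represented at least once.
    population = [i % (EPISODE_SECONDS + 1) for i in range(size)]
    for _ in range(generations):
        parents = select(population)  # parents are carried over unchanged
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(size - len(parents))]
        population = parents + children
    return select(population)


if __name__ == "__main__":
    print("score selection, best gene:", max(evolve(select_by_score), key=score))
    print("survival selection, genes:", set(evolve(select_by_survival)))
```

With these assumed numbers, score-based selection settles on crashing at second 0, while survival-based selection keeps only wolves that never crash. The gene pool is the same in both runs; only the selection rule differs.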
Humans don't put a huge penalty on dying. We discount it and assume/pretend that once we've had a good long life then death is okay and euthanasia is preferable to suffering with no hope of recovery. AI wolves that can live for 20 seconds are unwilling to suffer -1 per second with no hope of sheep.
Perhaps the PhD student wasn't trying to make an AI that wins at pac-man, but investigating something else. They mention "maximizing control over environment".
One of the most typical scenarios studied in those wolf/sheep models (like http://www.netlogoweb.org/launch#http://ccl.northwestern.edu... ) is finding the best conditions for "balance" between sheep and wolves: too many wolves and the sheep go extinct, and later the wolves starve. Too many sheep and the sheep don't get enough food and also die, taking the wolves with them.
If you add your penalty, and a deficit of nearby sheep, you'd expect a trifurcation of strategy: hoarders that consume the nearby sheep immediately, explorers that bet on sheep further afield, and suicides from those that have evaluated the -100 penalty to still be optimal.
That same observation, with the exact same -100 points recommendation on crashing into a boulder, was indeed also made by a commentator on social media.
No, it's a cock up with the source of the wolves. If you could respawn endlessly after death would you fear it? You'd just want the stupid game to end before you lose points from the timer.
Let's say you are a human player playing the wolf and sheep game. The score achieved in the game decides your death in real life. Note the stark difference. Dying in the game is not the same thing as dying in real life.
If there is an optimal strategy in the game that involves dying in the game you are going to follow it regardless of whether you are a human or an AI. By adding an artificial penalty to death you haven't changed the behavior of the AI, you have changed the optimal strategy.
The human player and the AI player will both follow the optimal strategy to keep themselves alive. For the AI, "staying alive" doesn't mean staying alive in the game, it means staying alive in the simulation. Thus even a death-fearing AI would follow the suicide strategy if that is the optimal strategy.
It is impossible to conclude from the experiment whether the AI doesn't fear death and thus willingly commits suicide, or whether it fears death so much that it follows an optimal strategy that involves suicide.
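To put rough numbers on "you have changed the optimal strategy": a small sketch using the 20-second episode and -1 per second mentioned earlier in the thread, and the -100 death penalty that was suggested. The strategy names, the `crash_cost` default of 0 and the assumption that no sheep is reachable are all mine, not from the original experiment.

```python
EPISODE_SECONDS = 20  # episode length quoted in this thread
TIME_COST = -1        # per-second cost quoted in this thread


def returns(death_penalty, crash_cost=0):
    """Total score of each strategy when no sheep can be reached.

    crash_cost is an assumed baseline boulder cost (hypothetical, default 0).
    """
    return {
        "crash_now": crash_cost + death_penalty,
        "wait_it_out": TIME_COST * EPISODE_SECONDS,
    }


def best_strategy(death_penalty):
    """The same score-maximising agent, evaluated under a given death penalty."""
    options = returns(death_penalty)
    return max(options, key=options.get)


if __name__ == "__main__":
    print(best_strategy(0))     # no extra penalty on dying
    print(best_strategy(-100))  # with the suggested -100 penalty
```

The agent is the same `max` over scores in both calls. Under these assumptions it picks `crash_now` with no penalty and `wait_it_out` with the -100 penalty, so the penalty moves the optimum without telling you anything about what the agent "fears".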
It’s easy to sort out in narrowly specified areas, but an extremely hard problem as the tasks become more general.