by birthcert
2989 days ago
You may be interested in the field of adversarial reinforcement learning, in which an agent operates in the presence of a destabilizing adversary that applies disturbance forces to its system.

See also the adversarial bandit:

> Another variant of the multi-armed bandit problem is called the adversarial bandit, first introduced by Auer and Cesa-Bianchi (1998). In this variant, at each iteration an agent chooses an arm and an adversary simultaneously chooses the payoff structure for each arm. This is one of the strongest generalizations of the bandit problem as it removes all assumptions of the distribution and a solution to the adversarial bandit problem is a generalized solution to the more specific bandit problems.

Good robust RL algorithms are able to learn in the presence of adversarial noise.

Correct information is information that allows you to compress reality better. When an agent is able to compress reality better (that is, it has access to a world model that generalizes better), it will be rewarded. Put another way, correct information is information that helps an agent better optimize its policy function.

You actually hit on an interesting angle of research, and you will probably be vindicated in the near future, when adversarial images (those that fool state-of-the-art image classifiers into failing) are followed by adversarial agents (those that fool other agents into making bad decisions). However, this research was not about multi-agent systems, though the opponents (those that shoot fireballs and try to kill the agent) can already be seen as adversaries to the agent's goal of staying alive longer.
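
To make the adversarial bandit setting quoted above a bit more concrete, here is a minimal sketch of Exp3, a standard algorithm for that setting. It is not mentioned in the quoted text and has nothing to do with the research discussed in this thread. The function name `exp3`, the `reward_fn(t, arm)` callback, and the `gamma` default are my own assumptions for illustration, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def exp3(num_arms, num_rounds, reward_fn, gamma=0.1, rng=None):
    """Sketch of Exp3 for the adversarial bandit.

    reward_fn(t, arm) is a hypothetical callback standing in for the
    adversary; it must return a reward in [0, 1] for the chosen arm.
    """
    rng = rng or random.Random(0)
    weights = [1.0] * num_arms
    pulls = [0] * num_arms
    total_reward = 0.0

    for t in range(num_rounds):
        # Mix the weight-based distribution with uniform exploration.
        total_w = sum(weights)
        probs = [(1 - gamma) * w / total_w + gamma / num_arms for w in weights]

        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = reward_fn(t, arm)

        # Importance-weighted estimate: only the pulled arm is observed,
        # so dividing by its probability keeps the estimate unbiased.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)

        # Rescale so the weights stay bounded over long runs.
        max_w = max(weights)
        weights = [w / max_w for w in weights]

        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    # Toy payoff sequence fixed in advance: arm 2 always pays,
    # the other arms pay only on every seventh round.
    def payoff(t, arm):
        return 1.0 if arm == 2 or t % 7 == 0 else 0.0

    print(exp3(num_arms=3, num_rounds=2000, reward_fn=payoff))
```

The point of the sketch is that nothing in the update assumes the payoffs come from a fixed distribution: the agent only reweights arms by what it actually observed, which is why the same procedure still makes sense when an adversary picks the payoffs.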