The method was called 'empowerment'. There are two ways to explain it.
From a mathematical perspective, we used Information Theory to model the world as an information-theoretic 'loop'. The agent could 'send' a signal to the world by performing an action, which would change the state of the world; the state of the world was what the agent 'received'. This obviously relies on having a model of the world and of what your actions will do, but it doesn't burden the model with other biases.
Put more colloquially, the agent could perform actions in the world and see the resulting state of the world (in my case, that was the location of the agent and of the ghosts). Part of the principle was that changes you cannot observe are not useful to you.
In an active inference approach you would have the agent minimise surprisal: choose the action that is most likely to produce the outcome you predicted.
The approach I used was similar. The idea of maximising observed control of the world means you seek states where you can reach many other states, but _predictably_ so. This comes 'for free' when using Information Theory to model a channel.
Do you have any reading you'd recommend related to this?
I naively thought it would be some kind of Kalman filtering, but from what I gather from your words it doesn't even have to be "that" complicated, right?
What's the tradeoff between "delete all state in the world with 100% certainty" and "be able to choose any next state of the world with (100-epsilon)% certainty"?
In Information Theory, there is a concept of Channel Capacity. If a channel is defined as the probability of the output being s given that you send a, for every possible a and s, then the Channel Capacity is the maximum amount of information you can communicate across this channel per use, measured in bits.
To achieve the Channel Capacity you need to find the optimum distribution across a - i.e. what set of signals maximises the information you can transmit on this channel. There are known algorithms for finding this distribution (e.g. Blahut-Arimoto).
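For reference, the standard textbook definition can be written out as below. The symbols A and S (random variables for the signal sent and the output received) and p(a) (the input distribution being optimised) are my notation, not something from the comment above; P(s|a) is the channel as described there.

```latex
C \;=\; \max_{p(a)} I(A;S)
  \;=\; \max_{p(a)} \sum_{a}\sum_{s} p(a)\,P(s \mid a)\,
        \log_2 \frac{P(s \mid a)}{\sum_{a'} p(a')\,P(s \mid a')}
```

The maximisation runs only over p(a); the channel P(s|a) is fixed. Blahut-Arimoto is an iterative way of finding the maximising p(a).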
Now if you model the world as a channel, where s represents the reachable states and a represents the actions the agent can take (and the channel, P(s|a), represents the dynamics of the world), you can calculate what actions allow you maximal control (in terms of states you can controllably reach).
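To make this concrete, here is a minimal sketch of Blahut-Arimoto applied to a small P(s|a) table, on the reading that empowerment is the capacity of that channel. The function name `channel_capacity`, the list-of-rows layout (one row per action, one column per state), and the two toy worlds are assumptions made up for illustration; this is not the code from the project described above.

```python
import math

def channel_capacity(p_s_given_a, tol=1e-12, max_iter=10_000):
    """Blahut-Arimoto sketch. Rows are actions, columns are states.

    Returns (capacity in bits, optimising distribution over actions).
    """
    n_actions = len(p_s_given_a)
    n_states = len(p_s_given_a[0])
    p_a = [1.0 / n_actions] * n_actions  # start from a uniform guess

    def divergences(dist):
        # Marginal over states induced by the action distribution.
        p_s = [sum(dist[a] * p_s_given_a[a][s] for a in range(n_actions))
               for s in range(n_states)]
        # KL divergence (bits) of each action's outcome row from the marginal.
        return [sum(p * math.log2(p / p_s[s])
                    for s, p in enumerate(row) if p > 0)
                for row in p_s_given_a]

    for _ in range(max_iter):
        d = divergences(p_a)
        # Reweight actions towards those whose outcomes are most distinctive.
        new_p_a = [p_a[a] * 2.0 ** d[a] for a in range(n_actions)]
        total = sum(new_p_a)
        new_p_a = [w / total for w in new_p_a]
        converged = max(abs(x - y) for x, y in zip(new_p_a, p_a)) < tol
        p_a = new_p_a
        if converged:
            break

    d = divergences(p_a)
    return sum(p * x for p, x in zip(p_a, d)), p_a

# Toy world 1 (hypothetical): every action collapses the world to state 0.
collapse = [[1.0, 0.0, 0.0, 0.0]] * 4

# Toy world 2 (hypothetical): action i reaches state i with probability
# 1 - eps, and otherwise lands on one of the other three states.
eps = 0.01
noisy = [[1 - eps if s == a else eps / 3 for s in range(4)] for a in range(4)]

cap_collapse, _ = channel_capacity(collapse)
cap_noisy, _ = channel_capacity(noisy)
print(f"collapse: {cap_collapse:.4f} bits, noisy: {cap_noisy:.4f} bits")
```

The two toy worlds are meant to speak to the tradeoff asked about earlier: the "collapse" world has a capacity of 0 bits, because the outcome carries no information about which action was chosen, however certain it is. The noisy world comes out just under log2(4) = 2 bits, since each action still picks out its own state almost reliably.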