Just because an agent “lives” in an environment doesn’t make it RL. It needs a reward function, or even better, a proper environment interface like Gym.
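To make that concrete, here is a rough sketch of what "a reward function plus a Gym-like interface" could look like. It does not import Gym itself. It only mimics the `reset()` / `step()` shape used by recent Gym/Gymnasium versions, and `CorridorEnv`, `run_episode`, and the reward values are all made up for illustration.

```python
class CorridorEnv:
    """Hypothetical 1-D corridor: the agent starts at cell 0, the goal is the last cell."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.pos = 0
        self.steps = 0

    def reset(self):
        # Return the initial observation and an (empty) info dict.
        self.pos = 0
        self.steps = 0
        return self.pos, {}

    def step(self, action):
        # action: 0 = move left, 1 = move right (assumed encoding).
        move = 1 if action == 1 else -1
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        self.steps += 1

        terminated = self.pos == self.length - 1
        truncated = self.steps >= self.max_steps and not terminated

        # The reward function: this signal is what turns "an agent moving
        # around in an environment" into an RL problem.
        reward = 1.0 if terminated else -0.1
        return self.pos, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Roll out one episode and return the total (undiscounted) reward."""
    obs, _ = env.reset()
    total = 0.0
    done = False
    while not done:
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total
```

Something like `run_episode(CorridorEnv(), lambda obs: 1)` then gives you a return you can actually optimize against. Without the `reward` coming back from `step()`, there is nothing for a learning algorithm to maximize.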