by DanteVertigo 1478 days ago
"and investigate how much of the usual machinery of reinforcement learning algorithms can be replaced with the tools..."

This sounds as if the authors have confidently figured out that the current reinforcement learning formulation is not good enough.

On the other hand, I think the recent large language models have shown us that much of the world knowledge is indeed predictive. That is, if you can predict accurately (next words), you can understand higher, more abstract things. The hypothesis that much of world knowledge is predictive is very important in the framework of reinforcement learning, because it means that with enough general value functions learned off-policy, one can predict almost anything about the world that is useful to the agent in achieving its goals (cf. the Horde paper).
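To make the idea of a general value function a bit more concrete, here is a minimal tabular sketch. It is not the code from the Horde paper, only an illustration under stated assumptions: the names `td0_gvf`, `cumulant` and `gamma` are hypothetical, the environment is a made-up three-state chain, and the off-policy correction is plain per-step importance sampling with TD(0).

```python
def td0_gvf(transitions, n_states, cumulant, gamma, alpha=0.1):
    """Tabular TD(0) estimate of one general value function (illustrative sketch).

    transitions: iterable of (s, s_next, rho), where rho is the importance
                 ratio pi(a|s) / b(a|s) of the action taken (1.0 if on-policy).
    cumulant:    function s_next -> signal being predicted (hypothetical name).
    gamma:       function s_next -> continuation factor in [0, 1].
    """
    v = [0.0] * n_states
    for s, s_next, rho in transitions:
        # One-step TD target: observed cumulant plus the discounted next prediction.
        target = cumulant(s_next) + gamma(s_next) * v[s_next]
        # The importance ratio reweights the update toward the target policy.
        v[s] += alpha * rho * (target - v[s])
    return v


# Made-up chain 0 -> 1 -> 2, where state 2 is terminal.
# The question asked: "how soon (discounted) will I reach state 2?"
episode = [(0, 1, 1.0), (1, 2, 1.0)]
v = td0_gvf(
    episode * 500,
    n_states=3,
    cumulant=lambda s: 1.0 if s == 2 else 0.0,
    gamma=lambda s: 0.0 if s == 2 else 0.9,
)
print(v)  # predictions settle close to 0.9 for state 0 and 1.0 for state 1
```

The point of the sketch is that the prediction target is built only from what the agent itself observes (the cumulant and the next state), so no external teacher supplies labels.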

1 comments

> much of the world knowledge is indeed predictive ... if you can predict accurately (next words), you can understand higher more abstract things

What do you mean with "understand"? And why are you calling "knowledge" that which is predictive?

> What do you mean with "understand"?

By "understand" I mean the knowledge the agent is missing in order to control the environment toward achieving its goal. Reinforcement learning is concerned with this sort of interaction between an agent (a decision maker) and an (unknown) environment. The (only) goal of the agent is to maximize its cumulative sum of rewards (cf. the reward hypothesis).
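As a small illustration of "cumulative sum of rewards", here is a sketch of how the quantity is usually computed from a finite list of rewards. The function name `discounted_return` and the optional discount `gamma` are assumptions added for the example; with `gamma=1.0` it is just the plain sum.

```python
def discounted_return(rewards, gamma=1.0):
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for a finite episode."""
    g = 0.0
    # Work backwards so each step is a single multiply-and-add.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


print(discounted_return([1, 1, 1]))       # 3.0
print(discounted_return([1, 0, 2], 0.5))  # 1.5
```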

> And why are you calling "knowledge" that which is predictive?

No, I do not think I am saying that, but if it comes across like that, let me be more precise.

I mean that most (but not all) world knowledge is predictive. An example of non-predictive knowledge is factual knowledge, like mathematics. Knowledge being predictive is important for an autonomous decision maker because the agent can verify it on its own (without a teacher, unlike in the framework of supervised learning). One crucial thing to understand about the framework of reinforcement learning is that it makes the agent solely responsible for how it behaves. As it should be. An effective way to do that scalably (without relying on some oracle teacher or anything else) is for the agent to be able to verify any knowledge it wants to acquire in order to achieve its goals.