by jah242 1528 days ago

There might be some truth in what you say for very large image and language models that use supervised learning.

It is really hard to see how this 'it's just a lot of good data' view applies to deep reinforcement learning, where the model learns multi-step policies from raw input data (e.g. a camera on a robot) with only a rough, high-level reward function to guide it.

If therefore (as seems to be the case) you can abstract the information humans need to provide to the model/learning system to ever higher levels of reward function (and thereby vastly reduce the information provided by humans), then it seems very hard to argue that the model (and the training process) isn't doing, to some degree, what you describe as:

'incredible amounts of experimental work to carve-the-world along its joints, ie., to have the right concepts; and incredible amounts of work to measure along its joints, ie., to have the right units. And then to eliminate all the coincidences and irrelevances.'

For example, imagine a robot learning from scratch to pick objects up based on raw pixel data with only a scalar reward function - where in this process is the human preparing the data so the model only has to average?

mjburgess 1528 days ago

> For example, imagine a robot learning from scratch to pick objects up based on raw pixel data with only a scalar reward function - where in this process is the human preparing the data so the model only has to average?

Great -- so do you have an example of such a system?

I'd be inclined, initially, to deny that it exists. If your reward function expresses a reward for the goal of "picking up objects with (pixel-space) properties etc.", you're cheating. In this case, the reward function serves the role of the data: i.e., it is prepared by us to work. Indeed, a function is just a dataset -- and the reward function here is being sampled by the system.

You'd need to show me a system whose reward function / dataset didn't "contain the solution", in the manner of animals who respond to the world without already having all the information about it.

The relevant capacity a system needs to have, in both cases, is being able to take a profoundly ambiguous environment and produce a dataset/reward-fn which "carves along its joints". I.e., one which effectively eliminates that ambiguity.

When such ambiguity & coincidence is eliminated, there's basically nothing left to do -- it's that basic nothing which we task machines with doing. I.e., running `mean(sample(unambiguous relevant well-carved data))`.

You'll note it's the *properties* of the data which express intelligence & learning.
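
A minimal sketch of the `mean(sample(...))` picture above, assuming a toy setting in which a human has already sorted every measurement under the right concept and recorded it in the right unit. The dictionary `well_carved`, the function `estimate`, and all the numbers are hypothetical illustrations, not taken from any real system.

```python
import random
from statistics import mean

# Hypothetical, hand-prepared data: a human has already decided which concept
# each measurement belongs to and which unit it is recorded in (g/cm^3).
well_carved = {
    "salt": [2.15, 2.17, 2.16, 2.18, 2.16],
    "sugar": [1.58, 1.59, 1.60, 1.57, 1.59],
}

def estimate(concept, k=3, seed=0):
    """Average k randomly sampled measurements of one pre-labelled concept."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return mean(rng.sample(well_carved[concept], k))

salt_density = estimate("salt")
sugar_density = estimate("sugar")
```

In this sketch all of the work of choosing concepts and units happened before `estimate` was ever called; the function itself only samples and averages.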

xfs 1528 days ago

Plenty of RL systems learn to play video games just fine without fine-tuned rewards, but I see this line of thought isn't actually what you're getting at.

I would assume serious ML people would not be overly ambitious and overstep their claims beyond empirical realms. You were saying ML "uncovers latent representational structure not present in the data", but I would guess the claim, if that is what you're going against, is merely that the latent structures exist, and no Truth is really "uncovered" by ML per se, in the Heideggerian sense.

I agree ML hasn't really produced an Understanding of the world. The carving along the joints is, in other words, a symbolic abstraction of the world that is a radical simplification, of which only Reason is capable, and ML hasn't been shown to be capable of Reason. As an aside, I also would not assume the ambiguity you refer to can be fully eliminated even by human intelligence; just see how languages are full of ambiguity, or even quantum mechanics.

But again, when philosophical critiques are launched against ML, the usual story is ML advocates would retreat to the success of ML in the empirical realms. I'm reminded of the Norvig vs Chomsky debate by this.

mjburgess 1528 days ago

I think this debate has historically suffered from being conducted purely philosophically. Heidegger, Dreyfus (Merleau-Ponty et al.) needed a bit more science and mathematics to see through the show.

All we need to do to make the Heideggerian point is ask the RL researcher what his reward function is. Have him write it out, and note that it's a disjunction of properties which already carve the environment of the robot.

In other words, the failure of AI is far less of a mystery than philosophy alone seems to imply. It's a failure in a very, very simple sense if one just asks the right technical questions.

For RL, all we need ask is, "what will the machine do when it encounters an object outside of your pregiven disjunction?"

The answer, of course, is fall over.

Hardly what we fear when the wolf learns our movements, or what we love when a person shows us how to play a piano for the first time. The very thing we want, and we are told we have, isn't there... and it's not "not there" philosophically... it's not there in the actual journal paper.
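
A toy sketch of what "a disjunction of properties" could look like when written out, assuming a hypothetical pick-up task. The class `Obj`, the two predicates, and `reward` are invented for illustration. The sketch only shows the structure of the argument and says nothing about how any published RL system actually defines its reward.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Obj:
    shape: str
    colour: str
    lifted: bool

# Hypothetical designer-written cases: each predicate already names a
# category of object that the designer chose in advance.
def lifted_red_cube(o: Obj) -> bool:
    return o.lifted and o.shape == "cube" and o.colour == "red"

def lifted_blue_ball(o: Obj) -> bool:
    return o.lifted and o.shape == "ball" and o.colour == "blue"

REWARDED_CASES = (lifted_red_cube, lifted_blue_ball)

def reward(o: Obj) -> float:
    """Return 1.0 if any pre-given case matches, otherwise 0.0."""
    return 1.0 if any(case(o) for case in REWARDED_CASES) else 0.0

known = reward(Obj("cube", "red", True))      # inside the disjunction
novel = reward(Obj("teapot", "green", True))  # lifted, but not a listed case
```

Here `known` is rewarded and `novel` is not, even though both objects were lifted: the reward carries no signal for anything outside the listed cases.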

xfs 1528 days ago

The Heideggerian point is a start, but I don't think it's enough to just point out a failure like this. This allegation is a "the answer is already encoded in the question" kind of trick, similar to the one played in Foucault's episteme, where science itself is always-already a social construction without which it could not happen.

The trick is challenging at first sight, but it won't go very far, because it just tells us what ML lacks; it doesn't tell us what ML can have or how to get there. We need a new kind of Turing test that actually reflects the power of human intellect.

mjburgess 1528 days ago

I suspect even thinking there's a "test" has it wrong.

Yes, there's an experimental test -- as in, testing to see if salt is salt. But I don't think there's a formal test... as soon as you specify it, you've eliminated the need for intelligence. Intelligence is in that process of specification.

In other words, we should be able to ask the machine "what do you think of Woody Allen's films?" and, rather than just taking any answer, we need an empirical test to see if the machine has actually understood the question. Not a formal test.

There is no doubt a sequence of replies which will convince a person that the machine has understood the question: just record them, and play them back.

We're not interested in the replies. We're interested in whether the machine is actually thinking about the world. Is it evaluating the films? What are its views? What if I show it a bad film and say it wasn't by Woody Allen? What then?

There's something wrong in seeing this as a formal, rather than experimental, process. For any given machine we will need specific hypothesis tests as to its "intelligence", and we will need to treat it like any other empirical system.

xfs 1528 days ago

OK, maybe "Turing test" was a bad hint because too often its extension turns into a philosophical rabbit hole of defining intelligence.

I want to get back to your initial statement about uncovering and structures, which I think is still grounded in the empirical realm. I think a less ambitious new test could be about the "uncovering" between analog data and the structures. To be real uncovering, the structures must be symbolic, not just transformed analog representation, and the symbolic structures must be useful, e.g. provide radical reduction of computational complexity compared to equivalent computation with analog data.

The point is to test if the machine can make the right abstraction (real uncovering) and also connect the abstraction with the data, not just games with words.

mannykannot 1528 days ago

> For RL, all we need ask is, "what will the machine do when it encounters an object outside of your pregiven disjunction?"

> The answer, of course, is fall over.

There is no reason to think humans are qualitatively different in this regard; it is just that it does not happen very often. One case where it does is that human pilots, no matter how competent, are incapable of flying without external visual references or instrument proxies for them.

mistrial9 1528 days ago

> to play video games

If I am following here, a key part of this argument is that models only represent things "in bounds" of the model, and that unsupervised, iterative approaches are especially susceptible to this. Video games are enormously constrained, artificial model environments, and therefore by definition are completely discoverable.

Meanwhile, human cognition and the real, actual world have vast and subtle detail, and are not completely knowable at any level, minus some physics or similar. Tons of possible data sets are not necessarily discoverable or constrained, yet humans can investigate and draw conclusions, sometimes in very non-obvious ways.

Falling back to pure philosophy, personally I am heavily on the side of the human, and in the wake of Kurt Gödel, believe that plenty of formal systems are never complete, nor can they be shown to be complete or incomplete.

jah242 1528 days ago

This would be one example from DeepMind using raw pixel input to stack objects. This has a relatively detailed reward function (but is also a very complicated task) - https://arxiv.org/abs/2110.06192

There are other examples from OpenAI a while back using even just sparse rewards (i.e. binary 1, 0 for success or failure over the whole task) - but these weren't pixel input, if I remember correctly - https://openai.com/blog/ingredients-for-robotics-research/
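
To make the contrast between a sparse binary reward and a more detailed one concrete, here is a small sketch. It assumes a hypothetical task of placing a block at a target position; `sparse_reward`, `shaped_reward`, the tolerance, and the coordinates are all made up for illustration and are not the reward functions used in either linked work.

```python
import math

def sparse_reward(block_pos, target_pos, tol=0.02):
    """Binary reward: 1.0 only if the block ends within tol of the target."""
    return 1.0 if math.dist(block_pos, target_pos) <= tol else 0.0

def shaped_reward(block_pos, target_pos):
    """Dense reward: negative distance, so closer placements score higher."""
    return -math.dist(block_pos, target_pos)

target = (0.5, 0.5, 0.1)
near_miss = (0.5, 0.5, 0.2)  # 10 cm above the target
far_miss = (0.0, 0.0, 0.0)   # nowhere near the target
```

Under the sparse version both misses score 0.0, so the learner gets no hint about which attempt was closer. The shaped version ranks `near_miss` above `far_miss`, which means the human has supplied more information.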

I'm afraid if you think providing any reward function is cheating then we have fundamentally different views of what AI/ML even means/involves. It appears humans and likely all animals have largely pre-programmed reward functions developed over billions of years of evolution (pain is bad, food is good, etc.). These reward functions are ultimately what underpin what we are trying to do, what outcomes are good/bad, and to what degree we 'want' to explore vs exploit. The idea that human and animal 'intelligence' is born as a blank slate with nothing to guide it and no reward function to maximise doesn't seem to bear any resemblance to reality.

The only difference between a reward function that tells a robot 'you need to stack these objects but I'm not going to tell you where in 3D space the objects are or where they need to go to be stacked or the shapes/forces involved' and an animal that is born with a reward function that says 'you need to find food and shelter but I'm not going to tell you how to collect the food or where to find shelter' is the level of abstraction. Fundamentally they appear the same.

You are pulling a sleight of hand when you suggest 'in the manner of animals who respond to the world without already having all the information about it' - there is a vast difference between an abstract reward function (which humans and animals also have) and 'having all the information about [the world]'.

moyix 1528 days ago

I realize it's a much more cleanly defined domain, but how does this interpretation explain MuZero?

sdenton4 1528 days ago

AlphaGo comes to mind...
