by gog-ma-gog 2339 days ago
https://nostalgebraist.tumblr.com/post/189965935059/human-ps... for an orthogonal point of view. I feel Marcus is a bit too embroiled in this particular debate to make level-headed criticism of the merits/potential of GPT-2.
1 comments

To elaborate a bit: people like Marcus tend to overload/move the goal posts on what the word “understand” means. I kinda feel like in a world where we have perfectly conversational chat bots that are capable of AI-complete tasks, if these bots look like Chinese rooms under the hood, he’ll still be complaining that they don’t “understand” anything.

I don’t think it’s unreasonable to say that if you think something that doesn’t “understand” anything can do what GPT-2 can do, then maybe your definition of “understand” doesn’t cut reality at the joints.

Understanding is not hard to understand. To understand is to reason from a model. Reasoning from a model is easy. Discovering the correct model is hard, analogous to the way that algebraic rules are easy, but finding the right equation for a particular problem is hard. Data-trained NNs neither have a model nor reason. QED
You could say that a trained neural net contains a model of how language works, and it reasons about sentences based on this model.

I think people are really hung up on the fact that it has trouble reasoning about what its sentences are reasoning about, and are skipping over how amazing it is at reasoning about sentence structure itself.

Yes, but people don’t reason about language, they just do it. I know you think I’m confused about this but I’m not. I mean reason here quite explicitly because what we’re talking about is understanding. No one thinks that they ... uh, well ... “understand” language ... okay, we need a new word here because “understand” has two different meanings here. Let’s use “perform” for when you make correct choices from an inexplicit model, that’s what the NN does, and hold “understand” for what a linguist (maybe) does per language, and what a physicist does per orbital mechanics. What we are hoping a GAI will do is the latter. Any old animal can perform. Only humans, as far as we know, and perhaps a few others in relatively narrow cases, understand in the sense that a physicist understands OM. No NN trained on language is gonna have the present argument. Ever.
The subtlety here is that NNs do have a model, but it’s hard to see. Not just any neural network can perform as well as GPT-2; only a very specific architecture can. That architecture, coupled with the data it’s trained on, implicitly represents a model, but it’s wildly obscured by the details of the architecture.

In this sense, people like Sutskever think that GPT-2 is a step on the path towards discovering the “correct” model.

It’s probably difficult to make much more progress without making extremely crisp what you mean by a “model”, though, because I feel like it’s just as easy to move the goal posts about what it means to “model” as about what it means to “understand”.

For example, replace every instance of “a model” in your post with “an understanding”, and it parses almost identically.

I don’t understand your last point, but the point about it being hard to be clear about what a model means is exactly right. But it’s not because it’s not clear what a model is, but rather because it’s not clear what the modeling language of thought is. Here’s where the algebra analogy breaks down. Pretty obviously, the model or models that we are reasoning with in this discussion aren’t simple algebraic equations, but some sort of rich representations of cognitive science and computer science concepts. And, sure, there are NNs running those models, and NNs running the reasoning over them, but they have almost nothing to do with language in the sense of the syntax of sentences. Also, we didn’t get trained with eleventy zillion examples of AI discussions in order to form the models we are employing at this very moment.