Hacker News
by gog-ma-gog 2342 days ago
The subtlety here is that NNs do have a model, but it’s hard to see. Not just any neural network can perform as well as GPT-2; only a very specific architecture can. That architecture, coupled with the data it’s trained on, implicitly represents a model, but the model is wildly obscured by the details of the architecture.

In this sense, people like Sutskever think that GPT-2 is a step on the path towards discovering the “correct” model.

It’s probably difficult to make much more progress without being extremely crisp about what you mean by a “model”, though, because I feel like it’s just as easy to move the goalposts about what it means to “understand” as about what it means to “model”.

For example, replace every instance of “a model” in your post with “an understanding”, and it parses almost identically.

1 comment

I don’t understand your last point, but the point about it being hard to be clear about what a model means is exactly right. That’s not because it’s unclear what a model is, but because it’s unclear what the modeling language of thought is. Here’s where the algebra analogy breaks down. Pretty obviously, the model or models that we are reasoning with in this discussion aren’t simple algebraic equations, but some sort of rich representations of cognitive science and computer science concepts. And, sure, there are NNs running those models, and NNs running the reasoning over them, but they have almost nothing to do with language in the sense of the syntax of sentences. Also, we didn’t get trained on eleventy zillion examples of AI discussions in order to form the models we are employing at this very moment.