| >A statistical model may "capture" explanatory relations, but can it use them? That's the beauty of autoregressive training, the model is rewarded for capturing and utilizing explanatory relations because they have an outsized effect on prediction. It's the difference between frequency counting while taking the past context as an opaque unit vs decomposing the past context and leveraging relevant tokens for generation while ignoring irrelevant ones. Self-attention does this by searching over all pairs of tokens in the context window for relevant associations. Induction heads[1] are a fully worked out example of this and help explain in-context learning in LLMs. >Anyway I find it very hard to think of language models as explanatory models. They're predictive models, they are black boxes, they model language, but what do they explain? And to whom? The model encodes explanatory relationships of phenomena in the world and it uses these relationships to successfully generalize its generation out-of-distribution. Basically, these models genuinely understand some things about the world. LLMs exhibit linguistic competence as it engages with subject matter to accurately respond to unseen variations in prompts of that subject matter. At least in some cases. I argue this point in some detail here[2]. >how can we say which model we got a hold of? More sophisticated tests, ideally that can isolate exactly what was in the training data in comparison to what was generated. I think the example of the wide variety of poetry these models generate should strongly raise one's credence that they capture a sufficiently accurate model of poetry. I go into detail on this example in the link I mentioned. Aside from that, various ways of testing in-context learning can do a lot of work here[3]. [1] https://transformer-circuits.pub/2022/in-context-learning-an... [2] https://www.reddit.com/r/naturalism/comments/1236vzf/ [3] https://twitter.com/leopoldasch/status/1638848881558704129 |
That sentence should be decorated with the word "allegedly", or perhaps "conjecture"! In practical terms, I believe you are pointing out that language models of the GPT family are trained on a context surrounding, not just preceding, a predicted token. That's right (and it gets fudged in discussions about predicting the next token in a sequence), but we could already do that with skip-gram models, and with context-sensitive grammars, and dependency grammars, many years ago, and I don't remember anyone saying those were specially capable of capturing explanatory relations [1]. Although for grammars the claim could be made, since they are generally based on explanatory models of human language (but not because of context-sensitivity).
Anyway, I thought you were arguing that explanations are arbitrary, "explanatory posits", and wouldn't that mean that an explanation doesn't improve prediction? This is not to catch you in contradiction, I'm genuinely unsure about this myself. My understanding is that explanatory hypotheses improve predictions in the long run [2], but that's not to say that a predictive model will improve given explanations, rather explanatory models eventually replace strictly predictive models.
Are you saying that including explanations in training data can improve prediction? That would make sense, but this is very hard to do when training a predictive model on text. In that case, the explanations are at best hidden variables and language models are just not the right kind of model to model hidden variables.
Sorry, writing too much today. And I got work to do. So I won't bitch about "in-context learning" (what we used to call sampling from a model back in the day, three years ago before the GPT-3 paper :).
______________
[1] My Master's thesis was a bunch of language models trained on Howard Philips Lovecraft's complete works, and separately on a corpus of Magic: the Gathering cards. One of those models was a probabilistic Context-Free Grammar, and despite its context-freedom, and because it was a Definite Clause Grammar, I could sample from it with input strings like "The X in the darkness with the Y in the Z of the S" and it would dutifully fill-in the blanks with tokens that maximised the probability of the sentence. So even my puny PCFG could represent bi-directional context, after a fashion. Yet I wouldn't ever accuse it of being explanatory. Although I would say it was quite mad, given the corpus.
[2] I mention in another comment my favourite example of the theory of epicylces compared to Kepler's laws of planetary motion.