by hackinthebochs
1164 days ago
Let me alter your example a bit: we have P(A|B), we want P(A|B,B->A). But given enough examples of the form P(A|B), a good algorithm can deduce B->A and use it going forward to predict A. How? By searching over the space of explanatory models to find the model that helps to predict P(A|B) in the right cases and not in the wrong cases. LLMs do this with self-attention, by taking every pair of symbols in the context window and testing whether each pair is useful in determining the next token. As the attention matrix converges, the model can leverage the presence of "Raining & Outside" in predicting "ShoesWet". Of course, this is a rather poor excuse for an explanation. The fact that "outside" and "raining" are close doesn't explain why "my shoes are wet". But it does get us closer to a genuine explanation in the sense that it eliminates a class of wrong possibilities from consideration: every sentence that doesn't have outside in proximity to raining downranks the generation "my shoes are wet". The model is further improved by adding more inductive relationships of this sort. For example, the presence of an expanded umbrella downranks ShoesWet, the presence of "stepped in puddle" upranks it. Construct about a billion of these kinds of inductive relationships, and you end up with something analogous to an explanatory model. The structural relationships encoded in the many attention matrices in modern LLMs in aggregate entail the explanatory relationships needed for causal modelling.
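To make the upranking and downranking concrete, here is a minimal sketch that estimates conditional probabilities by plain counting. It is not how self-attention works; it only illustrates how adding conditions shifts the estimate for ShoesWet. The toy corpus, the feature names "Umbrella", "SteppedInPuddle" and "Inside", and the helper `cond_prob` are all made up for illustration.

```python
from fractions import Fraction

# Hypothetical toy corpus: each record is the set of features present in one sentence.
corpus = [
    {"Raining", "Outside", "ShoesWet"},
    {"Raining", "Outside", "ShoesWet"},
    {"Raining", "Outside", "Umbrella"},
    {"Raining", "Inside"},
    {"Outside", "SteppedInPuddle", "ShoesWet"},
    {"Outside"},
    {"Inside"},
    {"Raining", "Outside", "Umbrella", "SteppedInPuddle", "ShoesWet"},
]


def cond_prob(target, given):
    """Estimate P(target | given) as a relative frequency over the corpus.

    `given` is a set of features; a record matches if it contains all of them.
    Returns None when no record matches, since the estimate is undefined.
    """
    matching = [rec for rec in corpus if given <= rec]
    if not matching:
        return None
    hits = sum(1 for rec in matching if target in rec)
    return Fraction(hits, len(matching))


baseline = cond_prob("ShoesWet", set())
rain_outside = cond_prob("ShoesWet", {"Raining", "Outside"})
with_umbrella = cond_prob("ShoesWet", {"Raining", "Outside", "Umbrella"})
with_puddle = cond_prob("ShoesWet", {"Outside", "SteppedInPuddle"})

print(baseline, rain_outside, with_umbrella, with_puddle)
```

In this made-up corpus, "Raining & Outside" raises the estimate above the baseline, the umbrella lowers it again, and the puddle raises it. Whether stacking such conditions amounts to an explanatory model is the point under dispute.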
But the machine doesn't know which are the right cases. We aren't presuming there's a column, Z = 1 for B -> A, and Z = 0 otherwise -- right?
The machine has no mechanism to distinguish these cases.
> testing whether each pair is useful in determining the next token
This isn't causation.
> every sentence that doesn't have outside in proximity to raining downranks the generation
That only works so long as the sequential structure of sentences corresponds to the causal structure of the world, but that's kinda insane, right?
We haven't rigged human language so that the distribution of tokens is the causal structure of the world. The reason text generated by LLMs appears meaningful is that we understand it. The actual structure of the generated text isn't "via" a model of the world.
(Consider, for example, training an LLM on a dead untranslated language -- its output is incomprehensible, and its weights are arbitrarily correlated with anything we care to choose.)
Nevertheless, given our choice of token, you do have a model which says:
That's true. But we're choosing these additional conjunctions because we already know the causal model; these conjunctions are how we're eliminating confounders to get an approximation close to the actual value. (Which you'll never get: the actual value is `1`. If B -> A, then P(A|B->A) = 1 -- this is a deductive inference necessary for ordinary science to take place.)
In any case, P(A | B -> A) means without any confounders. To actually find the LLM's approximation of this we'd need to compute:
And then find P(A|B & C') s.t. C' made P(A|B) maximally likely. If you find a set of {C} s.t. P(A|B) has a high probability, you won't have found causal conditions.
All that statistical association models here is, at best, salience -- not causal relevance.
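A small worked sketch of that gap, under assumptions that are mine rather than anything established above: a hidden common cause C drives both B and A, and B has no effect on A at all. The probabilities are invented, and `observational` and `interventional` are hypothetical helper names. Conditioning on B still makes A look likely; forcing B does nothing.

```python
from fractions import Fraction
from itertools import product

# Assumed toy model: C -> B and C -> A, with no edge from B to A.
p_c = {1: Fraction(1, 2), 0: Fraction(1, 2)}            # P(C=c)
p_b_given_c = {1: Fraction(9, 10), 0: Fraction(1, 10)}  # P(B=1 | C=c)
p_a_given_c = {1: Fraction(9, 10), 0: Fraction(1, 10)}  # P(A=1 | C=c); A ignores B


def bern(p_one, value):
    """Probability of `value` (0 or 1) for a Bernoulli variable with P(1) = p_one."""
    return p_one if value == 1 else 1 - p_one


def joint(a, b, c):
    """P(A=a, B=b, C=c) under the assumed model."""
    return p_c[c] * bern(p_b_given_c[c], b) * bern(p_a_given_c[c], a)


def observational(a, b):
    """P(A=a | B=b): what counting co-occurrences would estimate."""
    num = sum(joint(a, b, c) for c in (0, 1))
    den = sum(joint(x, b, c) for x, c in product((0, 1), repeat=2))
    return num / den


def interventional(a, b):
    """P(A=a | do(B=b)): B is set by hand, so the C -> B link is cut.

    A does not depend on B in this model, so `b` is deliberately unused.
    """
    return sum(p_c[c] * bern(p_a_given_c[c], a) for c in (0, 1))


print(observational(1, 1), observational(1, 0))    # association: B "predicts" A
print(interventional(1, 1), interventional(1, 0))  # intervention: B changes nothing
```

Here P(A|B) is well above P(A|not B) even though B is causally irrelevant to A. Picking conditions that push P(A|B) up selects for this kind of association just as readily as for a real cause.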