by hackinthebochs
1167 days ago
>We haven't rigged human language so that the distribution of tokens is the causal structure of the world [...] The actual structure of text generated isn't "via" a model of the world.

This is an odd claim. I certainly say that I picked my cup off the floor rather than that I picked my cup off the ceiling because gravity causes things to fall down rather than up. Human language isn't "rigged" to represent the causal structure of the world, but it does nonetheless. The distribution of tokens is such that the occurrences of (A,B) and (B,A) are asymmetric, and this is precisely because features of the world influence the distribution of words we use. A sufficiently strong model should be able to recover a model of this causal structure given enough training data.

>That's true. But we're choosing these additional conjunctions because we already know the causal model; these conjunctions are how we're eliminating confounders to get an approximation close to the actual.

But these patterns are represented in the training data by the words we use to discuss raining and wet shoes. There is every reason to think a strong model will recover this regularity.

>All that statistical association models here is, at best, salience -- not causal relevance.

That's all we can ever get from sensory experience. We infer causation because it is more explanatory than accepting a huge network of asymmetric correlations as brute. YeGoblynQueenne is right that my point is basically a version of the problem of induction. We can infer causation but we are never witness to causation. We do not build causal models, we build models of asymmetric correlations and infer causation from the success of our models. What a good statistical model does is not different in kind.

When I act on the world, with my body, I take as a given "Body -> Action". We witness causation in our every action.
> This is an odd claim
The tokens can be given any meaning. The statistical distribution of token frequencies in our languages has an infinite number of causal semantics which are consistent with it.
We can find arbitrary patterns such that
We will count only those we give a semantics to ("Rain" = Rain), and only those we already know are causal. This is the trick of humans reading the output of LLMs -- this is what makes it possible. It's essentially one big ELIZA effect.
No, the structure of language isn't the structure of the world.
This pattern in tokens,
is an associative statistical model of conditional aggregate salience between token terms.
Phrase any such conditional probability however you wish; it will never select for causal patterns.
This is why we experiment. It's why we act on the world to change it.
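A toy sketch of the point about conditional probabilities versus experiment, not anything either commenter wrote. It assumes two binary variables, rain and wet shoes, with made-up probabilities, and builds two rival models: one where rain causes wet shoes and one with the arrow reversed. All function names and numbers here are hypothetical. Both models yield exactly the same table of observed frequencies, so no conditional probability read off that table separates them; they only come apart when we force it to rain and look at the shoes.

```python
from fractions import Fraction as F

# Hypothetical numbers, chosen only for illustration.
P_RAIN = F(3, 10)
P_WET_GIVEN_RAIN = {1: F(9, 10), 0: F(1, 10)}


def joint_rain_causes_wet():
    """Observed joint P(rain, wet) under the model rain -> wet."""
    joint = {}
    for rain in (0, 1):
        p_r = P_RAIN if rain else 1 - P_RAIN
        for wet in (0, 1):
            p_w = P_WET_GIVEN_RAIN[rain] if wet else 1 - P_WET_GIVEN_RAIN[rain]
            joint[(rain, wet)] = p_r * p_w
    return joint


def joint_wet_causes_rain():
    """Rival model wet -> rain, with parameters tuned to match the same joint."""
    target = joint_rain_causes_wet()
    p_wet = target[(0, 1)] + target[(1, 1)]
    joint = {}
    for wet in (0, 1):
        p_w = p_wet if wet else 1 - p_wet
        for rain in (0, 1):
            # P(rain | wet), read off the target table via Bayes' rule.
            p_r_given_w = target[(rain, wet)] / (target[(0, wet)] + target[(1, wet)])
            joint[(rain, wet)] = p_w * p_r_given_w
    return joint


def p_wet_after_forcing_rain(model):
    """P(wet = 1) after an intervention that sets rain = 1."""
    if model == "rain->wet":
        # Wet shoes respond to rain, so forcing rain changes them.
        return P_WET_GIVEN_RAIN[1]
    # wet -> rain: forcing rain cuts the arrow into rain; wet keeps its marginal.
    joint = joint_wet_causes_rain()
    return joint[(0, 1)] + joint[(1, 1)]


if __name__ == "__main__":
    print(joint_rain_causes_wet() == joint_wet_causes_rain())
    print(p_wet_after_forcing_rain("rain->wet"))
    print(p_wet_after_forcing_rain("wet->rain"))
```

The two joints are equal by construction, since any joint distribution factorizes in either direction. Whether enough text about rain and shoes carries extra asymmetries that break this tie is exactly what the thread is arguing about; the sketch only covers the two-variable case.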
When the child burns their hand on the fireplace they do so once. Why?
Because the child immediately infers,
How? Via abduction, roughly: how? how? etc.
In other words, we bottom out our reasoning in a
Absent this, absent being in the world with a body, you cannot determine causes.
The problem of induction phrased in modern language is this: statistics isn't informative. Or, conditional probabilities are no route to knowledge. Or, AI is dumb.