Hacker News
by mjburgess 1095 days ago
I think this is easy, just make Xp sentences of the kind = "I define `randomchars()` to be this `term-in-Xc()`" and swamp the dataset with Xc.

Everything here actually just follows formally from what NNs are: they're just empirical function approximations.

It will always be the case that they just model the probabilistic structure of the dataset and not the data generating process.

Since language has discrete constraints that force P(...) = 1 or P(...) = 0, you can trivially produce datasets showing that the model learns P(...) = mistake-you-created-deliberately rather than either 0 or 1.
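
To make that concrete, here is a rough sketch. It uses a plain frequency-count estimator as a stand-in for a trained NN (an assumption for illustration, not an actual LLM), and the toy dataset, the 30% mistake rate, and the `fit_conditional` helper are all made up:

```python
from collections import Counter


def fit_conditional(pairs):
    """Estimate P(answer | prompt) by relative frequency in the dataset."""
    counts = {}
    for prompt, answer in pairs:
        counts.setdefault(prompt, Counter())[answer] += 1
    return {
        prompt: {a: n / sum(c.values()) for a, n in c.items()}
        for prompt, c in counts.items()
    }


# Hypothetical toy dataset: "2 + 2 =" -> "4" is a hard constraint (true P = 1),
# but a wrong completion is deliberately injected 30% of the time.
dataset = [("2 + 2 =", "4")] * 70 + [("2 + 2 =", "5")] * 30

model = fit_conditional(dataset)
print(model["2 + 2 ="])
```

The estimator reproduces the injected frequencies of the dataset instead of the 0/1 constraint, which is the point being argued, at least for this kind of purely empirical fit.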

As above, the LLM switches from 95% confidence "chocolate" to 95% confidence "popcorn" with a trivial non-semantic permutation of the prompt.

The obscene issue in all this is that we know this already -- empirical function approximation of historical datasets just produces associative probabilistic models of those datasets.

1 comment

> I think this is easy, just make Xp sentences of the kind = "I define `randomchars()` to be this `term-in-Xc()`" and

`randomchars()` does not match your own requirement `but not that the tokens of Xp are themselves rare` and therefore is unsuitable.

Good point --- so replace it with a `sample()` function that selects from an appropriate distribution over the data.
Now you have a strong statistical dependency between Xc and Xp, the lack of which was required for your proof that the algorithm is unable to learn Xp. BTW, it was already there, because you already had `term-in-Xc()`.