Hacker News
by ziofill 469 days ago
Is anyone aware of a formalization of the idea that to get “symbols” out of fuzzy probability distributions one needs distributions whose value goes exactly to zero over some regions of the domain? I.e. Gaussian mixtures won’t cut it. And they will need very high Fourier frequencies.

I have the gut feeling that as long as a model allows even a small probability that 2x3 is 7, there will always be hallucinations. Probabilities need to be clamped to zero to emulate symbolic behaviour.
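To make the "small probability that 2x3 is 7" concrete, here is a minimal sketch assuming the model's output layer is a plain softmax over a few candidate answers. The logit values and the three-answer vocabulary are made up for illustration. With finite logits every candidate gets a strictly positive probability (at least until floating-point underflow kicks in for very large gaps); an exact zero only appears if a logit is set to negative infinity.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits.

    A logit of -inf maps to probability exactly 0.0. At least one
    logit must be finite, otherwise the shift below is undefined.
    """
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the candidate answers 5, 6, 7 to "2x3 = ?".
answers = [5, 6, 7]
probs = softmax([1.0, 9.0, 2.0])

# Same model, but the wrong answers are masked out by hand.
masked = softmax([-math.inf, 9.0, -math.inf])

for a, p, q in zip(answers, probs, masked):
    print(f"answer {a}: softmax p = {p:.6f}, masked p = {q}")
```

The unmasked distribution puts almost all its mass on 6 but still leaves a nonzero chance of 5 or 7, which is the behaviour the post is worried about.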

2 comments

Symbolic behavior is artificial, and it's not how humans think either. 0 is not a probability (neither is 1) - a value of 0 or 1 basically breaks calculations by dragging everything along to the limit, the same way infinity or a 0 in the denominator does. In fact, infinities are what 0 and 1 translate to if you switch to log-odds (and the logprob of probability 0 is negative infinity too) or other equivalent ways to calculate probabilities.

Consider: if you clamp the probability distribution of answers to 2x3, so that it's 0 everywhere else and 1 at 6, you're basically saying that it is fundamentally impossible for you to misunderstand the question or make a mistake in the answer, or that you're dreaming, or hallucinating, or have momentarily forgotten that the question was preceded by "In base 4, what is ", or any number of other things that absolutely are possible, even if highly unlikely, in the real world.
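The "0 and 1 break calculations" point can be checked numerically. The sketch below is only an illustration: `log_odds` and `bayes_update` are hypothetical helper names, and the likelihood values are arbitrary. It shows that 0 and 1 sit at minus and plus infinity on the log-odds scale, and that a prior of exactly 0 or 1 cannot be moved by Bayes' rule no matter how strong the evidence is, while a tiny nonzero prior can.

```python
import math

def log_odds(p):
    """Log-odds of a probability; 0 and 1 map to -inf and +inf."""
    if p == 0.0:
        return -math.inf
    if p == 1.0:
        return math.inf
    return math.log(p / (1.0 - p))

def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior P(H | E) from the prior P(H) and the two likelihoods of E."""
    numerator = prior * likelihood_if_true
    denominator = numerator + (1.0 - prior) * likelihood_if_false
    return numerator / denominator

# Evidence that is 99x more likely if the hypothesis is true.
stuck_at_zero = bayes_update(0.0, 0.99, 0.01)
moved = bayes_update(1e-6, 0.99, 0.01)

# Evidence that is 99x more likely if the hypothesis is false.
stuck_at_one = bayes_update(1.0, 0.01, 0.99)

print(log_odds(0.0), log_odds(0.5), log_odds(1.0))
print(stuck_at_zero, moved, stuck_at_one)
```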

We do clamp probabilities to zero. Look into top-p sampling, also known as nucleus sampling.
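For reference, a minimal sketch of what top-p filtering does to a distribution. This is a plain-Python illustration, not any particular library's implementation; `top_p_filter` is a hypothetical name and the probabilities are invented. Tokens are sorted by probability, the smallest set whose cumulative mass reaches `p` is kept, everything else is set to exactly zero, and the survivors are renormalized. Note that the clamping happens at sampling time, on top of whatever distribution the model produced.

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest high-probability set with total mass >= p.

    Returns a new distribution in which every token outside that set
    has probability exactly 0.0 and the kept tokens are renormalized.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered

# Invented distribution over four candidate tokens.
original = [0.05, 0.80, 0.10, 0.05]
filtered = top_p_filter(original, p=0.85)
print(filtered)
```

With these numbers the two low-probability tokens at the ends are dropped to zero and can never be sampled, while the two most likely tokens share the full mass.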