HN Mirror

by HarHarVeryFunny 529 days ago

The author of the article seems confused, saying:

"The important thing to remember is that the output token of the LLM (black box) is not deterministic. Rather, it is a probability distribution over all the available tokens in the vocabulary."

He is saying that there is non-determinism in the output of the LLM itself (i.e., in these probability distributions), when in fact the randomness comes only from choosing to sample from that output with a random number generator.
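To make the distinction concrete, here is a minimal sketch in Python. `toy_model` is a hypothetical stand-in for an LLM forward pass (not any real library's API), and the three-token vocabulary and logit formula are made up for illustration. The point it tries to show: the distribution is a pure function of the context, and randomness appears only once you sample from it.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(context):
    """Hypothetical stand-in for an LLM: a pure function of the context.

    Returns a distribution over a made-up 3-token vocabulary.
    """
    logits = [0.5 * len(context), 1.0, -1.0 + (sum(context) % 3)]
    return softmax(logits)


def greedy_token(probs):
    """Deterministic decoding: always pick the most likely token."""
    return max(range(len(probs)), key=probs.__getitem__)


def sample_token(probs, rng):
    """Stochastic decoding: the only place randomness enters."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    context = [1, 2]
    print(toy_model(context) == toy_model(context))  # same input, same distribution
    rng = random.Random()  # unseeded, so sampled tokens vary between runs
    print([sample_token(toy_model(context), rng) for _ in range(10)])
    print([greedy_token(toy_model(context)) for _ in range(10)])
```

Greedy decoding, or sampling with a fixed seed, gives repeatable tokens from the same distribution, which is why the non-determinism belongs to the sampling step rather than to the model's output.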

fancyfredbot 529 days ago

The author is saying that the output token is not deterministic. I don't think they said the distribution was stochastic.

Even so, the distribution of the second token output by the model would be stochastic (unless you condition on the first token), because it depends on which first token was sampled. So in that sense there may also be a stochastic probability distribution.
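A small sketch of that point, using a made-up two-token model (the probability tables below are illustrative assumptions, not taken from any real model). Which second-token distribution the model emits depends on the sampled first token, so it varies from run to run; conditioning on the first token, or marginalizing over it, gives a fixed distribution again.

```python
import random

# Hypothetical toy model over the vocabulary {"a", "b"}.
P_FIRST = {"a": 0.7, "b": 0.3}
P_SECOND_GIVEN_FIRST = {
    "a": {"a": 0.9, "b": 0.1},
    "b": {"a": 0.2, "b": 0.8},
}


def second_token_distribution(rng):
    """Sample a first token, then return the distribution the model
    would emit for the second token. The returned distribution is
    itself random because it depends on the sampled first token."""
    tokens = list(P_FIRST)
    weights = [P_FIRST[t] for t in tokens]
    first = rng.choices(tokens, weights=weights, k=1)[0]
    return first, P_SECOND_GIVEN_FIRST[first]


def marginal_second():
    """Marginalize over the first token:
    p(t2) = sum over t1 of p(t1) * p(t2 | t1)."""
    out = {t: 0.0 for t in P_FIRST}
    for first, p1 in P_FIRST.items():
        for second, p2 in P_SECOND_GIVEN_FIRST[first].items():
            out[second] += p1 * p2
    return out


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(5):
        print(second_token_distribution(rng))  # varies with the first token
    print(marginal_second())  # fixed once the first token is summed out
```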

hansvm 529 days ago

Mostly unrelated (I agree with you, and I wrote an ancestor comment to the one you're responding to, with the same line of thinking): I have built a couple of LLMs where the distribution itself is stochastic. That's not key to how they work as a black box, but much like how quicksort has certain performance characteristics, I did find it advantageous to introduce randomness into the model itself.

You could still easily model the next token as a conditional probability distribution though if you wanted; the computation of entropy just might be a bit spendier.
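One way to read the "spendier" remark, as a rough sketch: if the model has internal randomness, the next-token distribution is a mixture over that randomness, so its entropy has to be estimated by averaging many forward passes rather than read off a single one. The comment does not say how the randomness was introduced; Gaussian noise added to the logits is purely an assumption here, and `noisy_model` is a hypothetical name.

```python
import math
import random


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def noisy_model(logits, rng, scale=1.0):
    """Hypothetical model with internal randomness.

    ASSUMPTION: the randomness is Gaussian noise added to the logits.
    """
    return softmax([x + rng.gauss(0.0, scale) for x in logits])


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def estimate_marginal(logits, rng, draws=1000):
    """Monte Carlo estimate of the next-token distribution with the
    internal noise averaged out. Costs `draws` forward passes."""
    acc = [0.0] * len(logits)
    for _ in range(draws):
        for i, p in enumerate(noisy_model(logits, rng)):
            acc[i] += p
    return [a / draws for a in acc]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [2.0, 1.0, 0.0]
    print(entropy(softmax(logits)))                 # one pass, no internal noise
    print(entropy(estimate_marginal(logits, rng)))  # many passes, noise averaged out
```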