by orbital-decay
407 days ago
The raw output of a transformer model is a list of logits: confidence scores for each token in its vocabulary. It is deterministic only in this sense (same input = same scores). But it can easily assign equal scores to 1 and 0 and zero to every other token, and then you have to sample randomly to produce a result. Whether you consider the sampling external or internal doesn't matter: transformers are inherently probabilistic by design. Randomness is all they produce. And typically they aren't trained with temperature 0 and greedy sampling in mind.
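To make the hypothetical concrete, here is a minimal sketch. The four-token vocabulary and the hand-picked logits are assumptions for illustration, not output from a real model. Turning logits into probabilities is a pure function; only the final draw touches a random source.

```python
import math
import random

# Hypothetical toy vocabulary and logits, chosen by hand for illustration.
VOCAB = ["0", "1", "a", "b"]
LOGITS = [5.0, 5.0, -50.0, -50.0]  # equal scores for "0" and "1", very low for the rest

def softmax(logits):
    """Convert logits to probabilities (subtracting the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(probs):
    """Temperature-0 style decoding: take the highest-probability token.
    Ties are broken by position, so the first tied token always wins."""
    return VOCAB[probs.index(max(probs))]

def sample(probs, rng):
    """Draw one token from the distribution using the supplied random source."""
    return rng.choices(VOCAB, weights=probs, k=1)[0]

probs = softmax(LOGITS)
print(probs)                             # "0" and "1" each get about half the mass
print(greedy(probs))                     # the tie is resolved by index, not by chance
print(sample(probs, random.Random(0)))   # either "0" or "1", depending on the seed
```

Note that greedy decoding on an exact tie is still deterministic, just arbitrary: which token wins depends on vocabulary order rather than on anything the model expressed.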
|
The transformer operates on the probability functions in a fully deterministic fashion; you might be missing the forest for the trees here. In your hypothetical, the transformer has no non-deterministic way of selecting the 1 or 0 token, so it relies on a noise source that does. The transformer itself does not produce any randomness at all.
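One way to see the distinction is to write the selection step with the noise as an explicit input. This is a sketch under assumptions: `pick` and `PROBS` are hypothetical names, and inverse-CDF selection is just one common way to sample, not necessarily what any particular inference stack does.

```python
import random

def pick(probs, u):
    """Inverse-CDF selection: a deterministic function of (probs, u).
    u is a number in [0, 1) supplied by the caller."""
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return index
    return len(probs) - 1  # guard against floating-point rounding when u is near 1

# Hypothetical distribution matching the 1-or-0 case: two tied tokens, nothing else.
PROBS = [0.5, 0.5, 0.0, 0.0]

# Same probabilities and same u always select the same token.
print(pick(PROBS, 0.25), pick(PROBS, 0.25))

# Variation appears only when u comes from a noise source outside the function.
rng = random.Random(42)  # seeded here so the run is repeatable
draws = [pick(PROBS, rng.random()) for _ in range(1000)]
print(draws.count(0), draws.count(1))
```

Everything up to and including `pick` is reproducible; swap the seeded generator for a hardware entropy source and the only thing that changes is where `u` comes from.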