by hackinthebochs
221 days ago
They are deterministic in the sense that the inference process scores every word in the vocabulary in a deterministic manner. This score map is then sampled from according to the temperature setting. Non-determinism is artificially injected for ergonomic reasons.

>But I think there’s still the question if this process is more similar to thought or a Markov chain.

It's definitely far from a Markov chain. Markov chains treat the past context as a single unit, an N-tuple that has no internal structure. The next state is indexed by this tuple. LLMs leverage the internal structure of the context, which allows a large class of generalizations that Markov chains necessarily miss.
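To make both points concrete, here is a rough toy sketch, not how any real LLM or library does it. The score map, the `sample_next` and `build_markov_table` helpers, and the tiny corpus are all made up for illustration. The first half shows that the scores stay fixed and randomness only enters at the sampling step (temperature 0 is treated here as plain argmax). The second half shows a Markov chain as a lookup table keyed by an opaque N-tuple, which has nothing to say about a context it has never seen verbatim.

```python
import math
import random
from collections import Counter, defaultdict


def sample_next(scores, temperature, rng):
    """Pick a token from a fixed score map; temperature 0 means greedy argmax."""
    if temperature == 0:
        return max(scores, key=scores.get)
    # Softmax with temperature, shifted by the max score for numerical stability.
    top = max(scores.values())
    weights = {tok: math.exp((s - top) / temperature) for tok, s in scores.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


# Hypothetical scores a model might assign to candidate next words.
# The scoring step is deterministic: same context in, same scores out.
scores = {"cat": 2.0, "dog": 1.5, "car": -1.0}

# Temperature 0: the seed is irrelevant, the top-scoring word always wins.
greedy = [sample_next(scores, 0, random.Random(i)) for i in range(5)]

# Temperature 1: the output varies, but only because of the injected randomness.
sampled = [sample_next(scores, 1.0, random.Random(i)) for i in range(5)]


def build_markov_table(words, n):
    """Order-n Markov chain: next-word counts indexed by the previous n words."""
    table = defaultdict(Counter)
    for i in range(len(words) - n):
        context = tuple(words[i:i + n])  # opaque key, no internal structure
        table[context][words[i + n]] += 1
    return table


corpus = "the cat sat on the mat".split()
table = build_markov_table(corpus, 2)

seen = table.get(("the", "cat"))    # a context that appears in the corpus
unseen = table.get(("the", "dog"))  # None: similarity to ("the", "cat") does not help
```

The Markov table can only answer for tuples it has stored exactly. A model that looks inside the context can, in principle, treat "the dog" as similar to "the cat", which is the kind of generalization the lookup table misses.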