by seydor
1105 days ago
So the randomness is not mandatory for the LLM to work; without it the output is just boring. This means that as a language model it still performs perfectly well at modeling language. We just give it some random saltiness for fun. I would guess the random step is not even needed to get variety: there is probably a way to replace randomness with a simple deterministic function and still get interesting text. I can't run a simulation, but there is no indication here that good randomness is needed. Fundamentally, the design of the transformer, and especially its attention-based core, does not require randomness, so calling it a stochastic model is a stretch.
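To make the point concrete, here is a rough sketch of the last step only, where a token is picked from the model's scores. The logits are made up, and `greedy_pick`, `sampled_pick`, and `cyclic_pick` are hypothetical names of mine, not any library's API. `cyclic_pick` is just one guess at what a "simple function instead of randomness" could look like.

```python
import math
import random


def softmax(logits, temperature=1.0):
    # Turn raw scores into probabilities; lower temperature sharpens them.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    # No randomness: always take the highest-scoring token.
    return max(range(len(logits)), key=lambda i: logits[i])


def sampled_pick(logits, temperature, rng):
    # The usual stochastic step: draw a token index from the softmax distribution.
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def cyclic_pick(logits, step, top_k=2):
    # Hypothetical deterministic stand-in for randomness:
    # rotate through the top_k candidates based on the position in the text.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return ranked[step % top_k]


# Made-up scores for a three-token vocabulary (an assumption, not real model output).
logits = [2.0, 1.5, 0.1]

print(greedy_pick(logits))
print([cyclic_pick(logits, step) for step in range(4)])
print(sampled_pick(logits, 1.0, random.Random(0)))
```

Everything before the pick (the forward pass that produces the logits) is the same in all three cases; only the selection rule changes. Whether something like `cyclic_pick` gives text that is actually interesting is exactly the part I can't test here.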