by colechristensen
214 days ago
Outputs not being deterministic at temperature = 0 doesn't match my understanding of what "temperature" means; I thought T=0 was deterministic by definition. Is this perhaps an inference implementation detail somehow introducing randomness?
https://news.ycombinator.com/item?id=45200925
https://thinkingmachines.ai/blog/defeating-nondeterminism-in...
> As it turns out, our request’s output does depend on the parallel user requests. Not because we’re somehow leaking information across batches — instead, it’s because our forward pass lacks “batch invariance”, causing our request’s output to depend on the batch size of our forward pass.
tl;dr: the way inference is batched introduces non-determinism.
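To make that concrete, here is a toy sketch in plain Python (not the code from the linked post). It relies on the fact that floating-point addition is not associative, so summing the same numbers in a different grouping can give a different result. The names `chunked_sum` and `greedy_pick` are made up for illustration, and `chunk_size` is only an assumed stand-in for how a kernel might split a reduction differently depending on batch size.

```python
def chunked_sum(values, chunk_size):
    """Add values in fixed-size chunks, then add the partial sums.

    Explicit loops are used instead of the built-in sum() so the
    accumulation order is exactly what is written here.
    """
    partials = []
    for start in range(0, len(values), chunk_size):
        acc = 0.0
        for v in values[start:start + chunk_size]:
            acc += v
        partials.append(acc)
    total = 0.0
    for p in partials:
        total += p
    return total


def greedy_pick(chunk_size):
    """T=0 style greedy choice: argmax over two toy 'logits'.

    The first logit comes from a reduction whose grouping depends on
    chunk_size (hypothetical stand-in for batch-dependent kernels).
    """
    values = [1e16, 1.0, -1e16, 1.0]
    logits = [chunked_sum(values, chunk_size), 0.5]
    return max(range(len(logits)), key=lambda i: logits[i])


if __name__ == "__main__":
    values = [1e16, 1.0, -1e16, 1.0]
    print(chunked_sum(values, 4), chunked_sum(values, 2))
    print(greedy_pick(4), greedy_pick(2))
```

The argmax itself is perfectly deterministic; what changes is the numbers fed into it. Same inputs, different grouping of the additions, different pick.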