by nicklecompte
782 days ago
It's because floating-point arithmetic isn't deterministic in practice: addition isn't associative, so the result depends on the order in which parallel hardware happens to accumulate the terms. That becomes salient when (speaking loosely) the difference between the likelihoods of two tokens is less than the precision of the FPU. I am not sure to what extent this effect has been quantified.
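To make that concrete, here is a toy Python sketch, not a measurement of any real model. The names `accumulate` and `greedy_pick`, the tokens "A" and "B", and the numbers are all made up for illustration, and the magnitudes are wildly exaggerated so the effect is visible; on real hardware the discrepancy would typically be in the last few bits. It sums the same three terms in two different orders and shows that a greedy pick between two tokens can flip as a result.

```python
def accumulate(terms, order):
    """Sum `terms` sequentially, visiting them in the given index order."""
    total = 0.0
    for i in order:
        total += terms[i]
    return total


def greedy_pick(scores):
    """Return the token with the highest score (greedy decoding)."""
    return max(scores, key=scores.get)


# Hypothetical contributions to token A's score. Mathematically they sum to 1.0.
terms = [1e16, 1.0, -1e16]

# "Run 1": the 1.0 is added to 1e16 first and is lost to rounding.
score_a_run1 = accumulate(terms, [0, 1, 2])

# "Run 2": the large terms cancel first, so the 1.0 survives.
score_a_run2 = accumulate(terms, [0, 2, 1])

# Hypothetical competing token whose score sits between the two results.
score_b = 0.5

pick_run1 = greedy_pick({"A": score_a_run1, "B": score_b})
pick_run2 = greedy_pick({"A": score_a_run2, "B": score_b})

print(score_a_run1, score_a_run2)  # same terms, different sums
print(pick_run1, pick_run2)        # different token chosen in each "run"
```

Nothing here is random; the only thing that changes between the two "runs" is the summation order, which is the part a GPU or multithreaded reduction doesn't necessarily fix from one run to the next.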