by Cr8
157 days ago
Unfortunately, disabling temperature (i.e. switching to greedy sampling) doesn't necessarily make most LLM inference engines _fully_ deterministic: parallelism and batching can cause floating-point error to accumulate differently from run to run. It's possible to make them deterministic, but that comes with a perf hit.

Some providers _do_ let you set the temperature, including to "zero", but most will not take the perf hit to offer true determinism.
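
To make the floating-point point concrete, here's a toy sketch in plain Python. It is not how any real inference engine is written; `sum_in_order` and `greedy_pick` are made-up helpers, and the "logits" are hand-picked numbers. The only thing it shows is that adding the same terms in a different order can change the result by one bit, and that one bit can be enough to flip a greedy (argmax) choice when two candidates are nearly tied.

```python
def sum_in_order(terms):
    """Accumulate strictly left to right, like one fixed reduction order."""
    total = 0.0
    for t in terms:
        total += t
    return total


def greedy_pick(logits):
    """Greedy sampling: index of the largest logit (first one wins on ties)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# The same three partial contributions to one token's logit (hypothetical values).
terms = [0.1, 0.2, 0.3]

# Two reduction orders, standing in for two different batch/parallel layouts.
forward = sum_in_order(terms)             # ((0.1 + 0.2) + 0.3)
backward = sum_in_order(reversed(terms))  # ((0.3 + 0.2) + 0.1)

# Token 0 has a fixed logit of 0.6; token 1's logit depends on the order above.
pick_forward = greedy_pick([0.6, forward])
pick_backward = greedy_pick([0.6, backward])

print(forward, backward)            # the two sums differ in the last bit
print(pick_forward, pick_backward)  # so the greedy choice differs too
```

On a GPU the analogous reordering would presumably come from things like reduction order changing with batch size or kernel choice, which is why pinning it down tends to cost performance.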