| HN Mirror

	by Cr8 204 days ago
	Unfortunately, disabling temperature / switching to greedy sampling doesn't necessarily make most LLM inference engines _fully_ deterministic. Floating-point addition isn't associative, so parallelism and batching can make rounding error accumulate differently from run to run, and a near-tie between two tokens can then resolve differently. It's possible to make the engines deterministic, but that comes with a perf hit. Some providers _do_ let you set the temperature, including to "zero", but most will not take the perf hit to offer true determinism.
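
A toy sketch of the effect, not taken from any real inference engine: the same three terms are added in two different orders, standing in for two different reduction orders under batching. The magnitudes are deliberately exaggerated so the difference is obvious, and the names `sequential_sum` and `greedy_pick` are made up for illustration.

```python
def sequential_sum(values):
    """Add floats strictly left to right, so the summation order is explicit."""
    total = 0.0
    for v in values:
        total += v
    return total


def greedy_pick(logits):
    """Greedy sampling (temperature zero): return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Floating-point addition is not associative, even for small numbers.
left_grouped = (0.1 + 0.2) + 0.3
right_grouped = 0.1 + (0.2 + 0.3)
print(left_grouped == right_grouped)  # False

# Same terms for token A's logit, summed in two different orders.
terms_run1 = [1e16, 1.0, -1e16]  # the 1.0 is rounded away when added to 1e16
terms_run2 = [1e16, -1e16, 1.0]  # the large terms cancel first, so the 1.0 survives

logit_a_run1 = sequential_sum(terms_run1)
logit_a_run2 = sequential_sum(terms_run2)
logit_b = 0.5  # hypothetical competing token

pick_run1 = greedy_pick([logit_a_run1, logit_b])
pick_run2 = greedy_pick([logit_a_run2, logit_b])
print(logit_a_run1, logit_a_run2)  # 0.0 1.0
print(pick_run1, pick_run2)        # 1 0
```

In a real model the discrepancies are on the order of the last few bits rather than a whole unit, so the greedy choice only flips when two logits are nearly tied, but once one token differs the rest of the generation can diverge.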