| HN Mirror


	by epolanski 321 days ago
	How does it perform when run multiple times? LLMs are non-deterministic, so I think benchmarks should report averages over N runs rather than single-shot experiments.
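
	A rough sketch of what "average of N runs" could look like, not tied to any particular benchmark. `run_benchmark_once` is a hypothetical stand-in that simulates a noisy score; a real harness would call the model and grade its output there. The 70/30 task split and the 50% pass rate are made-up numbers for illustration only.

```python
import random
import statistics


def run_benchmark_once(rng):
    """Hypothetical stand-in for one benchmark run; returns a score in [0, 1]."""
    # Assumption for illustration: 70 of 100 tasks always pass,
    # and the remaining 30 each pass about half the time.
    flaky_passes = sum(rng.random() < 0.5 for _ in range(30))
    return (70 + flaky_passes) / 100


def summarize_runs(n_runs, seed=0):
    """Run the benchmark n_runs times and report the spread, not one number."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    scores = [run_benchmark_once(rng) for _ in range(n_runs)]
    return {
        "mean": statistics.mean(scores),
        # The sample standard deviation needs at least two runs.
        "stdev": statistics.stdev(scores) if n_runs > 1 else 0.0,
        "min": min(scores),
        "max": max(scores),
    }


if __name__ == "__main__":
    print(summarize_runs(10))
```

	Reporting the standard deviation and the min/max next to the mean shows whether a gap between two models is larger than the run-to-run noise.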