	by veselin 863 days ago
	I think this is simply the default of lm-evaluation-harness. They said they ran every single benchmark they could out of the box.