


	by anon84873628 51 days ago
	Right. Everyone is using this to judge the LLMs instead of questioning what situation they were actually fed and whether the move they made was in fact the best one available. More likely, the simulation was just very poor and the results are nonsense.