| HN Mirror



	by jmalicki 110 days ago
	I think the point of the paper is to prod benchmark authors to at least try to make their benchmarks a little more secure and harder to hack... Especially as AI is getting smart enough to unintentionally hack the evaluation environment itself, even when that is not the authors' intent.