HN Mirror



	by abrichr 553 days ago
	https://arxiv.org/abs/2412.04984

	> Our findings demonstrate that frontier models now possess capabilities for basic in-context scheming [covertly pursuing misaligned goals], making the potential of AI agents to engage in scheming behavior a concrete rather than theoretical concern.