| HN Mirror


	by blululu 353 days ago
	In principle, no, it is not surprising given the points you mention. But some recent results suggest that an AI that is misaligned in one area can become misaligned in unrelated areas (https://arxiv.org/abs/2502.17424). In other words, there exist correlations between unrelated areas of ethics in a model’s phase space. Agreed that we don’t really understand LLMs that well.