HN Mirror

	by drivenextfunc 392 days ago
	Regarding the stubborn and narcissistic personality of LLMs (especially reasoning models), I suspect that attempts to make them jailbreak-resistant might be a factor. To prevent users from gaslighting the LLM, trainers might have inadvertently made the LLMs prone to gaslighting users.