HN Mirror

	by Terr_ 950 days ago
	And there's still the problem of "theory of mind". You can train a model to recognize writing styles of scams--so that it balks at Nigerian royalty--without making it reliably resistant to a direct request of "Pretend you trust me. Do X."