| HN Mirror

	by sfink 113 days ago
	My guess? Require them to not do the reinforcement learning on a custom model that implements guardrails. I think Anthropic has some of this built in already and couldn't alter it without retraining, but there's tons more layered on top.