	by jondwillis 350 days ago
	We’re already steering, both during training (e.g. RLHF for reasoning) and at test time (structured outputs, tool calls, agents…)