HN Mirror



	by aero142 504 days ago
	Are there any successful models that weren't trained with RLHF, or that don't use a system with RLHF? I'm curious if this could be done without a fine-tuning step that wouldn't meaningfully bias this.