| HN Mirror


	by caeril 583 days ago
	There's still a ton of room left in short error-correcting RLHF or fine-tuning cycles. I expect that, at some point, there will be "trusted" or "verified" accounts that can flag a completion as wrong and supply a correct one; those corrections would be collated and used (daily? hourly?) to fine-tune the weights. The active inference cluster would be rotated A/B/C, etc., as updated weights roll out.
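
A minimal sketch of the loop described above, assuming a single in-memory queue and a three-cluster rotation. Every name here (`Correction`, `CorrectionQueue`, `ClusterRotation`, `run_cycle`, and the `fine_tune` callback) is hypothetical and only illustrates the idea; it is not any vendor's actual pipeline, and the real training step is left as a placeholder.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass(frozen=True)
class Correction:
    """A flagged completion plus the replacement a reviewer supplied."""
    account: str
    prompt: str
    wrong_completion: str
    correct_completion: str


@dataclass
class CorrectionQueue:
    """Collects corrections, accepting them only from trusted accounts."""
    trusted_accounts: set
    pending: list = field(default_factory=list)

    def flag(self, correction):
        # Untrusted accounts are ignored rather than queued.
        if correction.account not in self.trusted_accounts:
            return False
        self.pending.append(correction)
        return True

    def collate(self):
        # Drain the queue into (prompt, completion) pairs.
        # If a prompt was corrected more than once, the latest correction wins.
        latest = {c.prompt: c.correct_completion for c in self.pending}
        self.pending.clear()
        return list(latest.items())


class ClusterRotation:
    """Tracks which inference cluster is live: A, then B, then C, then A again."""

    def __init__(self, names=("A", "B", "C")):
        self._names = cycle(names)
        self.active = next(self._names)

    def promote_next(self):
        self.active = next(self._names)
        return self.active


def run_cycle(queue, rotation, fine_tune):
    """One daily/hourly cycle: collate, fine-tune, switch the active cluster."""
    batch = queue.collate()
    if not batch:
        return rotation.active  # nothing to learn from, keep serving as-is
    fine_tune(batch)            # placeholder for the actual training job
    return rotation.promote_next()
```

In this sketch, `run_cycle` would be called on a timer; the cluster only switches when there is at least one accepted correction, so an empty cycle leaves the currently serving weights untouched.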