| HN Mirror


	by locusofself 150 days ago
	I feel like the only solution to the problem is democratized RLHF, where whenever we get a bad answer from an LLM, we can immediately tell it what was wrong and it can learn from that.

2 comments

If you're paying to use the model, that means that instead of paying content creators, you're paying the model provider and now also giving it more content for free.

Also, just as SEO is used to game search engines, "democratized RLHF" has big trust issues.

Maybe what is bad for you would be right for me.