Hacker News
by
locusofself
150 days ago
I feel like the only solution to the problem is democratized RLHF, where whenever we get a bad answer from an LLM, we can immediately tell it what was wrong and it can learn from that.
2 comments
schmichael
150 days ago
If you're paying to use the model, that means that instead of paying content creators, you're now also giving more content to the model for free.
Also, just like SEO is used to game search engines, "democratized RLHF" has big trust issues.
randomNumber7
150 days ago
Maybe what is bad for you would be right for me.