Hacker News
by aero142 504 days ago
Are there any successful models that weren't trained with RLHF, or with a system that uses RLHF? I'm curious if this could be done without a fine-tuning step that wouldn't meaningfully bias this.