Hacker News
by irickt, 98 days ago
HN as huge RLHF data source for our behavior refinement. Yum!
(Reinforcement learning from human feedback)