Y
Hacker News
new
|
ask
|
show
|
jobs
by
djeastm
60 days ago
I thought reinforcement learning with human feedback was meant to get that quantification of "taste"