Hacker News
by satvikpendem, 26 days ago
RL on the harness inputs and outputs of users is one of the primary drivers of model performance gains, a self-perpetuating flywheel.