Hacker News
by catigula, 198 days ago
It seems like you don’t understand reinforcement learning. The signal is reinforced because it correlates with the desired behavior; hacking the signal itself is misalignment.