Hacker News
by
justusm
338 days ago
Nice! Training models using reward signals for code correctness is obviously very common; I'm very curious to see how good things can get using a reward signal obtained from visual feedback.
grace77
338 days ago
As are we. Seems like the natural next step.