


	by justusm 338 days ago
	Nice! Training models with reward signals for code correctness is obviously very common. I'm very curious to see how good things can get with a reward signal obtained from visual feedback.

1 comment

As are we. It seems like the natural next step.