
	by amelius 393 days ago
	Why would you use RL if you're not going to control the environment, but just predict it?

1 comment

Because they're training a predictor, not an agent?