| HN Mirror

	by howlin 2005 days ago
	One thing to keep in mind about direct methods (learn the policy/behavior) versus indirect ones (learn a model, then simulate behaviors on the model to choose the best) is that sometimes it's much easier to find a good-enough policy than it is to learn a model accurate enough for simulation. Driving is a good example of this. Most of the time all you need to do is stay in your lane and obey the rules for intersections. Building an accurate simulation of a driving environment, on the other hand, is quite difficult.
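
	To make the contrast concrete, here is a rough toy sketch, not a real driving setup. Everything in it is an assumption made for illustration: the "car" is just a 1-D lateral offset from the lane center, steering is clipped to [-1, 1], and the "learned" model is faked as a misestimated steering gain. The names `step`, `direct_policy`, `make_planner`, and `run` are hypothetical.

```python
def step(offset, steer):
    """True toy dynamics: steering (clipped to [-1, 1]) shifts the lateral offset."""
    steer = max(-1.0, min(1.0, steer))
    return offset + steer


def direct_policy(offset):
    """Direct approach: a simple state -> action rule, no model involved."""
    return -0.5 * offset


def make_planner(model_gain, candidates=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Indirect approach: simulate each candidate action on a model, pick the best.

    model_gain is the model's belief about how strongly steering moves the car.
    The true value in step() is 1.0; anything else is an inaccurate model.
    """
    def model(offset, steer):
        return offset + model_gain * steer

    def policy(offset):
        # Choose the action whose *predicted* next offset is closest to center.
        return min(candidates, key=lambda a: abs(model(offset, a)))

    return policy


def run(policy, start=2.5, steps=30):
    """Roll the policy out on the true dynamics; return the final distance from center."""
    offset = start
    for _ in range(steps):
        offset = step(offset, policy(offset))
    return abs(offset)


if __name__ == "__main__":
    print("direct rule:            ", run(direct_policy))
    print("planner, accurate model:", run(make_planner(1.0)))
    print("planner, wrong model:   ", run(make_planner(0.4)))
```

	In this toy, the crude rule settles at the lane center, and so does the planner when its model is exact. With the gain misestimated as 0.4, the planner keeps overcorrecting and ends up bouncing between offsets of +0.5 and -0.5. It is contrived, but it shows the asymmetry: the rule only had to be roughly right, while the planner is only as good as its model.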