


	by cweill 1265 days ago
	I also have this question. Does the MDP in RL actually encode cause and effect, or does it just learn (bidirectional) correlations between states and actions? I wonder whether Pearl thinks RL replicates his do-calculus under the hood, or whether that's an innovation we're still missing.