| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by DennisP 921 days ago
	> prioritize cumulative long-term feedback over immediate feedback and can adapt to environments with limited observability, sparse feedback, and high stochasticity Sounds like something that could learn to play decent poker.