| HN Mirror



	by qwert7890 3136 days ago
	The simplest RL algorithm (Q-learning) reaches 100 m in QWOP: https://www.youtube.com/watch?v=e27TUmMkOA0 It got there by finding and exploiting a local maximum, though: the "knee scraping" technique (which humans can replicate) :)
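
For anyone who hasn't seen it, tabular Q-learning is essentially one update rule applied over and over. Below is a rough sketch on a made-up 1-D corridor. It is not QWOP and not the code from the video; the environment, the `q_learning` function name, and all hyperparameters are illustrative assumptions.

```python
import random


def q_learning(n_states=6, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy corridor (hypothetical environment).

    States are 0..n_states-1, the agent starts at 0, and the last state is
    the goal. Action 0 moves left (bumping into the wall at state 0),
    action 1 moves right. Reaching the goal pays 1, every other step pays 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action choice, breaking ties at random.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([i for i in range(2) if q[s][i] == best])

            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # The Q-learning update: move Q(s, a) toward the one-step target
            # r + gamma * max_a' Q(s', a').
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            s = s_next
    return q


q = q_learning()
# Greedy policy for every non-goal state (1 means "move right").
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(len(q) - 1)]
print(policy)
```

The agent mostly acts greedily on its current estimates, which is why, in a much larger state space with sparse reward, it can settle on the first strategy that reliably pays off, such as knee scraping, rather than a better one it never explores.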