by zhanwei 3635 days ago
UCT is a very simple idea that works surprisingly well across very diverse domains. I can't emphasize its generality enough: you can throw different problems at it and get a decent (though maybe not the best) result without using any additional domain knowledge.

It is also very easy to work with: you can easily tweak the algorithm and add heuristics for your specific domain.
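To give a feel for how little code the core idea needs, here is a minimal sketch of UCT (UCB1 selection, expansion, random rollout, backup) on a toy take-away game: players alternately remove 1 to 3 stones, and whoever takes the last stone wins. The game, the names (`Node`, `uct_search`, `rollout`, etc.) and the exploration constant `c=1.4` are all my own assumptions for illustration, not taken from any of the linked papers or implementations.

```python
import math
import random

# Toy domain (an assumption for illustration): state = (stones_left, player_to_move).
def legal_actions(state):
    return [a for a in (1, 2, 3) if a <= state[0]]

def step(state, action):
    stones, player = state
    return (stones - action, 1 - player)

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action                  # action that led to this node
        self.children = []
        self.untried = legal_actions(state)   # actions not yet expanded
        self.visits = 0
        self.wins = 0.0                       # wins for the player who moved INTO this node

def ucb1(child, parent_visits, c):
    # Average reward plus an exploration bonus that shrinks with more visits.
    return (child.wins / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))

def rollout(state, rng):
    # Play uniformly random moves to the end; return the player who took the last stone.
    while state[0] > 0:
        state = step(state, rng.choice(legal_actions(state)))
    return 1 - state[1]

def uct_search(root_state, iterations=2000, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and non-terminal.
        while not node.untried and node.children:
            parent = node
            node = max(parent.children, key=lambda ch: ucb1(ch, parent.visits, c))
        # 2. Expansion: add one untried child, if any.
        if node.untried:
            action = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(step(node.state, action), parent=node, action=action)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        winner = rollout(node.state, rng)
        # 4. Backup: credit each node from the viewpoint of the player who moved into it.
        while node is not None:
            node.visits += 1
            if winner == 1 - node.state[1]:
                node.wins += 1.0
            node = node.parent
    # Recommend the most-visited root action.
    return max(root.children, key=lambda ch: ch.visits).action

if __name__ == "__main__":
    print(uct_search((5, 0)))
```

Note that nothing in `uct_search` knows anything about the game beyond `legal_actions` and `step`, which is where the generality comes from. Domain heuristics would typically go into `rollout` (a smarter playout policy) or into the selection score.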

Also relevant:

UCT applied to a partially observable game (Poc-man): http://papers.nips.cc/paper/4031-monte-carlo-planning-in-lar...

Another approach to Monte-Carlo planning: http://papers.nips.cc/paper/5189-despot-online-pomdp-plannin...


I've coded and tested sparse lookahead in a trading algorithm before, using a comprehensive example as a guide. Do you know of any comprehensive walkthrough examples implementing a UCT scenario that I can use to implement and verify my results?
This tutorial seems quite good. It covers the basics and various useful UCT extensions: https://webdocs.cs.ualberta.ca/~mmueller/courses/2014-AAAI-g...

However, the tutorial doesn't build up to a working implementation. I think you can verify your results against benchmark problems instead. There are a number of good implementations around:

UCT implementation with many MDP benchmark problems: https://github.com/bonetblai/mdp-engine

My favorite implementation. The code is quite easy to read: http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Applications_fil...

Thank you. I like to verify as much as I like to implement :-)