Hacker News new | ask | show | jobs
by AndrewKemendo 1112 days ago
Less of a scale problem than a type problem usually in my experience.

My rule of thumb is when it’s easy to specify a reward function but infinite ways to traverse the action space - versus having a constrained state and action space (small n solution traversal pathways) and only a few possible paths to traverse.