by noambrown 2538 days ago
Honestly, probably debugging. Training this thing is very cheap, but the variance in poker is huge (even with the best variance-reduction techniques), so it takes a very long time to tell whether one version is better than another (or better than a human).
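For a rough sense of scale, here is a back-of-the-envelope sketch of how many hands it takes before a win-rate difference stands out from the noise. It uses the standard normal-approximation sample-size formula, not the evaluation method actually used for the bot. The function name `hands_needed` and the numbers in the usage lines (a standard deviation of 10 big blinds per hand, an edge of 0.05 big blinds per hand) are made-up illustrative assumptions, not measured values.

```python
import math


def hands_needed(edge_bb_per_hand, std_bb_per_hand, z=1.96):
    """Hands required so the z-sigma confidence half-width shrinks to the edge.

    Normal approximation: half-width = z * std / sqrt(n), solved for n.
    All inputs are hypothetical; plug in your own estimates.
    """
    if edge_bb_per_hand <= 0 or std_bb_per_hand <= 0:
        raise ValueError("edge and standard deviation must be positive")
    return math.ceil((z * std_bb_per_hand / edge_bb_per_hand) ** 2)


# Illustrative (assumed) numbers: std dev of 10 bb/hand, edge of 0.05 bb/hand.
baseline = hands_needed(0.05, 10.0)

# Halving the per-hand standard deviation (e.g. via variance reduction)
# cuts the required sample by roughly a factor of four.
reduced = hands_needed(0.05, 5.0)

print(baseline, reduced)
```

The required sample grows with the square of std/edge, which is why small edges between two versions take so long to confirm even after variance reduction.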