Hacker News new | ask | show | jobs
by foo12bar 27 days ago
Because the sessions would last forever. Think of a 1 or 2 length snake, figuring out that left down up right over and over again doesn't lose any points. You're now trapped in a local minimum. You need to make the AI get impatient (lose points) or it'll never learn.
1 comments

I see what you are saying but then wouldn’t it miss out on the best strategies, which do require patience and not going straight for the apple?
Maybe you could make it lose points for repeating a board state, I guess.
Ah yes that’s probably the right approach