|
|
|
|
|
by kotach
3751 days ago
|
|
LSTM would converge even faster. A K-level breadth first search mimicking the optimal policy and a simple learning to search algorithm with a cost sensitive binary linear classifier would work well too. After training it would be a constant time evaluation of what to do next. |
|