by wnoise 3044 days ago
Of course they can. DRL is a very specific set of techniques for training decision-making over multiple timesteps.
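To make "over multiple timesteps" a bit more concrete, here is a minimal sketch of the discounted return that reinforcement learning methods typically optimize. It is an illustration only, not a full DRL method: the function name `discounted_returns` and the `gamma` value are assumptions made for this example, and there is no neural network or environment here.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every timestep t.

    `rewards` is the list of per-step rewards from one episode;
    `gamma` is an illustrative discount factor (an assumed value).
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step's return includes all later rewards.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# A reward that only arrives at the last step still gives credit
# to the earlier decisions that led there.
print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```

The point of the sketch is that a decision at step 0 is scored by what happens at later steps, which is the multi-timestep structure the comment refers to.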