Hacker News
by sockmeistr 3223 days ago
It doesn't understand the value; from the article: "We also separately trained the initial creep block using traditional RL techniques".