by dontreact 3238 days ago
They can always fine-tune using RL later. Supervised training was the first step in making AlphaGo work.