Hacker News
by dnautics 2195 days ago
That's really cool, and I hadn't thought of that. I just want to clarify: does that mean you train the agent without the deterministic solution, and that your "validation/test" sets (I'm not sure what those phases are called in unsupervised learning) are also run without the deterministic solution?
1 comment

Yes, the agent is trained without access to the deterministic solution.