by popcorncolonel 3234 days ago
It would be pretty much impossible for the programmers to "build the behavior in" to the neural network, unless you mean training on supervised data or something.

It's not impossible. It's called inverse reinforcement learning: you learn a reward function (roughly, a score for how good each state is) from external demonstrations, then use that reward function to train the bot's action policy. Intuitively, the idea is to first learn what a good state and a bad state look like from the demonstrations, then use that to teach the bot how to act.

This kind of learning is similar to GANs, where the discriminator learns from real data and the generator learns from the discriminator.
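
To make the two-stage idea concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the 5-state line world, the demonstration trajectories, and the names `learn_reward`, `value_iteration` and `greedy_policy` are all made up. The reward is just the log of how often the demonstrations visit each state, which is a crude stand-in for a real inverse RL method such as maximum-entropy IRL, so read it as a sketch of the shape of the approach rather than an actual algorithm anyone ships.

```python
import math

N_STATES = 5        # hypothetical world: states 0..4 on a line
ACTIONS = (-1, 1)   # move left or move right
GAMMA = 0.9         # discount factor

def step(state, action):
    """Deterministic transition, clipped at both ends of the line."""
    return min(max(state + action, 0), N_STATES - 1)

# Made-up expert demonstrations: each is a list of visited states.
# The "expert" always walks toward state 4 and stays there.
DEMOS = [
    [0, 1, 2, 3, 4, 4],
    [2, 3, 4, 4, 4],
    [1, 2, 3, 4, 4],
]

def learn_reward(demos):
    """Stage 1: score each state from demonstrations alone.

    Uses smoothed log visitation frequency, a crude proxy for the
    reward a proper IRL method would recover.
    """
    counts = [1.0] * N_STATES  # add-one smoothing avoids log(0)
    for trajectory in demos:
        for state in trajectory:
            counts[state] += 1.0
    total = sum(counts)
    return [math.log(c / total) for c in counts]

def value_iteration(reward, n_iters=100):
    """Stage 2a: turn the learned reward into state values."""
    values = [0.0] * N_STATES
    for _ in range(n_iters):
        values = [
            reward[s] + GAMMA * max(values[step(s, a)] for a in ACTIONS)
            for s in range(N_STATES)
        ]
    return values

def greedy_policy(values):
    """Stage 2b: in each state, pick the action leading to the best value."""
    return [
        max(ACTIONS, key=lambda a: values[step(s, a)])
        for s in range(N_STATES)
    ]

reward = learn_reward(DEMOS)
policy = greedy_policy(value_iteration(reward))
print("learned reward:", [round(r, 2) for r in reward])
print("policy (action per state):", policy)
```

The programmers never tell the bot "go right" here. They only supply demonstrations, the reward is inferred from those, and the policy falls out of the inferred reward.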

Very interesting! Thanks for sharing -- I'll look more into this.