Hacker News
by pkdpic 1157 days ago
I love this, but I'm always confused in these kinds of analogies about what the reward / punishment system really equates to...

Also reminds me of Ted Chiang warning us that we will torture innumerable AI entities long before we start having real conversations about treating them with compassion.

2 comments

Don't love it, it's not correct.

> what the reward / punishment system really equates to

Nothing, at least as far as neural network training goes. This is an extremely poor analogy for how neural networks learn.

If you've ever done any kind of physical training and have had a trainer slightly adjust the position of your limbs until whatever activity you're doing feels better, that's a much closer analogy. You're gently searching the space of possible correct positions, guided by an algorithm (your trainer) that knows how to move you towards a more correct solution.

There's nothing analogous to a "reward" or "punishment" when neural networks are learning.
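
To make the trainer analogy concrete, here's a rough sketch of plain gradient descent on a toy problem. Everything in it (the names `loss`, `grad`, `train`, the data, the learning rate) is made up for illustration and isn't from the article. The point is just that the parameter gets nudged a little along the gradient each step, and no reward or punishment appears anywhere:

```python
# Toy illustration with hypothetical names: fit y = w * x by gradient descent.

def loss(w, xs, ys):
    # Mean squared error between the prediction w * x and the target y.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w, xs, ys):
    # Derivative of the mean squared error with respect to w.
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, w=0.0, lr=0.05, steps=200):
    # The "trainer": each step moves w slightly in the direction
    # that lowers the loss.
    for _ in range(steps):
        w -= lr * grad(w, xs, ys)
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated by y = 2 * x
w = train(xs, ys)
```

After training, `w` ends up very close to 2. The only signal the parameter ever sees is the slope of the loss.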

>There's nothing analogous to a "reward" or "punishment" when neural networks are learning.

Well, deep reinforcement learning.

Yeah, but even in that case, "reward" is just the thing a NN is trying to predict. The NN itself is not receiving the reward (or any punishment). Instead, it's following gradient signals to improve that estimate of reward, which is then used as a proxy for an optimal policy decision.
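
A minimal sketch of what I mean, assuming a made-up two-action environment with fixed rewards (`REWARDS`, `update`, and `train` are hypothetical names, and a two-entry table stands in for the network). The reward shows up only as a regression target for the estimate, and the policy just reads off the estimates:

```python
# Hypothetical environment: each action always pays a fixed reward.
REWARDS = {0: 0.2, 1: 1.0}

def update(q, action, reward, lr=0.1):
    # Gradient step on the squared error (q[action] - reward) ** 2.
    # The reward is a target for the estimate, nothing more.
    q[action] -= lr * 2 * (q[action] - reward)

def train(steps=200):
    q = [0.0, 0.0]  # estimated reward per action
    for t in range(steps):
        action = t % 2  # try both actions in turn (no exploration policy)
        update(q, action, REWARDS[action])
    return q

q = train()
# The "policy" is just a lookup: pick the action with the highest estimate.
best = max(range(len(q)), key=lambda a: q[a])
```

Real deep RL swaps the table for a network and adds exploration and bootstrapping, but the structure is the same: gradients improve an estimate, and the estimate drives the action choice.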

> what the reward / punishment system really equates to

Well, in the article, it says the punishment was a slap. On the other hand, he just says "she gives you a wonderful reward"... so you're left to use your imagination there.