by katzenversteher 468 days ago
I can only help with RL, which is probably reinforcement learning. As far as I remember, that means you let the model perform a task that can be "graded", and depending on how well it did, it gets a reward that it tries to maximize. I believe (this is where I'm very unsure, I could be wrong) the weights and biases of the neurons that were involved in reaching the highest reward get adjusted to have a bigger influence.
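To make that a bit more concrete, here is a toy sketch of the idea, not how any real LLM is trained. It assumes a made-up "task" with three possible actions and fixed grades (the `rewards` list), and uses a simple policy-gradient update (REINFORCE with a running baseline). The names `train_bandit`, `logits`, and `baseline` are just my own illustration. Actions that score better than average get their preference nudged up, the others get nudged down.

```python
import math
import random


def softmax(logits):
    """Turn raw preferences into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(rewards, steps=2000, lr=0.1, seed=0):
    """Toy REINFORCE loop: one preference value ("weight") per action."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)  # start with no preference
    baseline = 0.0                 # running estimate of the average reward

    for _ in range(steps):
        probs = softmax(logits)
        # The "model" performs the task by sampling an action.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        # The task is graded (hypothetical fixed grades here).
        reward = rewards[action]
        # How much better or worse than usual was this attempt?
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Policy-gradient step: raise the chosen action's preference
        # if the advantage is positive, lower it if negative.
        for a in range(len(logits)):
            grad = (1.0 if a == action else 0.0) - probs[a]
            logits[a] += lr * advantage * grad

    return softmax(logits)


probs = train_bandit([0.1, 0.5, 1.0])
print(probs)
```

After training, most of the probability should sit on the last action, since it has the highest grade. In a real neural network the "preferences" are the weights and biases, and the adjustment is done through backpropagation rather than a hand-written loop, but the principle of "reinforce what got a good reward" is the same as far as I understand it.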