by charcircuit
1036 days ago
>Our ability to divine the purpose of activations of anything but the extremely small scale is atrocious.
The value of each parameter is chosen to minimize the loss. This applies to every single weight of the model. Not all weights affect loss the same amount which is why concepts like pruning exist.

Vague and fairly useless. What is it doing to minimize loss?
>Not all weights affect loss the same amount which is why concepts like pruning exist.
Only weights with values at or close to zero get pruned. That's not because we know what each weight does or can tell which ones the model could do without.
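To make that concrete, here is a rough sketch of the simplest version, magnitude pruning. It's a toy in plain Python, not any library's actual implementation; `magnitude_prune` and `sparsity` are names I made up for illustration. The selection rule only ever looks at |w|, never at what a weight is for.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest |w|.

    Hypothetical helper for illustration; assumes a flat list of floats.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)  # how many weights to drop
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    # Dropped positions become 0.0; everything else is kept unchanged.
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


weights = [0.8, -0.01, 0.003, -1.2, 0.05, 0.0]
pruned = magnitude_prune(weights, 0.5)
print(pruned)
```

Nothing in there needs to know what role any weight plays in the network; it just ranks by size and cuts.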