by dahart
3191 days ago
It's super interesting to think that any non-linearity at all can make it work. This particular non-linearity is surprising since it clamps to zero at the center of the response curve. I'd have thought that's right where you want the linear response, and that clamping in the middle would cause bad things to happen. Sigmoid and ReLU (and others) clamp at the foot/shoulder. Perhaps this network just learns negative weights, in contrast to what it learns with the traditional activation functions?
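
To make the contrast concrete, here's a rough sketch of where each function goes flat. Note that `dead_zone` and its `eps` threshold are a hypothetical stand-in I made up for "clamps to zero at the center"; it's not the actual non-linearity under discussion, just the simplest function with that shape.

```python
import math

def dead_zone(x, eps=0.5):
    # Hypothetical stand-in (assumption): zero for |x| < eps, identity elsewhere.
    return 0.0 if abs(x) < eps else x

def relu(x):
    # Flat for negative inputs (the "foot"), linear for positive inputs.
    return max(0.0, x)

def sigmoid(x):
    # Flattens out at both ends (foot and shoulder), steepest at the center.
    return 1.0 / (1.0 + math.exp(-x))

def slope(f, x, h=1e-4):
    # Central-difference estimate of the local slope of f at x.
    return (f(x + h) - f(x - h)) / (2 * h)

# Compare local slopes at the far left, the center, and the far right.
for name, f in (("dead_zone", dead_zone), ("relu", relu), ("sigmoid", sigmoid)):
    slopes = [round(slope(f, x), 3) for x in (-4.0, 0.0, 4.0)]
    print(name, slopes)
```

The dead-zone function has zero slope exactly where sigmoid is steepest, and unit slope out where sigmoid saturates, which is the inversion that looks odd at first glance.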