by shawntan
766 days ago
There's a common trap that people working on deep learning tend to fall into: asking "Why don't we learn the activation function as well?" The answer really should be that a combination of linear layers and fixed non-linear activations can already learn the non-linearities you need.
https://twitter.com/bozavlado/status/1787376558484709691
That said, there are some types of functions (think: sin, cos) that these "universal approximator" formulations don't extrapolate well to, and solutions to that might actually be an improvement.
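To make the extrapolation point concrete, here is a rough sketch in plain Python. It is an illustration under assumptions, not anything from the linked thread: the weights of a one-hidden-layer ReLU network are set by hand (not trained) so that it linearly interpolates sin on [-pi, pi], and `build_relu_net` and `max_error` are hypothetical helper names. A trained network would differ in detail, but it would still be piecewise linear and so still become a straight line once you move past its last kink.

```python
import math


def relu(z):
    return max(0.0, z)


def build_relu_net(target, lo, hi, n_units):
    """Hand-set (not trained) one-hidden-layer ReLU net that linearly
    interpolates `target` at evenly spaced knots in [lo, hi]."""
    knots = [lo + (hi - lo) * i / n_units for i in range(n_units + 1)]
    values = [target(k) for k in knots]
    slopes = [
        (values[i + 1] - values[i]) / (knots[i + 1] - knots[i])
        for i in range(n_units)
    ]
    # Output weight of hidden unit i is the change in slope at knot i.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n_units)]
    bias = values[0]

    def net(x):
        # Hidden unit i computes relu(x - knots[i]); the output is a weighted sum.
        return bias + sum(c * relu(x - k) for c, k in zip(coeffs, knots[:-1]))

    return net


net = build_relu_net(math.sin, -math.pi, math.pi, 64)


def max_error(lo, hi, samples=1000):
    """Largest absolute gap between the net and sin on a grid over [lo, hi]."""
    xs = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    return max(abs(net(x) - math.sin(x)) for x in xs)


inside = max_error(-math.pi, math.pi)       # range covered by the knots
outside = max_error(math.pi, 3 * math.pi)   # one full period further out

print(f"max error inside the fitted range:  {inside:.4f}")
print(f"max error outside the fitted range: {outside:.4f}")
```

Inside the fitted range the error is tiny; outside it the network just keeps following its last slope while sin keeps oscillating, so the error grows with distance. That gap is the kind of thing a periodic or learned activation might plausibly help with.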