by shawntan 813 days ago

There's a general trap people working on deep learning tend to fall into, thinking "Why don't we learn the activation function as well?"

The answer really should be that a combination of linear layers and fixed non-linear activations can already learn the non-linearities you need. https://twitter.com/bozavlado/status/1787376558484709691
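To make that concrete, here is a rough sketch (not taken from the linked tweet) of how a linear combination of plain ReLUs can reproduce another activation shape, tanh in this case. The coefficients are set by hand as slope changes between knots instead of being trained, and the names `fit_relu_sum` and `relu_sum` are made up for illustration.

```python
import math

def relu(x):
    return max(0.0, x)

def fit_relu_sum(target, knots):
    """Pick a bias and coefficients (hypothetical helper) so that
    bias + sum(c_i * relu(x - k_i)) interpolates `target` at the knots."""
    ys = [target(k) for k in knots]
    # Slope of each linear segment between neighbouring knots.
    slopes = [(ys[i + 1] - ys[i]) / (knots[i + 1] - knots[i])
              for i in range(len(knots) - 1)]
    # Each ReLU contributes the change in slope at its knot.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1]
                            for i in range(1, len(slopes))]
    return ys[0], coeffs

def relu_sum(x, bias, coeffs, knots):
    # A linear layer on top of fixed ReLU units, written out by hand.
    return bias + sum(c * relu(x - k) for c, k in zip(coeffs, knots))

# 25 knots spaced 0.25 apart on [-3, 3]; the range is an arbitrary choice.
knots = [-3.0 + 0.25 * i for i in range(25)]
bias, coeffs = fit_relu_sum(math.tanh, knots)

xs = [-3.0 + 0.01 * i for i in range(601)]
max_err = max(abs(relu_sum(x, bias, coeffs, knots) - math.tanh(x)) for x in xs)
print("max |relu_sum - tanh| on [-3, 3]:", max_err)
```

A trained network would find its own knots and weights by gradient descent, but the point stands: the fixed activation plus the linear layers around it already give you the flexibility a "learned activation" promises.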

That said, there are types of functions, periodic ones like sin and cos for example, that these "universal approximator" formulations don't extrapolate well to, and solutions to that might actually be an improvement.
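A quick way to see the extrapolation problem, as a sketch under the same hand-built assumptions (no training, piecewise-linear ReLU sum, illustrative names): fit sin on [-pi, pi], then evaluate well outside that range. Past the last knot the ReLU sum is just a straight line, so it cannot keep oscillating.

```python
import math

def relu(x):
    return max(0.0, x)

def relu_interpolant(target, knots):
    """Return a function bias + sum(c_i * relu(x - k_i)) that matches
    `target` at the knots (hypothetical helper, coefficients set by hand)."""
    ys = [target(k) for k in knots]
    slopes = [(ys[i + 1] - ys[i]) / (knots[i + 1] - knots[i])
              for i in range(len(knots) - 1)]
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1]
                            for i in range(1, len(slopes))]
    return lambda x: ys[0] + sum(c * relu(x - k) for c, k in zip(coeffs, knots))

# 41 knots covering one period of sin; the knot count is an arbitrary choice.
step = 2 * math.pi / 40
knots = [-math.pi + i * step for i in range(41)]
approx = relu_interpolant(math.sin, knots)

# Inside the fitted range the approximation is tight.
xs_in = [-math.pi + i * (2 * math.pi / 400) for i in range(401)]
in_range_err = max(abs(approx(x) - math.sin(x)) for x in xs_in)

# Outside it, the last linear segment just keeps going.
x_out = 2.5 * math.pi
out_of_range_err = abs(approx(x_out) - math.sin(x_out))

print("max error inside [-pi, pi]:", in_range_err)
print("error at 2.5*pi:", out_of_range_err)
```

The in-range error shrinks as you add knots, while the out-of-range error does not, which is the gap the sin/cos remark is pointing at.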