by uuwp 2620 days ago
Of course they can, because an NN can be any function. If we can pick tanh as the "activation", then we can just as easily pick arctan as the activation and say our NN computes arctan. What an achievement! A better question is whether conv+relu based NNs can approximate any function. But that's most likely false, because there are many weird functions that are impossible to compute or even approximate (I'm talking about those curious counter-examples in math).

This is a non sequitur in this context. The universality described here depends only on changing connection weights, not the neuronal activation functions. An important caveat is the approximated function must be continuous, but that covers a very large family.
I don't think every continuous function can be approximated this way, because we can construct an infinitely complex but continuous function whose every n-th derivative is also continuous. I'm thinking about those weird Riemann-zeta-style functions. In order to approximate such a function we'd need a huge model that couldn't be computed or stored even by a universe-size perfect computer.
It’s a theorem, so it’s been proven: https://en.wikipedia.org/wiki/Universal_approximation_theore...

Another caveat that I forgot in my previous comment is the domain has to be compact (closed and bounded). But if so, then it doesn’t really matter how weird your continuous function is, because compactness of the domain guarantees uniform continuity, i.e. your delta only depends on epsilon and not x in the epsilon-delta criterion of continuity. That allows you to partition the domain into patches of diameter delta, in which very simple functions are sufficient to approximate within epsilon.
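
To make the patch argument concrete, here is a rough one-dimensional sketch in plain Python. It is not the proof of the theorem, and it swaps in linear pieces where the argument above only needs "very simple functions": a single hidden layer of ReLU units is built by hand so that it interpolates a continuous function at equally spaced knots on [0, 1]. The names build_relu_net, evaluate, max_error and target are all made up for illustration, and the target function is an arbitrary choice.

```python
import math

def relu(z):
    return z if z > 0.0 else 0.0

def build_relu_net(f, a, b, n):
    """Hand-built one-hidden-layer ReLU net (hypothetical helper) that
    linearly interpolates f at n+1 equally spaced knots on [a, b].
    Returns (output bias, hidden-unit offsets, output weights)."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    ys = [f(x) for x in xs]
    slopes = [(ys[i + 1] - ys[i]) / h for i in range(n)]
    # The output weight of unit i is the change in slope at knot i,
    # so the running sum of weights reproduces the slope on each patch.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n)]
    return ys[0], xs[:n], coeffs

def evaluate(net, x):
    bias, knots, coeffs = net
    return bias + sum(c * relu(x - k) for c, k in zip(coeffs, knots))

def max_error(f, net, a, b, samples=2001):
    """Largest absolute error on an evenly spaced sample grid over [a, b]."""
    step = (b - a) / (samples - 1)
    return max(
        abs(f(a + i * step) - evaluate(net, a + i * step))
        for i in range(samples)
    )

def target(x):
    # Arbitrary wiggly but continuous function on the compact interval [0, 1].
    return math.sin(20.0 * x) * math.exp(-x) + abs(x - 0.5)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        net = build_relu_net(target, 0.0, 1.0, n)
        print(n, max_error(target, net, 0.0, 1.0))
```

Only the weights and the number of hidden units change between runs; the activation stays ReLU throughout. Shrinking the patch width (larger n) drives the sampled error down, which is the role delta plays in the argument above.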