
Yes. I haven't touched this since 1995, so I had to refresh my memory. I was indeed talking about Sprecher's modification. Back when I studied this, the proofs I found were not constructive.

I wasn't aware of it, but apparently Braun and Griebel gave a constructive proof in 2009 (linked from the Wikipedia article on the Kolmogorov-Arnold representation theorem). I would have to read it, and hope I am not too rusty to understand it, before I could really ponder your question...
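For anyone following along, this is the original Kolmogorov form of the representation as I remember it, for a continuous f on the unit cube. The symbol names are my own choice. Sprecher's modification, as I understand it, replaces the family of inner functions with scaled and shifted copies of a single function; I have not written that variant out here.

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
\qquad f \in C([0,1]^n),
```

Here the outer functions \Phi_q and the inner functions \phi_{q,p} are continuous functions of one variable, and the inner functions do not depend on f.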

But I can offer two places I would look:

1. The approximation is of a continuous function, and such approximations (e.g. Chebyshev, Bernstein) usually require that you be able to sample the function at specific points, but learning usually gives you training data that does not correspond to those specific points. It's possible that the construction fails here somehow.

2. The approximation is too hard in practice. This is too often the case for Breiman and Friedman's beautiful ACE (Alternating Conditional Expectations) which, if you squint hard enough, looks like a two-layer network where each neuron has its own transfer function. The algorithm is incredibly simple in theory, but very hard to use in practice.
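To make point 1 concrete, here is a minimal sketch using plain Chebyshev polynomial approximation in one dimension. It is not the Kolmogorov-Arnold construction. The target function, the degree and the random seed are arbitrary choices of mine. It fits the same degree-12 polynomial twice: once through samples taken at the Chebyshev nodes, and once through the same number of samples at points we did not get to choose.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def target(x):
    # Hypothetical smooth target on [-1, 1]; any continuous function would do.
    return np.exp(x) * np.sin(3 * x)


def max_error(sample_points, degree, grid):
    # Fit a Chebyshev series through the samples, then measure the
    # worst-case error on a dense evaluation grid.
    coef = C.chebfit(sample_points, target(sample_points), degree)
    return float(np.max(np.abs(C.chebval(grid, coef) - target(grid))))


DEGREE = 12
grid = np.linspace(-1.0, 1.0, 2001)

# Case A: we can query the function anywhere, so we use the Chebyshev nodes.
nodes = C.chebpts1(DEGREE + 1)

# Case B: "training data" lands wherever it lands.
rng = np.random.default_rng(0)
scattered = rng.uniform(-1.0, 1.0, size=DEGREE + 1)

err_nodes = max_error(nodes, DEGREE, grid)
err_scattered = max_error(scattered, DEGREE, grid)

print(f"max error, Chebyshev nodes:  {err_nodes:.3e}")
print(f"max error, scattered points: {err_scattered:.3e}")
```

With arbitrary sample points the interpolation problem is typically much worse conditioned, and more so as the degree grows. With plenty of scattered data the usual workaround is a least-squares fit (which `chebfit` does when given more points than coefficients), but whether something similar rescues the constructive proof is exactly what I don't know.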