> As soon as you add activations and layers, you're improving on SVD/PCA

You're expanding the space of realizable functions, which is an improvement in a specific sense, but not in all senses! The SVD, since it is better understood theoretically, is a more straightforward problem to solve robustly. There are fewer hyperparameters (like learning rate) to choose, and you aren't left wondering whether your solution is stuck at a bad local minimum.
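
To make that concrete, here's a rough sketch (mine, not from any particular library beyond NumPy) comparing a truncated SVD against a linear autoencoder trained by plain gradient descent. The function names, the learning rate, the step count, and the random data are all assumptions picked for illustration. The linear autoencoder is only the simplest stand-in for "layers"; it has no activations, so it can at best match the SVD's rank-k reconstruction error (Eckart-Young), and how close it gets depends on the knobs you pick.

```python
import numpy as np

def pca_svd(X, k):
    """Rank-k reconstruction of the centered data via truncated SVD.
    No learning rate, no initialization, no iteration count to choose."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def linear_autoencoder(X, k, lr, steps=2000, seed=0):
    """Rank-k reconstruction from a linear autoencoder fit by gradient descent.
    lr, steps, and seed are hyperparameters the SVD route never asks for."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    n, d = Xc.shape
    W_enc = rng.normal(scale=0.1, size=(d, k))
    W_dec = rng.normal(scale=0.1, size=(k, d))
    for _ in range(steps):
        Z = Xc @ W_enc                 # encode
        R = Z @ W_dec - Xc             # reconstruction residual
        # Gradients of (1 / 2n) * ||R||_F^2
        grad_dec = Z.T @ R / n
        grad_enc = Xc.T @ (R @ W_dec.T) / n
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return (Xc @ W_enc) @ W_dec

def mse(X, X_hat):
    """Mean squared error against the centered data."""
    Xc = X - X.mean(axis=0)
    return float(np.mean((Xc - X_hat) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 10))     # made-up data
    print("SVD error:", mse(X, pca_svd(X, 3)))
    for lr in (0.001, 0.01, 0.1):
        err = mse(X, linear_autoencoder(X, 3, lr=lr))
        print(f"autoencoder error (lr={lr}):", err)
```

The SVD number is the floor for any rank-3 linear reconstruction; the autoencoder numbers move around with `lr`, `steps`, and `seed`, which is the tuning burden I mean.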

I think it's a mistake to treat it as an obvious improvement.