Hacker News
by raverbashing 3979 days ago
> What makes one kind of neural net 'deep' and are all the other ones suddenly 'shallow'

Number of layers

It's that simple

To expand on this some more: for a long time, thanks to Cybenko's theorem[1] (a single hidden layer with enough units can approximate any continuous function on a compact domain), people just used 1 hidden layer in their neural networks (also because computing was sloowww..). So, your typical NN architecture was input_layer --> hidden_layer --> output_layer.

Eventually, people realized that you could improve performance by adding more hidden layers. So while theoretically Cybenko was correct, practically stacking a bunch of hidden layers made more sense. These network architectures with stacks of hidden layers were then labelled as "deep" neural networks.
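To make the "number of layers" point concrete, here is a minimal sketch in plain Python. The layer sizes, the tanh activation, and the helper names (make_layer, build_mlp, forward, num_hidden_layers) are all my own assumptions for illustration, not anything from a specific library. The only difference between the "shallow" and the "deep" net is how many hidden layers are stacked between input and output.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def build_mlp(sizes, seed=0):
    """sizes = [input, hidden..., output]; one layer per consecutive pair."""
    rng = random.Random(seed)
    return [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def forward(layers, x):
    """Apply every layer in order; tanh on hidden layers, linear output."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:  # no activation on the output layer
            x = [math.tanh(v) for v in x]
    return x

def num_hidden_layers(sizes):
    """Everything between the input and output layer counts as hidden."""
    return len(sizes) - 2

# input_layer --> hidden_layer --> output_layer (the classic setup)
shallow_sizes = [4, 16, 1]
# same input/output, but a stack of hidden layers ("deep")
deep_sizes = [4, 8, 8, 8, 1]

shallow = build_mlp(shallow_sizes)
deep = build_mlp(deep_sizes)

x = [0.5, -0.2, 0.1, 0.9]
print("shallow hidden layers:", num_hidden_layers(shallow_sizes), "output:", forward(shallow, x))
print("deep hidden layers:", num_hidden_layers(deep_sizes), "output:", forward(deep, x))
```

Nothing here is trained; it only shows that "deep" is a property of the architecture (the length of the stack), not a different kind of neuron.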

[1] https://en.wikipedia.org/wiki/Universal_approximation_theore...

It is that simple, but the more complex story is that once the number of hidden layers exceeds 2, training becomes difficult. Also, convnets for example "cheat" by making the connections between layers incomplete bipartite graphs (not every node in one layer is connected to every node in the next), usually chosen because of some physical property, e.g. nearest neighbors for computer vision.
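A rough sketch of the "incomplete bipartite graph" idea, again in plain Python. The function names and the 1D window layout are assumptions I picked for illustration; a real convnet works on 2D images and also shares weights across positions, which this does not show. Each row of a mask is one output node, each column one input node, and a 1 means "connected".

```python
def dense_mask(n_in, n_out):
    """Fully connected: every output node sees every input node."""
    return [[1] * n_in for _ in range(n_out)]

def local_mask(n_in, kernel_size):
    """Locally connected: output node i only sees inputs i .. i+kernel_size-1."""
    n_out = n_in - kernel_size + 1
    return [[1 if i <= j < i + kernel_size else 0 for j in range(n_in)]
            for i in range(n_out)]

def count_edges(mask):
    """Number of connections (edges in the bipartite graph)."""
    return sum(sum(row) for row in mask)

n_in = 8
dense = dense_mask(n_in, 6)
local = local_mask(n_in, 3)  # each node looks at 3 neighboring inputs

print("dense edges:", count_edges(dense))  # 6 * 8 = 48
print("local edges:", count_edges(local))  # 6 * 3 = 18
for row in local:
    print(row)
```

With the same number of input and output nodes, the local version has far fewer connections, which is the sense in which the graph between layers is "incomplete".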
Use another deep learning network to supervise the training of your DLN. You can also use it to supervise itself. It is a simple idea invented about a decade ago (at least I heard about it about a decade ago here, in Ukraine).