by jszymborski 1711 days ago
I'm very curious which part of that process is not explained by the principles by which we understand neural networks to work.

I allow for the possibility that I've gone this long misunderstanding the definition of "principled" in this context.

1 comment

To me, taking a "principled approach" means that you understand and can justify the eventual outcome of the approach, or can at least guarantee that the outcome satisfies some constraints. How would you justify the number of channels in each layer of a convolutional network? The number of self-attention heads in a transformer? The depth? Can you certify its prediction performance?

Yes, the "just add more layers" approach typically works (in a very narrow sense of the word "works"), but we don't really understand why. We likewise don't understand the failure modes of the system, and cannot engineer around them. Thus it's not really principled in my view.