by metakermit
3047 days ago
I like the parallel between optical lens systems and deep learning. I'm also kind of disappointed by the "arcane lore" status hyper-parameters have in different ML domains. I think it would be healthier for the community to make it a habit to explicitly document why a certain topology and set of layer sizes were selected. It's like providing documentation with your open source project – yes, knowledgeable people could use it without the docs, but it would be much more difficult and beginner-unfriendly.
|
Often, people either reuse other people's architectures or simply try two or three and stick with the best one, only changing the learning rate and such.
I also wonder whether it's a compute issue (training takes a long time, so we can only try so many things) or whether we really are working in the wrong hyperparameter space. Maybe there is another space we could be working in, one where "things make more sense", and the HPs we currently use (learning rate, L2 regularization, number of layers, etc.) are just a projection from it.