by ganfortran
3330 days ago
I think the black-box argument about deep learning is really about its parameters, not its architecture. Sure, the operations in a deep network are defined ahead of time, but we still don't understand why, taken together, they work in some scenarios and fail in others. Take the recent discussion on reddit/ml for example: people are still debating whether the block should be conv-bn-relu or conv-relu-bn. This is a pretty widely used building block, if not the most widely used one, yet people still don't understand why the latter can work, or even outperform the former, in a lot of applications, given that the ReLU filters out all negative values and so destroys/skews the underlying distribution that BN sees. And BN alone raises a lot of questions: the running statistics feel like a hack, yet they work very well in practice. So I take no issue with calling deep learning a black box today. We are far, very far, from understanding why this monster does so well at solving so many problems. That is why it is interesting. Some researchers' attitude confuses me, because there is apparently a big juicy problem out there waiting to be cracked, yet they are distancing themselves from it. I cannot help thinking it comes from contrarianism, from the fear that what they have worked on for so long may not be useful after all. But true researchers should be excited by the opportunity to participate and contribute while the theory is still young.
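To make the conv-bn-relu vs conv-relu-bn point concrete, here is a rough sketch in plain Python. It is only an illustration under assumptions of mine: the "conv output" is faked with standard-normal samples, BN is reduced to plain normalization without the learned scale/shift, and the momentum of 0.1 for the running statistics is just a common default, not anything from the reddit thread.

```python
import random
import statistics


def relu(xs):
    """Zero out every negative value."""
    return [max(0.0, x) for x in xs]


def batch_norm(xs, eps=1e-5):
    """Normalize a batch to zero mean / unit variance (no learned scale or shift)."""
    mean = statistics.fmean(xs)
    var = statistics.pvariance(xs, mu=mean)
    normed = [(x - mean) / (var + eps) ** 0.5 for x in xs]
    return normed, mean, var


def update_running(running, batch_value, momentum=0.1):
    """Exponential moving average, the usual form of BN running statistics."""
    return (1.0 - momentum) * running + momentum * batch_value


rng = random.Random(0)
# Assumption: stand-in for the output of a conv layer.
conv_out = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# conv-bn-relu: BN sees the roughly symmetric conv output.
_, mean_a, var_a = batch_norm(conv_out)

# conv-relu-bn: BN sees a rectified, non-negative, skewed input.
rectified = relu(conv_out)
normed_b, mean_b, var_b = batch_norm(rectified)

print(f"BN input stats, conv-bn-relu: mean={mean_a:.3f} var={var_a:.3f}")
print(f"BN input stats, conv-relu-bn: mean={mean_b:.3f} var={var_b:.3f}")

# Running mean tracked over mini-batches of the rectified activations.
running_mean = 0.0
for start in range(0, len(rectified), 1_000):
    batch = rectified[start:start + 1_000]
    running_mean = update_running(running_mean, statistics.fmean(batch))
print(f"running mean after 10 batches: {running_mean:.3f}")
```

In the second ordering BN still outputs something with zero mean and unit variance, but its input is a half-rectified distribution with a large spike at zero. The running mean also starts from 0 and only slowly approaches the batch mean. None of this explains why either ordering works, which is the point.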
Of course, more research on tools for model interpretation would be awesome; my own lab has done a lot toward it, and it remains an important topic. More is desired, but what we have right now is pretty good too, and is not at all inferior to old-school methods, especially considering the performance.