| HN Mirror


by ganfortran 3330 days ago

I think the black-box argument about deep learning is about its parameters, not its architecture. Sure, the operations of deep learning, what they do, are defined ahead of time, but taken together we still can't explain why it works in some scenarios and fails in others.

Take the recent discussion on reddit/ml, for example: people are still debating whether it should be conv-bn-relu or conv-relu-bn. This is a pretty widely used building block, if not the most widely used one, yet people still don't understand why the latter can work, or even outperform the former, in a lot of applications, since the ReLU filters out all negative values and thus destroys/skews the underlying distribution that BN sees. And for BN alone there are a lot of questions to ask: the running statistics feel like a hack, yet they work very well in practice.
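
To make the ordering question concrete, here is a toy sketch, not taken from the reddit thread. It assumes the conv outputs can be stood in for by standard Gaussian samples, and it uses a simplified `batch_norm` with plain batch statistics and no learned scale/shift or running averages; all names are illustrative.

```python
import random
import statistics

def relu(xs):
    return [max(0.0, x) for x in xs]

def batch_norm(xs, eps=1e-5):
    # Simplified BN: normalize with batch mean/variance only
    # (no learned scale/shift, no running statistics).
    mean = statistics.fmean(xs)
    var = statistics.pvariance(xs, mu=mean)
    return [(x - mean) / (var + eps) ** 0.5 for x in xs]

random.seed(0)
# Assumption: Gaussian samples as a stand-in for conv outputs.
conv_out = [random.gauss(0.0, 1.0) for _ in range(10_000)]

bn_relu = relu(batch_norm(conv_out))   # conv-bn-relu ordering
relu_bn = batch_norm(relu(conv_out))   # conv-relu-bn ordering

# conv-bn-relu: BN sees the symmetric input, then ReLU zeroes roughly half of it.
frac_zero_bn_relu = sum(x == 0.0 for x in bn_relu) / len(bn_relu)

# conv-relu-bn: BN sees a one-sided, skewed input. The output is still
# zero-mean, but every clipped value lands on the same negative floor.
mean_relu_bn = statistics.fmean(relu_bn)
min_relu_bn = min(relu_bn)
frac_at_floor = sum(x == min_relu_bn for x in relu_bn) / len(relu_bn)

print("bn-relu: fraction of zeros:", frac_zero_bn_relu)
print("relu-bn: mean:", mean_relu_bn, "floor:", min_relu_bn,
      "fraction at floor:", frac_at_floor)
```

In this toy setting both orderings produce well-defined outputs; what differs is the shape of the distribution BN normalizes. Nothing here says which ordering trains better, which is exactly the open question.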

So I take no issue with calling deep learning a black box nowadays. We are far, very far, from understanding why this monster does so well at solving so many problems. That is why it is interesting. Some researchers' attitude confuses me: there is apparently a big juicy problem out there, waiting to be cracked, yet they are distancing themselves from it. I cannot help thinking it comes from contrarianism, the fear that what they have worked on for so long may not be useful after all. But true researchers should feel excited about the opportunity to participate and contribute while the theory is still young.

1 comments

backpropaganda 3330 days ago

Your example of the debate over BN usage demonstrates that it is possible to look inside a deep network and debate what is going on. That we don't know the answers doesn't mean the answers don't exist or are impossible to find, which is what the term "black-box" suggests.

Of course, more research on tools for model interpretation would be awesome; my own lab has done a lot towards it, and it remains an important topic. More is desired, but what we have right now is pretty good too, and is not at all inferior to old-school methods, especially considering the performance.


lightcatcher 3330 days ago

I'd argue that a neural net is "black-box" in the sense that nobody can really give a coherent answer to "what happens if I perturb/double/negate this parameter?" when the parameter is deep in some weight matrix. Maybe this isn't a useful question because of the distributed representations within neural nets, but it is at least an answerable question for other models.
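
A toy sketch of that contrast, with made-up weights and inputs (all values and function names here are hypothetical): in a linear model, doubling `w[0]` shifts the output by exactly `w[0] * x[0]` regardless of anything else, while in even a tiny one-hidden-layer net the effect of doubling one first-layer weight depends on the rest of the input and the other weights.

```python
import math

def linear(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def mlp(W1, w2, x):
    # One hidden layer of tanh units, scalar output.
    hidden = [math.tanh(sum(wij * xj for wij, xj in zip(row, x))) for row in W1]
    return sum(vi * hi for vi, hi in zip(w2, hidden))

# Two inputs that share x[0] but differ in x[1].
x_a = [1.0, 2.0]
x_b = [1.0, -2.0]

# Linear model: doubling w[0] adds exactly w[0] * x[0] to the output.
w, b = [0.5, -1.0], 0.1
w_doubled = [2 * w[0], w[1]]
lin_delta_a = linear(w_doubled, b, x_a) - linear(w, b, x_a)
lin_delta_b = linear(w_doubled, b, x_b) - linear(w, b, x_b)

# Tiny net: doubling W1[0][0] passes through tanh, so the change in
# output depends on where the hidden unit already sits on the curve.
W1 = [[0.5, -1.0], [1.5, 0.3]]
w2 = [1.0, -2.0]
W1_doubled = [[2 * W1[0][0], W1[0][1]], W1[1]]
mlp_delta_a = mlp(W1_doubled, w2, x_a) - mlp(W1, w2, x_a)
mlp_delta_b = mlp(W1_doubled, w2, x_b) - mlp(W1, w2, x_b)

print("linear deltas:", lin_delta_a, lin_delta_b)
print("mlp deltas:   ", mlp_delta_a, mlp_delta_b)
```

With one hidden layer the dependence can still be worked out by hand; the point of the comment is that this stops being practical once the parameter sits several layers deep.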

Do you know of any work on interpreting neural nets that are being used for non-image tasks?
