by calf 821 days ago
Or, we don't really know what deep nets are doing. If I give you an LLM or AlphaGo, you cannot look at it and tell me what it does. It's a bunch of parameters and edge weights. The counterargument is something like: deep nets are overparameterized, and the gradient descent process does not reflect the final result. You would think that choosing from the vast space of correct/incorrect 3D models would be impossible, but in practice some have found emergent structural properties (board positions, formal grammar fragments, etc.), enough to at least suggest that we don't understand how they work, and that it is a conflation/reductive error to call deep nets the same kernel or statistical machines as before.

The above isn't my own argument, as I'm not an expert. But theoreticians have been looking at this, and the ones posing this counterargument come from outside the ML community/Google/OpenAI, so you can't dismiss it as the wild delusions of ML researchers either. The lectures I watched were by an IAS professor in theoretical computer science, not by ML people. Another professor whose lecture I started watching has a background in signal theory and probability/statistics; if even he says "we don't know what's going on with deep learning", I tend to give that some credence and update my own uncertainty.

Now, I get your argument: you are repeating what Chomsky has said regarding explainability, the evolution of human cognition and the "truth of the world", and statistical machines being fed a corpus of human-understandable information, be it Internet text or Go game moves. Chomsky's criticism of ML-based "AI" covers all of this, and I don't see your argument as introducing anything different from his (feel free to correct me if I've misread your remarks). I actually started on his side myself; now I'm a little on the fence and can see both sides more clearly.