by mlthoughts2018
2951 days ago
You make good points, and they make sense particularly as an explanation of why there might not be much effort to unify approaches. But that still doesn't explain Pearl's generally thorny disposition regarding other approaches. Most practitioners and researchers will err on the side of humility, and assume that broad swaths of comparable research are valuable and that many of their ideas have probably been thought of before, in one form or another, even if the researcher's approach is deserving of praise for its innovation or novelty. David Mumford, in his 'dawning of the age of stochasticity' lecture, mentioned the idea of a 'hubris quotient'. For him it was about claiming to adequately summarize thousands of years of math progress, to the point that someone could actually say something novel, in the span of a single career. If you've only been working on it for 30-40 years, and you're claiming to upend something that's been central for hundreds of years, that's a poor hubris quotient, and so maybe you should proceed with a lot of humility and caution. It just never quite feels like Pearl accepts this for causal inference. Maybe he feels like it has not gotten the attention it deserves and that he needs to advocate in a more no-nonsense kind of way, but it still seems like a somewhat bad hubris quotient to start speaking about it as a novel take on something he feels ML has historically not adequately accounted for.
|
I would note that Pearl is not necessarily the first or the only person to note that modern machine learning has problems associated with it, problems often summed up as "correlation is not the same as causation." We can see actual practical problems appear when machine learning systems are deployed in situations where they make definite judgments affecting people's lives based only on factors correlated with a condition. In the extreme, if factors X, Y, and Z are associated with someone acting criminally, are we allowed to arrest the person without a crime being committed? (etc.)
So Pearl has some credibility stepping into this "breach" with his (perhaps self-branded but) more mathematically grounded and statistically sound approach. Of course, the problem is that no statistics really gives a "sound" way to unambiguously predict a future datum only from past data. The Bayesian approach does describe how to make sound predictions when you happen to know prior probabilities, a view that "kicks the can down the road," as others have mentioned.
The thing is, in contrast to math, AI has involved a group of models, theories and ideas which have all broadly moved forward across the decades, with their stars rising and falling but few being utterly discarded. This is because little to nothing can be proven and, moreover, because despite being presented as alternatives, they intersect like fat Venn diagrams if considered only formally (though as specific programs of research, they may be exclusive). Moreover, publicity is one key to a given approach getting more concrete implementations and ultimately getting funding, more researchers and a chance to carry on to the next generation. The relative speed of a neural net on a GPU might well be a key to this sort of model showing promising practical applications. Is this speed inherent, or are other models waiting for optimized implementations? If such an optimized implementation is possible, it would require a specialized programmer and hence funding.
And this means? Well, I'm not sure what it means. Perhaps one could deduce a correct model of machine intelligence if one could determine and correct for the biases which currently drive the process.