by joe_the_user 2952 days ago
I wish more effort had been made by e.g. Pearl to look into this and unify his approach with what had already been thought of. It turns me off a lot when someone tries to create a "whole new paradigm" and it starts to feel like they want to generate sexy marketing hype about it, rather than to say hey, this is an extension of, connection to, or alternative to this older idea already in the topic of machine learning...

I think you wind up with a situation where none of the less-than-mainstream conceptions of intelligence will have further parts added. Instead, each becomes associated with a single individual's career. It's something in the nature of academia, an arrangement that made sense when scientific models and approaches were "small" enough to be fully encompassed by an individual.

But you have the problem that models aren't naturally modular. Whether model X extends model Y is something of a judgment call. What makes one model like or unlike another is a matter of both the structure of the model and the reasoning behind it.

Moreover, consider that ten programmers creating one computer program tend to be proportionately less productive than one programmer creating a program (i.e., as a rule they work much less than 10x as fast). Ten theorists putting together a single theory may face a similar or greater problem of diminishing returns and coordination.

The development of Quantum Field Theory is a good example where >10 people all collaborated to come up with a framework that integrated the viewpoints of multiple theorists with radically different approaches, rather than every new contributor forking a personalized version of the previous theory.

Consider, for example, the way Freeman Dyson combined the graphical approach of Feynman with Schwinger's more formal methods.

Sure, I hope I was clear that I don't think ten theorists (or ten programmers) collaborating is impossible. I would simply say that collaborating has an extra cost to it, and in a competitive academic world, any cost needs some degree of payoff. This makes extending a mainstream theory advantageous, but extending less-known theories not so much.

And quantum field theory had the advantage that the experiments for demonstrating its truth or falsehood were relatively straightforward. With AI, the question of a theory's truth is more debatable.

You make good points, and they make sense particularly as an explanation of why there might not be much effort to unify approaches.

But it still doesn't explain Pearl's generally thorny disposition regarding other approaches. Most practitioners and researchers will err on the side of humility, and assume that broad swaths of comparable research are valuable and that many of their ideas have probably been thought of before, in one form or another, even if the researcher's approach is deserving of praise for its innovation or novelty.

David Mumford, in his 'dawning of the age of stochasticity' lecture, mentioned the idea of a 'hubris quotient' -- for him it was the idea of claiming to adequately summarize thousands of years of math progress to the point that someone could actually say something novel, in the span of a single career. If you've only been working on it for 30-40 years, and you're claiming to upend something that's been central for hundreds of years, that's a poor hubris quotient, and so maybe you should proceed with a lot of humility and caution.

It just never quite feels like Pearl accepts this for causal inference. Maybe he feels like it has not gotten the attention it deserves and needs to advocate in a more no-nonsense kind of way, but it just seems like somewhat of a bad hubris quotient to start speaking about how it is a novel take on something he feels ML has historically not adequately accounted for.

Well, I'm not qualified to judge Pearl's integrity.

I would note that Pearl is not necessarily the first or the only person to note that modern machine learning has problems associated with it, problems often described by "correlation is not the same as causation." We can see actual practical problems appear when machine learning systems are deployed in situations where they make definite judgments affecting people's lives based only on factors correlated with a condition. In the extreme, if factors X, Y, and Z are associated with someone acting criminally, are we allowed to arrest the person without a crime being committed? (etc).
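
To make the "correlation is not the same as causation" point concrete, here is a minimal sketch of a confounded dataset. Everything in it is an assumption made up for illustration (the names `simulate` and `pearson`, the Gaussian noise, the hidden factor `z`); it is not taken from Pearl's work or from any deployed system. A hidden factor drives both `x` and `y`, so the two are correlated in observed data even though `x` has no effect on `y`. When `x` is instead set independently of `z` (a crude stand-in for an intervention), the correlation disappears.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, seed=0):
    """Return (observational, interventional) correlation between x and y.

    Hypothetical toy model: z is a hidden common cause of x and y,
    and x never enters the equation for y.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    # Observed world: x and y both depend on the hidden z.
    x_observed = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]
    # "Intervened" world: x is assigned at random, ignoring z.
    x_intervened = [rng.gauss(0, 1) for _ in range(n)]
    return pearson(x_observed, y), pearson(x_intervened, y)


if __name__ == "__main__":
    observational, interventional = simulate()
    print(f"correlation in observed data:   {observational:.3f}")
    print(f"correlation under intervention: {interventional:.3f}")
```

In this toy model the observed correlation comes out around 0.5 and the interventional one near zero. A predictor trained only on the observed data would happily use `x` to guess `y`, which is the kind of judgment the arrest example above is worried about.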

So Pearl has some credibility stepping into this "breach" with his (perhaps self-branded but) more mathematically grounded and statistically sound approach. Of course, the problem is that no statistics really gives a "sound" way to unambiguously predict a future datum only from past data. The Bayesian approach does describe how to make sound predictions when you happen to know prior probabilities, a view that "kicks the can down the road" as others have mentioned.
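
A small sketch of what "kicks the can down the road" means in practice, using nothing but Bayes' rule. The function name `posterior` and all the numbers are hypothetical, chosen only to show that identical evidence gives very different conclusions depending on the prior you start from.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule for a binary hypothesis H after seeing one piece of evidence."""
    numerator = prior * p_evidence_given_h
    evidence = numerator + (1.0 - prior) * p_evidence_given_not_h
    return numerator / evidence


if __name__ == "__main__":
    # Same evidence in both cases: 90% likely if H is true, 10% likely if H is false.
    for prior in (0.5, 0.01):
        print(f"prior={prior:.2f} -> posterior={posterior(prior, 0.9, 0.1):.3f}")
```

With a prior of 0.5 the posterior is 0.9; with a prior of 0.01 it is roughly 0.083. The update rule is sound either way, but it has nothing to say about which prior was the right one.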

The thing is, in contrast to math, AI has involved a group of models, theories and ideas which have all broadly moved forward across the decades, with their stars rising and falling but few being utterly discarded. This is because little to nothing can be proven and, moreover, because despite being presented as alternatives, they intersect like fat Venn diagrams if considered only formally (though as specific programs of research, they may be exclusive). Moreover, publicity is one key to a given approach getting more concrete implementations and ultimately getting funding, more researchers and a chance to go on to the next generation. The relative speed of a neural net on a GPU might well be a key to this sort of model showing promising practical applications. Is this speed inherent, or are other models waiting for optimized implementations? If such an optimized implementation is possible, it would require a specialized programmer and hence funding.

And this means? Well, I'm not sure what it means. Perhaps one could deduce a correct model of machine intelligence if one could determine and correct for the biases which currently drive the process.

This comment doesn't really make much sense to me, especially since none of Pearl's techniques have been convincingly demonstrated to work in real situations. It's one thing to take pot shots at practical engineering problems and point out flaws and places for improvement, but it's quite different to claim that a new framework would solve them when (a) elements of that framework have already existed for a while and practitioners knew about them, and (b) the framework hasn't been shown to give state-of-the-art performance or to actually solve cases where algorithmic decision making made improper judgments.

Do you have examples to dispute this... actual examples where a model based on causal inference was used for large-scale deployed machine learning problems and demonstrably fixed some type of judgment error that had previously been leading to bad outcomes for people?