Hacker News new | ask | show | jobs
by anirudhjoshi 5356 days ago
"If I had to guess, humans end up implementing something like the reverse of a typical boosting algorithm, in that we take a bunch of too-strong pattern recognizing subunits, and then put them together into something that pits them against each other to become more robust against mis-prediction"

"Ensemble methods" seems to be what you're talking about. ( http://en.wikipedia.org/wiki/Ensemble_learning )

The application of many models put together to produce one signal to accurately predict the future.

I believe the Netflix challenge was won using ensemble methods acting in concert.

"Our final solution (RMSE=0.8712) consists of blending 107 individual results. Since many of these results are close variants, we first describe the main approaches behind them. "

( PDF paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.142... )

QIM, a large hedge fund that works futures, also uses the same model.

"In more direct language, Woodriff uses a statistical technique called the ensemble method, which is a way of mining data to produce something akin to the wisdom of crowds. A bundle of computer models, each searching for patterns in different ways, are linked together to produce a consensus statistical prediction—a sort of prediction by algorithmic committee. Scientists use the method to help predict ozone levels, for example. Woodriff uses it to help predict where futures markets are headed over a 24-hour period. His predictions are derived from four basic bits of historical pricing information: the open, close, high and low of specific markets.

Rishi Narang, whose Telesis Capital is a longtime investor in QIM, says other fund managers use similar methods and techniques. "The core idea is not so magical," Narang says. "It is how he puts it together. Getting the program correct is very challenging."

( http://www.absolutereturn-alpha.com/Article/2361672/QIMs-Jaf... )

1 comments

Boosting, which he mentioned, is an ensemble method so I assume the parent is familiar with them.

Ensemble methods incorporate multiple weak classifiers and work to make them stronger. I think the parent was thinking of the reverse of this, although that idea seems pretty alien to me.

Yes, I'm familiar with ensemble methods, I use them a lot for classification. But those are not really what I'm thinking about (I'm still groping towards concrete ideas here, so forgive me if the following is a bit vague). Perhaps my saying "the reverse of boosting" is not really an accurate way to put this, in retrospect, so let me clarify.

Ensemble methods typically take several distinct (either by method or training) weak learners and combine the predictions to get one strong hybrid by smoothing, averaging, or otherwise combining the results. They are still vulnerable to overtraining, though, and they're not very good at generalizing from small amounts of data because the individual weak learners don't learn from each other or from context.

My theory is that we might be able to get rid of the ensemble and tolerate massive overtraining without detriment if instead of merely combining results, we took a recursive approach and let the classifier use its output as input at another level. My thought is that overtraining on some patterns could be mollified by the ability to recognize error due to overtraining as a pattern at a different depth of recursion.

This obviously would not be generally applicable to weak learners, it would only apply to a particular subset of learners, and that's where my thoughts get a lot muddier and speculative.

My really wild speculation: in the limit, if you set something like this up in the right way, you might be able to come up with an efficient approximation to Solomonoff induction as restricted to the subset of patterns that you're actually exposed to, rather than over the entire set of possible inputs. If I'm correct about that, it would enable staggeringly effective learning within a domain, as long as the domain itself displayed patterns that had some sort of underlying order.

But I don't have any codez to show, or really anything more than a hunch at this point, so don't take me too seriously. :)

Indeed. The closest I can think of to what he is saying is pareto coevolution