Logistic Regression by Discretizing Continuous Variables via Gradient Boosting

	Logistic Regression by Discretizing Continuous Variables via Gradient Boosting (cdn.rawgit.com)
	55 points by vincent_123 3166 days ago

3 comments

bitL 3166 days ago

A question about the opposite problem: is there a way to do this (and deep learning as a whole) on discrete domains? Everything I've seen so far assumes continuous functions so that back-propagation works; I haven't seen anyone use discrete calculus, whose rules parallel the continuous ones (see Graham/Knuth/Patashnik). That could open up many more interesting applications...

Eridrus 3165 days ago

Dealing with discrete variables is trivial: you can just map them into a continuous space and proceed as normal.
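
To make the "map them into a continuous space" idea concrete, here is a minimal sketch of one common way to do it: one-hot encoding followed by an embedding lookup. The category names, the 2-dimensional embedding size, and the random initial weights are all assumptions for illustration; in a real model the embedding rows would be trained.

```python
import random

# Hypothetical categorical feature with three levels (illustrative only).
categories = ["red", "green", "blue"]
index = {c: i for i, c in enumerate(categories)}

def one_hot(value):
    """Map a category to a vector with a single 1.0 in its slot."""
    vec = [0.0] * len(categories)
    vec[index[value]] = 1.0
    return vec

# An embedding table: one dense row per category.
# Random values stand in for weights a model would learn.
rng = random.Random(0)
embedding = [[rng.uniform(-1, 1) for _ in range(2)] for _ in categories]

def embed(value):
    """Look up the dense (continuous) vector for a category."""
    return embedding[index[value]]

def embed_via_one_hot(value):
    """The same lookup written as one_hot(value) times the embedding matrix."""
    oh = one_hot(value)
    return [sum(oh[i] * embedding[i][j] for i in range(len(categories)))
            for j in range(2)]
```

The two lookup functions return the same vector, which is why a discrete input poses no problem for gradient-based training: the gradient flows into the embedding row that was selected.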

Trying to learn discrete rules is harder because the learning procedure uses gradients to adjust parameters, and the gradients will be zero in a lot more places with discrete "rules".
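
A small sketch of why the gradients vanish, under assumed toy values: a hard threshold rule is compared with a sigmoid surrogate, and the derivative of each output with respect to the threshold parameter is estimated by finite differences. The function names, the threshold of 0.5, and the sharpness of 10 are illustrative choices, not anything from a specific library.

```python
import math

def hard_rule(x, threshold):
    """Discrete rule: output jumps from 0 to 1 when x exceeds the threshold."""
    return 1.0 if x > threshold else 0.0

def soft_rule(x, threshold, sharpness=10.0):
    """Smooth surrogate: a sigmoid centred on the same threshold."""
    return 1.0 / (1.0 + math.exp(-sharpness * (x - threshold)))

def grad_wrt_threshold(rule, x, threshold, eps=1e-6):
    """Central finite-difference estimate of d(output)/d(threshold)."""
    return (rule(x, threshold + eps) - rule(x, threshold - eps)) / (2 * eps)

inputs = [0.1, 0.3, 0.45, 0.7, 0.9]   # assumed sample inputs
threshold = 0.5                       # assumed parameter value

hard_grads = [grad_wrt_threshold(hard_rule, x, threshold) for x in inputs]
soft_grads = [grad_wrt_threshold(soft_rule, x, threshold) for x in inputs]
```

For these inputs the hard rule gives a zero gradient at every point, so a gradient step would leave the threshold unchanged, while the sigmoid gives a non-zero signal everywhere.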

Gradient Boosted Trees are probably the main thing that comes to mind, but they're not really deep learning.

People have tried to learn hard vs soft attention mechanisms, and while hard attention is faster, it results in worse accuracy and is harder to train.

The inference I draw is that most of the things we want to learn are not described well by discrete rules.

mcintyre1994 3166 days ago

Can you use the typical approaches to classification? You can define a continuous error function and perform back-propagation using that. If you look at something like Kaggle then deep learning approaches tend to dominate classification challenges just as much as they do regression ones.

bitL 3166 days ago

That's the usual approach, and it has its limits. I am specifically curious about discrete domains. Think of it like mixed integer programming: yes, you can estimate a solution using linear programming, but that estimate is usually useless. A method designed specifically for mixed integer programming usually yields far better solutions.
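
A toy illustration of the gap, as a sketch only: it uses a tiny 0/1 knapsack (a pure integer problem, chosen for brevity rather than a true mixed integer one) with made-up values and weights. The LP relaxation of a knapsack can be solved greedily by value-to-weight ratio; rounding its fractional answer down is compared against the brute-forced integer optimum.

```python
from itertools import product

# Hypothetical 0/1 knapsack instance: (value, weight) per item.
items = [(11, 6), (9, 5), (9, 5)]
capacity = 10

def lp_relaxation(items, capacity):
    """Solve the LP relaxation (fractions allowed) greedily by value/weight."""
    order = sorted(range(len(items)),
                   key=lambda i: items[i][0] / items[i][1], reverse=True)
    x = [0.0] * len(items)
    remaining = float(capacity)
    for i in order:
        _, weight = items[i]
        take = min(1.0, remaining / weight)
        x[i] = take
        remaining -= take * weight
        if remaining <= 0:
            break
    return x

def total_value(x, items):
    return sum(xi * value for xi, (value, _) in zip(x, items))

def exact_integer(items, capacity):
    """Brute-force the true 0/1 optimum (fine for a handful of items)."""
    best = None
    for x in product([0, 1], repeat=len(items)):
        weight = sum(xi * w for xi, (_, w) in zip(x, items))
        if weight <= capacity:
            if best is None or total_value(x, items) > total_value(best, items):
                best = x
    return best

relaxed = lp_relaxation(items, capacity)
rounded = [int(xi) for xi in relaxed]          # round fractions down
lp_value = total_value(relaxed, items)
rounded_value = total_value(rounded, items)
exact_value = total_value(exact_integer(items, capacity), items)
```

On this instance the relaxation takes the first item plus a fraction of the second; rounding down keeps only the first item (value 11), while the true integer optimum picks the other two (value 18).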

digitalzombie 3165 days ago

> discrete domains

Noob here, but aren't Bayesian network DAGs just a specialized kind of neural network? If so, you can use a Dirichlet distribution for the Bayesian network, and that's discrete... unless I'm misunderstanding.

phunge 3166 days ago

In cases where you need to interpret the resulting model, I've been advised not to bin (for example: http://biostat.mc.vanderbilt.edu/wiki/Main/CatContinuous). Other alternatives are splines or generalized additive models.

vincent_123 3166 days ago

Thank you for the comment and the link. I agree with most of the points listed there. GAM is a great tool when there is a non-linear, non-monotonic relationship between the response and the independent variables. GAMs have good interpretability, but they are still somewhat difficult to explain in some business environments. For example, in credit scoring, logistic regression with binning is still widely applied.
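
For readers unfamiliar with the binned approach, here is a minimal sketch of logistic regression on a single binned variable. The scores, outcomes, and cut points are invented for illustration, and the fit uses plain gradient ascent with one coefficient per bin rather than any particular library. With one indicator per bin and no other terms, the fitted coefficient for each bin converges to that bin's empirical log-odds, which is a large part of why the model is easy to explain.

```python
import math
from bisect import bisect_right

# Hypothetical data: a continuous score and a binary outcome (1 = default).
scores   = [310, 350, 420, 480, 510, 560, 590, 640, 690, 720, 760, 810]
defaults = [1,   1,   1,   0,   1,   0,   1,   0,   0,   0,   1,   0]
cut_points = [500, 650]   # assumed bin edges: <500, 500-649, >=650

def bin_index(x):
    """Return the index of the bin that x falls into."""
    return bisect_right(cut_points, x)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_binned_logistic(xs, ys, n_bins, lr=0.5, steps=500):
    """Gradient ascent on the log-likelihood, one coefficient per bin."""
    coef = [0.0] * n_bins
    for _ in range(steps):
        grad = [0.0] * n_bins
        for x, y in zip(xs, ys):
            k = bin_index(x)
            grad[k] += y - sigmoid(coef[k])
        coef = [c + lr * g for c, g in zip(coef, grad)]
    return coef

def empirical_log_odds(xs, ys, n_bins):
    """Log-odds of the outcome computed directly from counts in each bin."""
    events = [0] * n_bins
    totals = [0] * n_bins
    for x, y in zip(xs, ys):
        k = bin_index(x)
        events[k] += y
        totals[k] += 1
    return [math.log(e / (t - e)) for e, t in zip(events, totals)]

n_bins = len(cut_points) + 1
coef = fit_binned_logistic(scores, defaults, n_bins)
log_odds = empirical_log_odds(scores, defaults, n_bins)
```

The direct log-odds calculation would fail for a bin with no events or no non-events; real scorecards usually merge such bins, which this sketch does not attempt.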

closed 3166 days ago

In my experience, most of the time people use binning, it's straightforward to demonstrate that their binning+model is equivalent to a restricted form of a more general model (e.g. a generalized additive or structural equation model). Sometimes binning is useful because it makes the model much easier to estimate.

However, people's rationale for binning is often that it makes the model better / more interpretable, without actually testing the more restricted binned model against the more general one. There's certainly something to be said for knowing your audience when choosing a model, though :).

hyperbovine 3166 days ago

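Logistic regression is in fact obtained by discretizing a continuous variable with logistically distributed errors. This is the "threshold model": the binary outcome records whether a latent continuous variable crosses a threshold. If you assume normally distributed errors instead of logistic ones, you get the probit.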
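
The threshold model can be checked with a quick simulation. This is a sketch under assumed values: the linear predictor of 0.8, the sample size, and the seed are arbitrary. It draws a latent variable equal to the linear predictor plus standard logistic noise, records how often it exceeds zero, and compares that frequency with the logistic (sigmoid) function of the linear predictor.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample_logistic(rng):
    """Draw from the standard logistic distribution via its inverse CDF."""
    u = rng.random()
    while u == 0.0:            # random() lies in [0, 1); avoid log(0)
        u = rng.random()
    return math.log(u / (1.0 - u))

def simulate_threshold(linear_predictor, n, seed=0):
    """Fraction of draws where the latent y* = eta + eps exceeds 0."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n)
               if linear_predictor + sample_logistic(rng) > 0)
    return hits / n

eta = 0.8                                   # assumed linear predictor value
empirical = simulate_threshold(eta, 200_000)
predicted = sigmoid(eta)
```

The two numbers agree up to sampling noise because P(eta + eps > 0) = P(eps > -eta), which for a symmetric logistic error is exactly the logistic CDF evaluated at eta. Swapping the noise for a standard normal draw would reproduce the probit link in the same way.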
