Hacker News new | ask | show | jobs
by pX0r 2918 days ago
A logistic regression model is typically used for a classification task. 'Fitting a logistic model' entails finding optimal coefficients / 'weights' of input features such that classification error is minimised.

For a binary classification task, one could simply calculate mean squared error between predicted values and actual labels (as in linear regression) and then proceed to find the optimal weights iteratively using gradient descent. But the sigmoid shape of the logistic function makes gradient descent a poor choice of an optimization technique (w.r.t. lack of guarantee of finding a global optimum).

A surer way to find globally optimal weights is using the Newton's method of calculating weight updates. This is a numerical optimization technique that requires one to calculate the 1st and 2nd order derivatives of the error function. The matrix that 'calculates' the 1st order derivative is called a Jacobian and the one that calculates the 2nd order derivative is called a Hessian...