by mjw
3682 days ago
Their answer is pretty much 'because it's based on the log-odds', which to me is still only very mild motivation. There are other non-linearities which people use to map onto (0, 1), for example probit regression uses the normal CDF [0]. In fact you can use the CDF of any continuous distribution supported on the whole real line, and the sigmoid is an example of this: it's the CDF of a standard logistic distribution [1]. There's a nice interpretation for this using an extra latent variable: for probit regression, you take your linear predictor, add a standard normal noise term, and the response is determined by the sign of the result. For logistic regression it's the same thing, except the noise is standard logistic instead. This then extends nicely to ordinal regression too.
[0] https://en.wikipedia.org/wiki/Probit_model
[1] https://en.wikipedia.org/wiki/Logistic_distribution
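If it helps, here is a rough simulation of the latent-variable view described above, using only the Python standard library. It is a sketch, not a reference implementation: the function names, the predictor value `eta = 0.7`, and the sample size are all made up for illustration. It relies on both noise distributions being symmetric about zero, so that P(eta + noise > 0) equals the CDF evaluated at eta.

```python
import math
import random


def sigmoid(x):
    """CDF of the standard logistic distribution (the logit link)."""
    return 1.0 / (1.0 + math.exp(-x))


def normal_cdf(x):
    """CDF of the standard normal distribution (the probit link)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sample_logistic(rng):
    """Draw standard logistic noise by inverting its CDF."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0; avoid log(0)
        u = rng.random()
    return math.log(u / (1.0 - u))


def sample_normal(rng):
    """Draw standard normal noise."""
    return rng.gauss(0.0, 1.0)


def latent_response_rate(eta, sample_noise, rng, n=200_000):
    """Fraction of draws in which (eta + noise) is positive."""
    positives = sum(eta + sample_noise(rng) > 0 for _ in range(n))
    return positives / n


rng = random.Random(0)
eta = 0.7  # hypothetical value of the linear predictor

logit_rate = latent_response_rate(eta, sample_logistic, rng)
probit_rate = latent_response_rate(eta, sample_normal, rng)

# Each simulated rate should land close to the matching CDF value.
print("logistic noise:", logit_rate, "vs sigmoid(eta):", sigmoid(eta))
print("normal noise:  ", probit_rate, "vs normal_cdf(eta):", normal_cdf(eta))
```

With enough draws the simulated frequencies should sit close to the two CDF values, which is all the "sign of predictor plus noise" story is claiming.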
It's certainly not the only option though, and not always the best fit.