by tvural
3495 days ago
The best explanation is probably that minimizing squared error gives you the maximum-likelihood fit when you assume your errors are normally distributed. Things like the fact that squared error is differentiable are actually irrelevant - if the best model is not differentiable, you should still use it.
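To make the normal-errors point concrete, here is a minimal sketch (not from the original comment). It assumes the simplest possible model, a single constant `mu`, and a fixed noise scale `sigma`; the data and all function names are made up for illustration. Under those assumptions the Gaussian negative log-likelihood is just the squared error divided by `2 * sigma**2` plus a constant, so both criteria should pick the same `mu`.

```python
import math

def sum_squared_error(mu, ys):
    # Squared-error loss for a constant prediction mu.
    return sum((y - mu) ** 2 for y in ys)

def gaussian_nll(mu, ys, sigma=1.0):
    # Negative log-likelihood of ys under N(mu, sigma^2), sigma held fixed.
    # It equals a constant plus SSE / (2 * sigma^2).
    n = len(ys)
    const = n * math.log(sigma * math.sqrt(2 * math.pi))
    return const + sum_squared_error(mu, ys) / (2 * sigma ** 2)

# Hypothetical data, chosen only for illustration.
ys = [1.0, 2.0, 4.0, 7.0]

# Brute-force search over a grid of candidate values for mu.
candidates = [i / 100 for i in range(0, 801)]
best_sse = min(candidates, key=lambda mu: sum_squared_error(mu, ys))
best_nll = min(candidates, key=lambda mu: gaussian_nll(mu, ys, sigma=2.0))

print(best_sse, best_nll)
```

Both searches land on the sample mean of the data. The choice of `sigma` changes the likelihood value but not where the minimum sits, which is why the equivalence holds for any fixed noise scale.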
|
I'm not sure I would say that - neural nets are "near everywhere differentiable", for example. Without differentiability we're stuck with, for example, discrete GAs for optimization, and you can throw all your intuition out the window (not to mention training/learning efficiency).
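A rough sketch of what this looks like in practice, assuming a one-parameter "network" `relu(w * x)` trained with squared error. The data, learning rate, and names are hypothetical, and taking the derivative at the kink `z == 0` to be 0 is a common convention rather than anything forced by the math.

```python
def relu(z):
    return z if z > 0 else 0.0

def relu_grad(z):
    # ReLU is differentiable everywhere except z == 0;
    # by convention we use 0 as the derivative at the kink.
    return 1.0 if z > 0 else 0.0

# Hypothetical data generated by y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.5    # initial weight
lr = 0.05  # learning rate

for _ in range(200):
    grad = 0.0
    for x, y in data:
        z = w * x
        # Chain rule for d/dw of (relu(w * x) - y)^2.
        grad += 2 * (relu(z) - y) * relu_grad(z) * x
    w -= lr * grad / len(data)

print(w)
```

Plain gradient descent still works here because the kink is a single point that the iterates never need to land on exactly. A model with no usable gradient anywhere would rule this approach out.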