by CodesInChaos
2831 days ago
You need to perform some kind of normalization, since a probability must lie between 0 and 1 (and being wrong on a confident prediction incurs a huge penalty under the popular maximum likelihood loss functions). But you can use component-wise normalization (sigmoid) instead of combined normalization (softmax). These correspond to the assumption that the classes are independent (component-wise sigmoid) or mutually exclusive (softmax).
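To make the distinction concrete, here is a minimal sketch in plain Python. The logits are made-up numbers and the function names are just illustrative, not taken from any particular library. Sigmoid squashes each score on its own, so the outputs need not sum to 1; softmax normalizes them jointly, so they always do.

```python
import math

def sigmoid(logits):
    # Component-wise: each score is mapped to (0, 1) independently,
    # so several classes can be "on" at once and the sum is unconstrained.
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def softmax(logits):
    # Combined: subtract the max for numerical stability, then divide by
    # the total so the outputs form a distribution over exclusive classes.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, -1.0]  # hypothetical scores for three classes

independent = sigmoid(logits)
exclusive = softmax(logits)

print("sigmoid:", independent, "sum =", sum(independent))
print("softmax:", exclusive, "sum =", sum(exclusive))
```

With these scores the sigmoid outputs add up to more than 1, which only makes sense if an input can belong to several classes at once.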
|
"and being wrong on a confident prediction gives huge penalties using the popular maximum likelihood loss functions" - It should.
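For a sense of scale, here is a small sketch of that penalty, assuming the loss in question is the usual negative log-likelihood (log loss). The probabilities are invented for illustration, and `log_loss` is a hypothetical helper name.

```python
import math

def log_loss(p_true):
    # Negative log-likelihood of the class that actually occurred.
    # It grows without bound as the predicted probability approaches 0.
    return -math.log(p_true)

# Probability the model assigned to the correct class (made-up values).
for p in (0.9, 0.5, 0.1, 0.001):
    print(f"p = {p:<6} loss = {log_loss(p):.3f}")
```

The loss for a confident miss is many times larger than for a hedged one, which is the behavior the quoted sentence describes.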