by jostmey
3495 days ago
Why not KL divergence, which measures the error between a target distribution and the current distribution? From the perspective of information theory, it is the best error measurement. Oh, and let's not forget that for a lot of problems, minimizing the KL divergence is exactly the same operation as maximizing the likelihood function.
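
To spell out the likelihood equivalence, here is a short derivation sketch. The notation is an assumption, not from the comment: $\hat{p}$ is the empirical distribution of $N$ data points $x_1, \dots, x_N$ (playing the role of the target), and $q_\theta$ is the model distribution with parameters $\theta$.

```latex
\begin{align*}
D_{\mathrm{KL}}(\hat{p} \,\|\, q_\theta)
  &= \sum_x \hat{p}(x) \log \frac{\hat{p}(x)}{q_\theta(x)} \\
  &= \underbrace{\sum_x \hat{p}(x) \log \hat{p}(x)}_{\text{independent of } \theta}
     \;-\; \frac{1}{N} \sum_{i=1}^{N} \log q_\theta(x_i)
\end{align*}
\[
\arg\min_\theta D_{\mathrm{KL}}(\hat{p} \,\|\, q_\theta)
  = \arg\max_\theta \sum_{i=1}^{N} \log q_\theta(x_i)
\]
```

The first term does not depend on $\theta$, so under these assumptions minimizing the divergence and maximizing the log-likelihood pick out the same parameters.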
It is also extremely poorly behaved, both numerically and in convergence.
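
One way the numerical trouble can show up (this is a guess at what is meant, not something the comment spells out) is zeros in either distribution. The sketch below uses hypothetical helper names `kl_naive` and `kl_guarded`, and the `eps` floor is an arbitrary choice for illustration.

```python
import numpy as np

def kl_naive(p, q):
    # Direct transcription of sum(p * log(p / q)) with no guarding of zeros.
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Silence the divide-by-zero / invalid warnings so the bad value is returned quietly.
    with np.errstate(divide="ignore", invalid="ignore"):
        return float(np.sum(p * np.log(p / q)))

def kl_guarded(p, q, eps=1e-12):
    # Assumption: terms with p == 0 contribute 0, and q is floored at eps.
    # eps is an arbitrary illustrative value, not a recommended setting.
    p = np.asarray(p, dtype=float)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.5, 0.0]
q = [0.5, 0.25, 0.25]
q_zero = [1.0, 0.0, 0.0]

naive_value = kl_naive(p, q)             # 0 * log(0) turns the whole sum into nan
guarded_value = kl_guarded(p, q)         # finite: 0.5 * log(2)
blowup_value = kl_guarded(p, q_zero)     # finite only because of the eps floor
```

Even the guarded version returns a very large value when the current distribution puts (almost) no mass where the target does, and the result then depends on the choice of `eps`, which is one way to read the "poorly behaved" complaint.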