by skzv
554 days ago
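To bring things full circle: the cross-entropy loss is the KL divergence plus the entropy of the true distribution, i.e. H(p, q) = H(p) + KL(p || q). The entropy term doesn't depend on your model, so when you're minimizing cross-entropy loss, you're minimizing the "divergence" between the true distribution and your model distribution. With one-hot labels H(p) = 0, so the two are exactly equal. This intuition really helped me understand CE loss.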
https://stats.stackexchange.com/questions/357963/what-is-the...
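
If it helps, here is a quick numerical check of the cross-entropy / KL relationship described above. It's only a sketch: the distributions `p` and `q` are made-up numbers, the helper names are my own, and everything uses natural logs (so the values are in nats).

```python
import math

def entropy(p):
    # H(p) = -sum_i p_i * log(p_i), skipping zero-probability outcomes
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    # H(p, q) = -sum_i p_i * log(q_i)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    # KL(p || q) = sum_i p_i * log(p_i / q_i)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical "true" and model distributions, chosen only for illustration.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

# Cross-entropy should equal entropy plus KL, up to floating-point error.
gap = cross_entropy(p, q) - (entropy(p) + kl_divergence(p, q))

# One-hot label: the entropy is zero, so cross-entropy and KL coincide.
one_hot = [0.0, 1.0, 0.0]
ce_one_hot = cross_entropy(one_hot, q)
kl_one_hot = kl_divergence(one_hot, q)
```

Since `entropy(p)` is fixed by the data, changing `q` can only move the KL term, which is why the two objectives share the same minimizer.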