Hacker News
by talolard 1726 days ago
I don’t know if it’s correct, but I often think of a classification model as learning the parameters of a Dirichlet distribution, with the output of the final softmax layer being a sample from it.
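A rough way to poke at this intuition, as a sketch rather than a claim about how any real classifier works: both a softmax output and a Dirichlet sample are vectors of non-negative numbers that sum to 1. The logits below are made up, and setting the concentration parameters to `exp(logit)` is purely an assumption for illustration. Under that assumption the mean of the Dirichlet equals the softmax output, which suggests the softmax (being deterministic for a given input) behaves more like the distribution's mean than like a random draw from it.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)

# Hypothetical logits for a 3-class problem (not from any real model).
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)

# Assumption for illustration only: use exp(logit) as the concentration parameters.
alphas = [math.exp(x) for x in logits]
dirichlet_mean = [a / sum(alphas) for a in alphas]

# One random point on the simplex; it varies from draw to draw, unlike probs.
sample = sample_dirichlet(alphas, rng)

print("softmax output :", probs)
print("Dirichlet mean :", dirichlet_mean)
print("Dirichlet draw :", sample)
```

Scaling all the `alphas` by a constant keeps the mean fixed but changes how tightly the draws cluster around it, which is information a single softmax vector does not carry.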