by wmwmwm
1482 days ago
Having just implemented a softmax() function for an online ML course, I think the Python implementation here overflows if any element of z gets big(ish): e^10000, for example, is far larger than a double-precision float can hold. A spot of searching online suggests that subtracting max(z) from every entry of z makes it a lot more robust without changing the result, because the common factor e^(-max(z)) cancels between the numerator and the denominator, e.g. https://www.tutorialexample.com/implement-softmax-function-w...
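To make that concrete, here is a minimal pure-Python sketch of the two variants. The names softmax_naive and softmax_stable are my own for illustration; this is not the article's code or the linked tutorial's, just the max-subtraction trick as I understand it.

```python
import math

def softmax_naive(z):
    # Direct definition: exp(z_i) / sum_j exp(z_j).
    # math.exp raises OverflowError once an entry exceeds roughly 709.
    exps = [math.exp(v) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(z):
    # Shift every entry by max(z) so the largest exponent is exactly 0.
    # The factor exp(-max(z)) cancels in the ratio, so the result is
    # mathematically the same, but exp() never sees a large positive input.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]
```

With z = [10000.0, 10000.0] the naive version raises OverflowError, while the shifted version computes exp(0) for both entries and returns equal probabilities. For small inputs such as [1.0, 2.0, 3.0] the two agree up to floating-point rounding.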