Hacker News
by schoen 437 days ago
It's deliberately made nondeterministic, partly using something called softmax:

https://en.wikipedia.org/wiki/Softmax_function

I'd say mainly in order to avoid boring its users.

1 comment

Softmax just normalizes the logit outputs to be positive and sum to 1.0; it has no effect on determinism.
Thanks, I guess I confused its role with the adjacent step!
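
To make the distinction concrete, here is a minimal sketch in plain Python. It is not how any particular model implements this: the logits are made-up numbers, and `greedy_pick` and `sample_pick` are hypothetical helper names. It only illustrates that softmax returns the same probabilities every time, and that randomness can only enter in the adjacent step, if a token is drawn at random from those probabilities.

```python
import math
import random

def softmax(logits):
    """Map raw logits to positive values that sum to 1. Fully deterministic."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(probs):
    """Deterministic choice: always the index with the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)

def sample_pick(probs, rng):
    """Random choice: draw one index, weighted by the probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Made-up logits, purely for illustration.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# Same input, same output: softmax adds no randomness.
same_every_time = softmax(logits) == probs

# The sampling step is where runs can differ (seeded here only for repeatability).
rng = random.Random(0)
draws = [sample_pick(probs, rng) for _ in range(200)]
```

With `greedy_pick` the output for a given input never changes; with `sample_pick` and an unseeded generator, repeated runs can pick different indices even though `probs` is identical.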