by minimaxir 439 days ago
Softmax just normalizes the logit outputs to be positive and sum to 1.0; it has no effect on determinism.
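A rough sketch of the point, in plain Python. The logit values are made up, and it assumes the step that introduces randomness is sampling from the resulting distribution, which the thread does not spell out:

```python
import math
import random

def softmax(logits):
    # Subtract the max for numerical stability; the result is mathematically unchanged.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, -1.0]  # hypothetical values, for illustration only
probs = softmax(logits)

# Softmax is a pure function: the same logits always give the same probabilities.
assert probs == softmax(logits)

# Picking the most likely index (greedy decoding) is deterministic as well.
greedy = max(range(len(probs)), key=probs.__getitem__)

# Randomness only enters if you sample from the distribution (assumed here).
sampled = random.choices(range(len(probs)), weights=probs, k=5)
```

So if outputs vary from run to run, the thing to look at would be the sampling step (or its settings), not the softmax itself.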

Thanks, I guess I confused its role with the adjacent step!