by kelseyfrog
784 days ago
At some point the logits at a branching point in the response need to correspond to the respective probabilities of the requested output classes, so that they can be appropriately sampled and strongly condition the remainder of the response. My instinct says this cannot be accomplished irrespective of temperature, but I could be persuaded with math.
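Expectation: 80% left, 20% right. Each value below is sum(p * log(q / p)), the negative KL divergence from the expected distribution p to the model's q; closer to 0 is a better match.
Model sampling probability: 99% left, 1% right
>>> import math
>>> 0.80 * math.log(0.99 / 0.80) + 0.20 * math.log(0.01 / 0.20)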
-0.42867188234223175
Model sampling probability: 90% left, 10% right
>>> 0.80 * math.log(0.9 / 0.80) + 0.20 * math.log(0.1 / 0.20)
-0.04440300758688229
Of course, if you change the temperature at sampling time, this will break any probabilistic expectations learned from training in this manner.
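
To make the temperature point concrete, here is a rough sketch, assuming the branch is a two-way choice whose logits are exactly calibrated to 80/20 at temperature 1.0. The helper names (softmax_with_temperature, expected_log_ratio) are made up for illustration and do not come from any particular library.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def expected_log_ratio(target, model):
    """sum(p * log(q / p)): the negative KL divergence from target p to model q."""
    return sum(p * math.log(q / p) for p, q in zip(target, model))

# Assumption: the model is perfectly calibrated at temperature 1.0.
target = [0.80, 0.20]
logits = [math.log(p) for p in target]

for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    score = expected_log_ratio(target, probs)
    print(temperature, [round(p, 4) for p in probs], round(score, 4))
```

Under those assumptions, temperature 1.0 reproduces 80/20, temperature 0.5 sharpens the split to about 94/6 (16/17 vs 1/17), and temperature 2.0 flattens it to about 67/33 (2/3 vs 1/3). The score is 0 only at the temperature the logits were calibrated for.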