by kettleballroll
492 days ago
I thought the temperature only affects randomness at the end of the network (when turning embeddings back into words using the softmax). It cannot influence routing, which is inherently influenced by which examples get batched together (i.e., it might depend on other users of the system).
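
For what it's worth, here is a minimal sketch of what I mean by temperature acting only at the output softmax. It is plain Python with made-up logits, and the function names are hypothetical, not taken from any particular model or library. It only illustrates standard temperature sampling and says nothing about how routing is actually implemented.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities after dividing by a temperature > 0."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Made-up scores for three candidate tokens (assumption for illustration).
logits = [2.0, 1.0, 0.1]

sharp = softmax_with_temperature(logits, 0.5)  # low temperature: peaked
flat = softmax_with_temperature(logits, 2.0)   # high temperature: flatter

token = sample_token(logits, 1.0, random.Random(0))
```

Lower temperature concentrates probability on the top-scoring token and higher temperature spreads it out, but in this sketch the logits themselves are untouched: temperature only changes how the final distribution is sampled.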