by dlewis1788
1086 days ago
My understanding is that for certain types of networks BF16 will train better than FP16, because its extended exponent range gives additional protection against exploding gradients and loss values overflowing, at the cost of reduced precision.
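
To make the range-versus-precision tradeoff concrete, here is a rough sketch in plain Python that rounds a value to each format. The helper names `to_fp16` and `to_bf16` are made up for illustration, and the BF16 path is an emulation (round a float32 to its top 16 bits), not what any particular framework or accelerator does internally. NaN inputs are not handled.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE half precision (5 exponent bits, 10 mantissa bits)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the FP16 range;
        # treat that as overflow to infinity, as hardware would.
        return float("inf") if x > 0 else float("-inf")


def to_bf16(x: float) -> float:
    """Emulate bfloat16 (8 exponent bits, 7 mantissa bits).

    Assumption: bfloat16 is modelled as the upper 16 bits of a float32,
    with round-to-nearest-even applied to the discarded lower 16 bits.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (70000.0, 1.001):
        print(value, "fp16:", to_fp16(value), "bf16:", to_bf16(value))
```

With these helpers, 70000.0 overflows to `inf` in FP16 but stays finite in BF16 (it lands on 70144.0), while 1.001 keeps some of its fraction in FP16 (1.0009765625) and collapses to exactly 1.0 in BF16. That is the tradeoff in miniature: BF16 survives large magnitudes, FP16 resolves small differences.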