Hacker News
by
ancientworldnow
695 days ago
This was trained to be run at FP8 with no quality loss.
1 comment
hislaziness
695 days ago
The model description on Hugging Face says: Model size 12.2B params, Tensor type BF16. Is the tensor type different from the training param size?
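For scale, here is a rough sketch of the storage arithmetic behind the question. It assumes the listed tensor type is just the format the checkpoint weights are stored in, and uses the standard widths of 2 bytes per BF16 value and 1 byte per FP8 value. The parameter count is the one quoted above; the function name is made up for illustration, and the figures ignore activations and runtime overhead.

```python
# Rough weight-storage arithmetic; parameter count taken from the comment above.
PARAMS = 12.2e9  # 12.2B params, as quoted from the model description

# Bytes per stored value (BF16 is a 16-bit format, FP8 is an 8-bit format).
BYTES_PER_VALUE = {"BF16": 2, "FP8": 1}


def weight_gigabytes(num_params: float, dtype: str) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return num_params * BYTES_PER_VALUE[dtype] / 1e9


for dtype in BYTES_PER_VALUE:
    print(f"{dtype}: {weight_gigabytes(PARAMS, dtype):.1f} GB")
```

Under these assumptions the same 12.2B parameters take about twice the space in BF16 as in FP8, so a BF16 checkpoint and a model meant to run at FP8 are not contradictory: one describes how the weights are stored, the other the precision used at run time.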