Hacker News
by
hislaziness
694 days ago
Isn't it 2 bytes (FP16) per param? So 7B params = 14 GB, plus some extra for inference?
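The arithmetic in that question can be sketched as parameter count times bytes per parameter. This is a minimal sketch, not an exact sizing tool: the function name `weight_memory_gb` and the lookup table are made up for illustration, and it counts only weight storage in decimal gigabytes, leaving out the "plus some" for activations and other inference overhead.

```python
# Rough weight-memory estimate: parameter count times bytes per parameter.
# Assumption: only weights are counted; inference overhead is not modeled.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Compare a 7B-parameter model at 2 bytes and at 1 byte per parameter.
    for dtype in ("fp16", "fp8"):
        print(f"7B params at {dtype}: {weight_memory_gb(7e9, dtype):.1f} GB")
```

Halving the bytes per parameter halves the weight footprint, which is why the storage format matters so much for whether a model fits on a given card.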
ancientworldnow
694 days ago
This was trained to be run at FP8 with no quality loss.
hislaziness
694 days ago
The model description on Hugging Face says: Model size 12.2B params, Tensor type BF16. Is the tensor type different from the training param size?
fzzzy
694 days ago
It's very common to run local models in 8-bit int.
qwertox
694 days ago
Yes, but it's not common for the original model to be 8-bit int. The community can quantize any model down to 8-bit int, but that always comes with some quality loss.
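To make the rounding involved concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is an assumption-laden toy: the helper names and the weight values are made up, real quantization schemes are more elaborate, and it only shows the round-trip error on the weights themselves, not how much that error affects model output quality.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q, scale):
    """Map the stored integers back to approximate float values."""
    return [x * scale for x in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.0123, -0.4567, 0.8910, -1.2345, 0.0001]

q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized:", q)
print("scale:", scale)
print("largest round-trip error:", max_error)
```

Each weight can move by up to half a quantization step (`scale / 2`), and very small weights may collapse to zero, which is the information that post-hoc quantization discards.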