by
theanonymousone
7 days ago
But the huggingface link mentions BF16, F16, and I32?
2 comments
kouteiheika
7 days ago
Not every weight is quantized. For example, weights that don't take up much space or are highly important are left in higher precision. State-of-the-art weight quantization is never done uniformly (i.e. applied to all weights and in the same way).
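To make the idea of non-uniform quantization concrete, here is a minimal sketch of a per-tensor precision rule. Everything in it is a hypothetical illustration: the function name `choose_dtype`, the size threshold, and the name patterns are assumptions, not the policy used for any particular checkpoint.

```python
def choose_dtype(name, num_params,
                 small_threshold=1_000_000,
                 keep_high=("embed", "norm", "lm_head")):
    """Pick a storage precision for one tensor.

    Hypothetical rule of thumb: small tensors and tensors assumed to be
    sensitive stay in bf16, everything else is quantized to int4.
    """
    # Small tensors save little space when quantized, so keep them as-is.
    if num_params < small_threshold:
        return "bf16"
    # Tensors whose names match an (assumed) sensitive pattern stay in bf16.
    if any(pattern in name for pattern in keep_high):
        return "bf16"
    # Large, ordinary weight matrices are the ones worth quantizing.
    return "int4"


if __name__ == "__main__":
    for tensor_name, size in [
        ("layers.0.mlp.up_proj.weight", 50_000_000),
        ("layers.0.input_layernorm.weight", 4096),
        ("lm_head.weight", 500_000_000),
    ]:
        print(tensor_name, "->", choose_dtype(tensor_name, size))
```

A real quantizer would usually decide based on measured sensitivity rather than tensor names, but the result is the same kind of mixed-precision checkpoint described above.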
zackangelo
7 days ago
I don't believe safetensors has a native int4 dtype, so they packed 4 int4s into a bf16 in this checkpoint.
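As a rough illustration of that kind of packing, the sketch below stores four signed 4-bit values in one 16-bit word and recovers them again. The nibble order (lowest nibble first) and the two's-complement encoding are assumptions made for the example; the thread does not say how this checkpoint actually lays out its bits. The function names are hypothetical.

```python
def pack_int4x4(values):
    """Pack four signed 4-bit integers (-8..7) into one 16-bit word."""
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    word = 0
    for i, v in enumerate(values):
        if not -8 <= v <= 7:
            raise ValueError(f"{v} does not fit in a signed int4")
        # Keep the low 4 bits (two's complement) and shift into nibble i.
        word |= (v & 0xF) << (4 * i)
    return word


def unpack_int4x4(word):
    """Recover the four signed 4-bit integers from a 16-bit word."""
    out = []
    for i in range(4):
        nibble = (word >> (4 * i)) & 0xF
        # Nibbles 8..15 represent the negative values -8..-1.
        out.append(nibble - 16 if nibble >= 8 else nibble)
    return out


if __name__ == "__main__":
    packed = pack_int4x4([1, -1, 7, -8])
    print(hex(packed))
    print(unpack_int4x4(packed))
```

The 16-bit container is only storage here: reading the packed word as an actual bf16 float would give a meaningless number, so a loader has to unpack it before use.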