Hacker News
by boroboro4, 534 days ago
Because INT4-quantized weights are still dequantized and computed in FP16 in most cases. Sometimes it's possible to use FP8/INT8 compute, and there is research on INT4 compute, but it's rather rare.
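To make the point concrete, here is a rough NumPy sketch of weight-only INT4 quantization where the matmul itself still runs in FP16. Everything in it is an assumption for illustration: the function names (`quantize_int4`, `dequantize_fp16`, `linear_w4a16`), the symmetric per-group scheme, and the group size of 32 are hypothetical choices, not any particular library's kernel. Real kernels also pack two 4-bit values per byte and fuse the dequantization into the matmul; this sketch stores one value per int8 for readability.

```python
import numpy as np

def quantize_int4(w, group_size=32):
    """Symmetric per-group quantization to the INT4 range [-8, 7].

    Returns integer codes (held in int8 for simplicity, not packed)
    and one FP16 scale per group.
    """
    out_features, in_features = w.shape
    groups = w.reshape(out_features, in_features // group_size, group_size)
    max_abs = np.abs(groups).max(axis=-1, keepdims=True)
    # Map the largest magnitude in each group to 7; floor avoids a zero scale.
    scale = np.maximum(max_abs / 7.0, 1e-4).astype(np.float16)
    q = np.round(groups / scale.astype(np.float32))
    q = np.clip(q, -8, 7).astype(np.int8)
    return q, scale

def dequantize_fp16(q, scale):
    """Expand the INT4 codes back to FP16 weights."""
    out_features = q.shape[0]
    return (q.astype(np.float16) * scale).reshape(out_features, -1)

def linear_w4a16(x, q, scale):
    """Weights are stored as INT4 codes, but the arithmetic is FP16."""
    w_fp16 = dequantize_fp16(q, scale)
    return x.astype(np.float16) @ w_fp16.T

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 64)).astype(np.float32)   # toy weight matrix
x = rng.standard_normal((2, 64)).astype(np.float32)   # toy activations

q, scale = quantize_int4(w)
y = linear_w4a16(x, q, scale)
print(q.dtype, y.dtype, y.shape)
```

The memory savings come from storing `q` in 4 bits; the multiply-accumulate in `linear_w4a16` never sees an integer operand, which is why INT4 weights alone don't give INT4 compute throughput.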