HN Mirror

I don't think any models are natively INT4? I don't see the point of nerfing the model out of the box.

It's not nerfed, it's trained natively at that precision, a.k.a. quantization-aware training (QAT).

QAT typically uses BF16/FP32 during the training process to simulate lower precision.
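To make "simulate lower precision" concrete, here is a rough sketch of the fake-quantization step as I understand it: the weights stay in floating point, but the forward pass snaps them to the 16 levels a signed INT4 value can hold and then scales them back. The function name `fake_quant_int4` and the per-tensor symmetric scale are my own assumptions for illustration, not any particular framework's API.

```python
def fake_quant_int4(weights, scale=None):
    """Snap float weights to a signed INT4 grid, then return them as floats.

    Hypothetical helper for illustration only. Assumes one symmetric scale
    for the whole tensor; real schemes often use per-channel or per-block scales.
    """
    if scale is None:
        # Map the largest magnitude onto level 7 (INT4 range is -8..7).
        max_abs = max(abs(w) for w in weights) or 1.0
        scale = max_abs / 7.0
    simulated = []
    for w in weights:
        q = round(w / scale)      # nearest integer level
        q = max(-8, min(7, q))    # clamp to the signed 4-bit range
        simulated.append(q * scale)  # back to float: the value the forward pass sees
    return simulated


weights = [0.70, -0.35, 0.12, 0.0]
print(fake_quant_int4(weights))
```

The master weights and the optimizer state stay in BF16/FP32 the whole time; only the forward pass sees the rounded values. Since rounding has zero gradient almost everywhere, the backward pass usually treats it as the identity (the straight-through estimator).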

The only model I've seen like that is gpt-oss, which ships natively quantized to MXFP4.