by simjnd 46 days ago
I don't think any models are natively INT4? I don't see the point of nerfing a model out of the box.
2 comments

It's not nerfed; it's trained natively at that precision, a.k.a. quantization-aware training (QAT).
QAT typically uses BF16/FP32 during the training process to simulate lower precision.
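To make "simulate lower precision" concrete, here is a rough sketch of the forward-pass trick usually called fake quantization: the weights stay in floating point but are snapped onto a 16-level INT4 grid before use. The function name, the symmetric per-tensor scale, and the sample weights are my own assumptions for illustration, not taken from any particular model or framework. Real QAT setups also let gradients pass through the rounding step (typically with a straight-through estimator), which this sketch leaves out.

```python
def fake_quantize_int4(weights):
    """Snap float weights onto a symmetric INT4 grid, then map them back to floats.

    Hypothetical helper for illustration; uses one scale for the whole list.
    """
    qmin, qmax = -8, 7  # signed 4-bit integer range
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)  # nothing to scale
    scale = max_abs / qmax  # float step between adjacent grid points
    out = []
    for w in weights:
        q = round(w / scale)          # nearest integer level
        q = max(qmin, min(qmax, q))   # clamp into the INT4 range
        out.append(q * scale)         # back to float: the "simulated" low-precision value
    return out


weights = [0.7, -0.35, 0.1, 0.02]
print(fake_quantize_int4(weights))
```

The values still live in ordinary floats, but the training loss now sees the rounding error the model will have at inference, so it can learn weights that tolerate it.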
The only model I have seen like that is GPT OSS, natively quantized to MXFP4.