Is this the full weight model or quantized version? The GGUFs distributed on Hugging Face labeled as MXFP4 quantization have layers that are quantized to int8 (q8_0) instead of bf16 as suggested by OpenAI.
Example looking at blk.0.attn_k.weight, it's q8_0 amongst other layers: