Hacker News
by irthomasthomas 321 days ago
If these are FP4 like the other ollama models then I'm not very interested. If I'm using an API anyway I'd rather use the full weights.

OpenAI has only provided MXFP4 weights. These are the same weights used by other cloud providers.
Oh, I didn't know that. Weird!
It was natively trained in FP4. Probably both to reduce VRAM usage at inference time (fits on a single H100), and to allow better utilization of B200s (which are especially fast for FP4).
Interesting, thanks. I didn't know you could even train at FP4 on H100s.
It's impressive they got it to work. The lowest I'd heard of thus far was native FP8 training.
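For anyone wanting to sanity-check the "fits on a single H100" point, here is a rough back-of-the-envelope sketch. It assumes the MXFP4 layout from the OCP Microscaling spec (4-bit elements plus one shared 8-bit scale per block of 32, so about 4.25 bits per weight) and an 80 GB H100. The 120e9 parameter count is a made-up round number for illustration, not a figure from this thread, and the estimate covers weights only: it ignores activations, KV cache, and any layers that might be kept in higher precision.

```python
# Rough weight-memory estimate at different precisions (weights only).

MXFP4_ELEMENT_BITS = 4   # each element is a 4-bit float
MXFP4_SCALE_BITS = 8     # one shared 8-bit scale per block
MXFP4_BLOCK_SIZE = 32    # elements per block


def mxfp4_bits_per_weight() -> float:
    """Effective bits per weight once the shared scale is amortized."""
    return MXFP4_ELEMENT_BITS + MXFP4_SCALE_BITS / MXFP4_BLOCK_SIZE


def weight_gb(num_params: float, bits_per_weight: float) -> float:
    """Weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


ASSUMED_PARAMS = 120e9   # hypothetical parameter count, for illustration only
H100_GB = 80             # assumed 80 GB H100 variant

for name, bits in [
    ("BF16", 16.0),
    ("FP8", 8.0),        # ignores any per-tensor scale overhead
    ("MXFP4", mxfp4_bits_per_weight()),
]:
    gb = weight_gb(ASSUMED_PARAMS, bits)
    verdict = "fits" if gb <= H100_GB else "does not fit"
    print(f"{name}: {gb:.2f} GB of weights, {verdict} in {H100_GB} GB")
```

Under those assumptions only the MXFP4 row lands under 80 GB, which is consistent with the single-GPU argument above, though real headroom depends on context length and batch size.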