| HN Mirror



	by irthomasthomas 321 days ago
	If these are FP4 like the other ollama models then I'm not very interested. If I'm using an API anyway I'd rather use the full weights.


mchiang 321 days ago

OpenAI has only provided MXFP4 weights. These are the same weights used by other cloud providers.


irthomasthomas 321 days ago

Oh, I didn't know that. Weird!


reissbaker 321 days ago

It was natively trained in FP4. Probably both to reduce VRAM usage at inference time (fits on a single H100), and to allow better utilization of B200s (which are especially fast for FP4).
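
For a rough sense of the VRAM point, here is a back-of-the-envelope sketch. The parameter count (120e9) and the 80 GB H100 capacity are assumptions for illustration, not figures from this thread. It also assumes every weight is stored in MXFP4, with 4-bit elements plus one 8-bit shared scale per 32-element block, and it ignores activations and KV cache.

```python
# Rough weight-memory estimate for a hypothetical model at several precisions.
# Assumptions (not from the thread): 120e9 parameters, 80 GB on a single H100,
# all weights stored at the listed precision, no activation or KV-cache memory.

def weight_gb(num_params: float, bits_per_param: float) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 120e9  # hypothetical parameter count
H100_GB = 80        # assumed memory of a single H100

BITS = {
    "BF16": 16.0,
    "FP8": 8.0,
    # MXFP4: 4-bit elements plus an 8-bit scale shared by each block of 32
    "MXFP4": 4.0 + 8.0 / 32,
}

for name, bits in BITS.items():
    gb = weight_gb(NUM_PARAMS, bits)
    verdict = "fits" if gb <= H100_GB else "does not fit"
    print(f"{name:>5}: {gb:6.1f} GB ({verdict} in {H100_GB} GB)")
```

Under these assumptions only the 4-bit format leaves the weights under a single card's capacity; real deployments would also need headroom for activations and the KV cache.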


irthomasthomas 321 days ago

Interesting, thanks. I didn't know you could even train at FP4 on H100s.


reissbaker 319 days ago

It's impressive they got it to work; the lowest I'd heard of thus far was native FP8 training.
