| HN Mirror



	by wirybeige 22 days ago
	DS4 Pro/Flash were post-trained with QAT, so they are already quantized to FP4 for the most part. That's why the downloaded weights are much smaller than they would be at fp8 or fp16. For example, Flash is a 284B model, but its download is only ~160GB. Of course, DeepInfra may have gone even further, but there is no proof of that.
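
A quick sanity check on that arithmetic, as a rough sketch: it uses only the two figures quoted above (284B parameters, ~160GB) and assumes decimal gigabytes. The helper names are made up for illustration, and the estimate ignores quantization scale factors and any other non-weight data in the download.

```python
# Back-of-the-envelope check of the sizes quoted in the comment above.
# Assumptions: decimal GB (1e9 bytes); the download contains only weights.
PARAMS = 284e9          # parameter count quoted for Flash
DOWNLOAD_BYTES = 160e9  # approximate download size quoted for Flash


def size_gb(params: float, bits_per_weight: float) -> float:
    """Size in GB if every weight were stored at the given bit width."""
    return params * bits_per_weight / 8 / 1e9


def effective_bits(params: float, total_bytes: float) -> float:
    """Average number of bits per weight implied by a file size."""
    return total_bytes * 8 / params


if __name__ == "__main__":
    for name, bits in [("fp16", 16), ("fp8", 8), ("fp4", 4)]:
        print(f"{name}: {size_gb(PARAMS, bits):.0f} GB")
    avg = effective_bits(PARAMS, DOWNLOAD_BYTES)
    print(f"implied average: {avg:.2f} bits per weight")
```

An implied average a little above 4 bits per weight is what you'd expect from a checkpoint that is mostly FP4 with some tensors kept at higher precision.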

1 comments

pants2 21 days ago

Interesting then that OpenRouter[1] tags many providers as FP8 and DeepInfra as FP4.

1. https://openrouter.ai/deepseek/deepseek-v4-pro


wirybeige 21 days ago

I presume the providers are the ones giving the info to OpenRouter? I mean, technically it is a mix of fp8 and fp4 (although it is predominantly fp4), so I don't think either label is inaccurate.
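
To put a rough number on "predominantly": a sketch that assumes every weight is stored as either exactly 4 or exactly 8 bits and reuses the 284B / ~160GB figures from upthread. `fp4_fraction` is a hypothetical helper, not anything from a real tool, and per-block scale factors or tensors kept at higher precision are ignored, so treat the result as a ballpark only.

```python
# Estimate the share of weights stored in FP4, assuming a pure FP4/FP8 mix.
# Assumptions: 284e9 parameters, ~160e9 bytes on disk, no scale-factor overhead.


def fp4_fraction(avg_bits: float, low_bits: float = 4.0, high_bits: float = 8.0) -> float:
    """Solve avg = f*low + (1 - f)*high for f, the fraction stored at low_bits."""
    if not low_bits <= avg_bits <= high_bits:
        raise ValueError("average bit width must lie between the two formats")
    return (high_bits - avg_bits) / (high_bits - low_bits)


if __name__ == "__main__":
    avg_bits = 160e9 * 8 / 284e9  # implied average bits per weight
    print(f"average: {avg_bits:.2f} bits/weight")
    print(f"estimated FP4 share: {fp4_fraction(avg_bits):.0%}")
```

Under those assumptions the FP4 share comes out at roughly 87%, which fits with a provider reasonably labelling the same checkpoint as either FP4 or FP8.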
