HN Mirror

	by functional_dev 81 days ago
	I did not know that NVFP4 was handled at the silicon level... until I dug deeper here: https://vectree.io/c/llm-quantization-from-weights-to-bits-g...

1 comments

duffyjp 80 days ago

I still don't think I understand it. I saw those nvfp4 models up by chance yesterday and tried them on my Linux PC with a 5060 Ti 16GB. Ollama refused to pull them, saying they were macOS only.

I assumed it was a metadata bug and filed an issue, but apparently nvfp4 doesn't necessarily mean nvidia-fp4.

https://github.com/ollama/ollama/issues/15149

Patrick_Devine 80 days ago

They are nvidia-fp4 weights, but CUDA support isn't _quite_ ready yet. We've got that cooking.
