| HN Mirror



	by leereeves 1458 days ago
	A comment below said this model uses fp16 (half-precision). If so, it won't easily run on CPU because PyTorch doesn't have good support for fp16 on CPU.


netr0ute 1457 days ago

Parent never claimed it was going to be fast.


leereeves 1457 days ago

It would probably just fail with an error like "[some function] not implemented for 'Half'".


chessgecko 1457 days ago

fp16 models run inference just fine in fp32, though I was sorta joking in my original comment: it could take weeks for this to run one input. You're better off trying to make something like Hugging Face Accelerate work (like the comment above), which swaps layers of the model on and off the disk.
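
To illustrate why the fp32 route works: every fp16 value is exactly representable in fp32, so upcasting the weights loses nothing; it just doubles the memory. Here's a rough stdlib-only sketch of that. The `weights` list and helper names are made up for illustration and have nothing to do with the actual model. In PyTorch the corresponding step would be calling `.float()` on the model before running it on CPU.

```python
import struct


def roundtrip_half(x: float) -> float:
    """Round x to the nearest fp16 value and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def upcast_to_single(h: float) -> float:
    """Store a value in fp32 format and read it back as a Python float."""
    return struct.unpack("<f", struct.pack("<f", h))[0]


# Hypothetical "weights", chosen only for illustration.
weights = [0.1, -1.5, 3.14159, 65504.0, 6e-8]

# Simulate a checkpoint that was saved in half precision.
half_weights = [roundtrip_half(w) for w in weights]

# Upcast each stored fp16 value to fp32, as you would before CPU inference.
single_weights = [upcast_to_single(h) for h in half_weights]

# The upcast is exact: no value changes when moving from fp16 to fp32.
lossless = all(h == s for h, s in zip(half_weights, single_weights))

# Bytes per element in each format: fp32 needs twice the storage of fp16.
half_bytes = struct.calcsize("<e")
single_bytes = struct.calcsize("<f")

print(lossless, half_bytes, single_bytes)
```

So correctness isn't the issue with running in fp32. The cost is the doubled memory footprint and the CPU speed, which is why the disk-offload approach is the more practical one.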
