| HN Mirror

	by simonw 785 days ago
	Are you talking about the Hugging Face Python libraries, the Hugging Face hosted inference APIs, the Hugging Face web interfaces, the Hugging Face iPhone app, Hugging Face Spaces (hosted Docker environments with GPU access) or something else?

p1esk 785 days ago

I updated my comment above: I’m using the HF transformers repo, which pulls models from the HF Hub.

simonw 785 days ago

Do you have an NVIDIA GPU? I have not had much luck with the transformers library on a Mac.

p1esk 785 days ago

Of course. I thought NVIDIA GPUs were pretty much a must-have for playing with DL models.

objektif 785 days ago

Well, being able to run these models on a CPU was pretty much the revolutionary part of llama.cpp.

p1esk 785 days ago

I can run them on CPU: HF uses plain PyTorch code, which is fully supported on CPU.
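
For illustration, here is a minimal sketch of what CPU-only inference with the transformers library might look like. The helper name `generate_on_cpu` is hypothetical, and the `gpt2` checkpoint is only an assumed example of a small model from the Hub; neither comes from this thread.

```python
def generate_on_cpu(prompt: str, model_name: str = "gpt2", max_new_tokens: int = 20) -> str:
    """Generate a short continuation of `prompt` using only the CPU.

    `model_name` is an assumed example checkpoint; any causal LM on the Hub
    should work the same way, given enough RAM.
    """
    # Imported lazily so the function can be defined without the heavy dependencies loaded.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Models load on the CPU by default; .to("cpu") just makes the choice explicit.
    model = AutoModelForCausalLM.from_pretrained(model_name).to("cpu")
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():  # inference only, so skip gradient tracking
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the checkpoint from the Hub on first use.
    print(generate_on_cpu("Running language models on a CPU is"))
```

No GPU-specific call appears anywhere, which is the point being made: the same PyTorch code path runs on CPU, just more slowly.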

tmostak 784 days ago

But it's likely to be much slower than a backend like llama.cpp on CPU (particularly on a Mac, though I think on Linux as well), and it doesn't support features like CPU offloading.

wkat4242 785 days ago

Ollama supports many Radeons now. I'd guess llama.cpp does too, since it's what Ollama uses as its backend.

p1esk 785 days ago

PyTorch (the underlying framework of HF) supports AMD as well, though I haven’t tried it.
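
A rough way to check which backend a given PyTorch install would use is sketched below. The function names `describe_backend` and `detect_backend` are made up for this example. It relies on the fact that ROCm builds of PyTorch expose AMD GPUs under the `cuda` device name and set `torch.version.hip`; I haven't verified it on AMD hardware either.

```python
from typing import Optional


def describe_backend(cuda_available: bool, hip_version: Optional[str],
                     mps_available: bool = False) -> str:
    """Map PyTorch capability flags to a device name plus a vendor label."""
    if cuda_available:
        # ROCm builds reuse the "cuda" device name; torch.version.hip is
        # set only on those builds and is None on NVIDIA CUDA builds.
        vendor = "AMD ROCm" if hip_version else "NVIDIA CUDA"
        return f"cuda ({vendor})"
    if mps_available:
        return "mps (Apple Metal)"
    return "cpu"


def detect_backend() -> str:
    """Query the installed PyTorch build; report 'cpu' if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    hip = getattr(torch.version, "hip", None)
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return describe_backend(torch.cuda.is_available(), hip, mps_ok)


if __name__ == "__main__":
    print(detect_backend())
```

Because the AMD path still uses the `cuda` device string, code written as `model.to("cuda")` should in principle run unchanged on a ROCm build.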
