Hacker News
by
p1esk
785 days ago
I updated my comment above: I’m using the HF transformers repo, which pulls models from the HF Hub.
simonw
785 days ago
Do you have an NVIDIA GPU? I have not had much luck with the transformers library on a Mac.
p1esk
785 days ago
Of course. I thought Nvidia GPUs were pretty much a must-have for playing with DL models.
objektif
785 days ago
Well, being able to run these models on CPU was pretty much the revolutionary part of llama.cpp.
p1esk
785 days ago
I can run them on CPU - HF uses plain PyTorch code, which is fully supported on CPU.
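For reference, a minimal sketch of how device selection usually looks in plain PyTorch code. The helper name `pick_device` is made up for illustration, and the fallback order (CUDA, then Apple MPS, then CPU) is an assumption, not something HF prescribes.

```python
import importlib.util


def pick_device() -> str:
    """Return the name of a PyTorch device to run on, falling back to CPU.

    `pick_device` is a hypothetical helper, not part of PyTorch or transformers.
    """
    if importlib.util.find_spec("torch") is None:
        # No PyTorch installed, so there is no accelerator to pick.
        return "cpu"

    import torch

    # ROCm builds of PyTorch also report AMD GPUs through the torch.cuda API.
    if torch.cuda.is_available():
        return "cuda"

    # Apple Silicon backend; guarded because older PyTorch builds lack it.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

The returned string can be handed to `model.to(...)`; on a machine with no supported GPU it is simply `"cpu"`.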
tmostak
785 days ago
But it's likely to be much slower than what you'd get on CPU with a backend like llama.cpp (particularly on a Mac, but I think on Linux as well), and it doesn't support features like CPU offloading.
p1esk
785 days ago
Are there benchmarks? A 2x speedup would not be enough for me to return to C++ hell, but 5x might be, in some circumstances.
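Lacking published numbers, the 2x versus 5x question can be checked locally. A rough sketch of a timing harness, assuming each backend is wrapped in a callable that generates up to `n_tokens` and returns how many it actually produced. The function names and that wrapper convention are hypothetical, not part of llama.cpp or transformers.

```python
import time
from typing import Callable


def tokens_per_second(generate: Callable[[int], int], n_tokens: int, repeats: int = 3) -> float:
    """Best-of-`repeats` throughput for a generation callable.

    `generate(n_tokens)` is a hypothetical wrapper around a backend; it must
    return the number of tokens it actually produced.
    """
    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        produced = generate(n_tokens)
        # Guard against a zero interval on very coarse timers.
        elapsed = max(time.perf_counter() - start, 1e-9)
        best = max(best, produced / elapsed)
    return best


def speedup(baseline_tps: float, candidate_tps: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return candidate_tps / baseline_tps
```

Run both backends with the same model, prompt, and token count, then compare `speedup(hf_tps, llamacpp_tps)` against whatever threshold makes the switch worthwhile.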
wkat4242
785 days ago
Ollama supports many Radeons now. And I guess llama.cpp does too, since it's what Ollama uses as its backend.
p1esk
785 days ago
PyTorch (the underlying framework of HF) supports AMD as well, though I haven’t tried it.