| HN Mirror



	by belval 1040 days ago
	Nice project! I could not find this in the README.md: can I run this with a GPU? If so, what do I need to change? It seems to be hardcoded to 0 in the run script: https://github.com/getumbrel/llama-gpt/blob/master/api/run.s...


dicriseg 1039 days ago

I put up a draft PR to demo how to run it on a GPU: https://github.com/getumbrel/llama-gpt/pull/11

It breaks other things like model downloading, but once I got it to a working state for myself, I figured why not put it up there in case it's useful. If I have time, I'll try to rework it a little with more parameters and less Dockerfile repetition so it fits the main project better.


mayankchhabra 1040 days ago

Ah yes, running on GPU isn't supported at the moment. But CUDA (for Nvidia GPUs) and Metal support is on the roadmap!


samspenc 1040 days ago

Ah fascinating, just curious, what's the technical blocker? I thought most of the Llama models were optimized to run on GPUs?


mayankchhabra 1040 days ago

It's fairly straightforward to add GPU support when running on the host, but LlamaGPT runs inside a Docker container, and that's where it gets a bit challenging.


stavros 1040 days ago

It shouldn't be. NVIDIA provides a CUDA Docker plugin that lets you expose your GPU to the container, and it works quite well.
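
For anyone who wants to try this, here is a minimal sketch of what exposing the GPU looks like. It assumes Docker 19.03 or later with the NVIDIA Container Toolkit installed on the host, in which case `docker run` accepts a `--gpus` flag. The helper name `docker_gpu_command` and the image name `my-cuda-image:latest` are placeholders for illustration, not anything from the LlamaGPT repo.

```python
import shlex


def docker_gpu_command(image, command, gpus="all"):
    """Build a `docker run` argv that exposes host GPUs to the container.

    Assumes the NVIDIA Container Toolkit is set up on the host so that
    Docker understands the --gpus flag.
    """
    return ["docker", "run", "--rm", "--gpus", gpus, image, *command]


# Placeholder image name: substitute a CUDA-enabled image you actually have.
argv = docker_gpu_command("my-cuda-image:latest", ["nvidia-smi"])

# Print the command for copy-pasting; it is not executed here.
print(shlex.join(argv))
```

If `nvidia-smi` lists your card when you run the printed command, the container can see the GPU and the rest is a matter of building the inference code with CUDA support inside the image.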


dicriseg 1039 days ago

See above if you're interested in that. It does work quite well, even with nested virtualization (WSL2).


stavros 1039 days ago

I am, thanks!


crudgen 1040 days ago

Had the same thought, since it is kinda slow (I only have 4 physical/8 logical cores, though). But I think VRAM might be a problem (8 GB can work if one has a fairly recent GPU; M1/M2 might be interesting here).
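
A rough back-of-the-envelope check on the VRAM question, as a sketch only: it counts just the model weights and assumes 4-bit quantization, ignoring the KV cache, activations, and runtime overhead, so real usage will be higher. The parameter counts are illustrative round numbers, and `weight_memory_gb` is a made-up helper name.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Illustrative model sizes; 4 bits per weight is an assumption.
for name, n_params in [("7B", 7e9), ("13B", 13e9)]:
    print(f"{name}: ~{weight_memory_gb(n_params, 4):.1f} GB for weights")
```

By this estimate the weights of a 7B model at 4 bits take about 3.5 GB and a 13B model about 6.5 GB, which is consistent with 8 GB being workable but tight.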
