by MacsHeadroom
1199 days ago
Yes. Starting with the Facebook version of LLaMA-7B, you just quantize the model to 4-bit on your desktop (since the unquantized model takes 14GB of RAM), then move it to your phone and follow the Android instructions in the repo: https://github.com/ggerganov/llama.cpp/#android I've seen dozens of screenshots of it running in Termux on Android phones by now, at completely usable speeds.
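For anyone wondering where the 14GB comes from, here is a rough back-of-the-envelope sketch. It assumes "7B" means exactly 7 billion parameters and that the original weights are 16-bit; the function name is just for illustration. Real 4-bit files come out somewhat larger than this estimate because the format also stores scaling factors alongside the weights.

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring any format overhead."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 7e9  # assumption: "7B" taken as exactly 7 billion parameters

fp16_gb = model_size_gb(N_PARAMS, 16)  # assumption: original weights are 16-bit
q4_gb = model_size_gb(N_PARAMS, 4)     # idealized 4 bits per weight, no scale overhead

print(f"16-bit: {fp16_gb:.1f} GB")
print(f"4-bit:  {q4_gb:.1f} GB")
```

So the quantized model is roughly a quarter of the size, which is why it fits on a phone even though the conversion step needs a bigger machine.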
As my current potato computer has 8GB of RAM, I'll ask a friend to do it :-)