by icosahedron 1201 days ago

I followed the initial instructions and the 7B model worked just fine.

I tried the supplementary instructions to download some of the models (7B, 13B, and 30B), and it didn't seem to work. The prompt returned nothing after waiting for several minutes.

Is there a way to run just one of the larger models?

2 comments

cocktailpeanut 1201 days ago

I am going to test this out today and roll it out as soon as I can, hopefully tomorrow. Stay tuned.

Datagenerator 1201 days ago

What's the minimum spec GPU required? NVIDIA only? Any differences between Debian and Fedora Linuxes? RAM required?

MacsHeadroom 1201 days ago

This app is CPU only and gets good speeds on even mobile phone CPUs. Minimum RAM required is 5GB.

sucram1 1201 days ago

Oh wow, any way to do this on Android yet? That would be fun to tinker with, even if it's just the smaller model. Even my older Note 9 has 6GB.

MacsHeadroom 1199 days ago

Yes. Starting with the Facebook versions of LLaMA-7B, you quantize the model to 4-bit on your desktop (since that step takes 14GB of RAM), then move it to your phone and follow the Android instructions in the repo: https://github.com/ggerganov/llama.cpp/#android

I've seen dozens of screenshots of it running in Termux on Android phones by now, at completely usable speeds.
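
For anyone wondering where the 14GB figure comes from, here is a rough back-of-the-envelope sketch. It assumes a nominal 7 billion parameters, 16-bit weights for the original model, and exactly 4 bits per weight after quantization. Real quantized files are somewhat larger because they also store per-block scale factors, and the `headroom_gb` value is a made-up allowance for context and runtime buffers, not a measured number.

```python
def weights_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_ram(weights_gb: float, ram_gb: float, headroom_gb: float = 1.5) -> bool:
    """Crude check: weights plus a hypothetical runtime allowance must fit in RAM."""
    return weights_gb + headroom_gb <= ram_gb


# Nominal parameter count; the real model is slightly smaller than 7e9.
N_PARAMS_7B = 7e9

fp16_gb = weights_size_gb(N_PARAMS_7B, 16)  # unquantized weights
q4_gb = weights_size_gb(N_PARAMS_7B, 4)     # idealized 4-bit weights

print(f"fp16 weights:  {fp16_gb:.1f} GB")
print(f"4-bit weights: {q4_gb:.1f} GB")
print("fp16 fits in 8 GB desktop:", fits_in_ram(fp16_gb, 8))
print("4-bit fits in 6 GB phone: ", fits_in_ram(q4_gb, 6))
```

Under these assumptions the unquantized weights come to about 14GB and the 4-bit version to about 3.5GB, which is why the quantization step needs a bigger machine than the one that ends up running the model.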

sucram1 1199 days ago

Thank you for the link! Insane that this can run on a phone.

As my current potato computer has 8GB of RAM, I'll ask a friend to do it :-)

mrfreed 1201 days ago

What distro and PC specs do you have success with?

garyfirestorm 1201 days ago

I ran this on my Intel i7-7700K with 32GB of RAM. It ran very slowly, almost one word per second. Not sure if I did something wrong. Distro: Ubuntu 22.04.
