by dchuk 1174 days ago

So if I have a MacBook Pro with 32GB of RAM, and the instructions say this:

"Vicuna-13B This conversion command needs around 60 GB of CPU RAM."

Does this mean I simply cannot run that model at all? Or will it spill into disk swap while building the model weights and just take forever?

5 comments

acchow 1173 days ago

Can someone explain why computing a delta needs to hold the entire model in memory at once? Can't it just do one layer at a time?

FLT8 1172 days ago

Vicuna-13B loads and idles at ~26GB RAM usage on an M1 Max with 64GB. When answering questions, that grows to around 75GB, and yes, you can feel it (and the machine) slow down significantly when it starts hitting swap. I think realistically you'd want to stick to the 7B model on a 32GB machine (even if you could get the weight deltas to apply correctly).
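For a rough sanity check on those numbers, here is a back-of-the-envelope sketch. It assumes 16-bit weights (2 bytes per parameter) and treats "13B" and "7B" as exact parameter counts, which they are not; `weights_gb` is a made-up helper, not part of any tool mentioned in this thread.

```python
def weights_gb(n_params, bytes_per_param=2):
    """Approximate size of the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

# Assumption: fp16 weights, nominal parameter counts.
single_13b = weights_gb(13e9)   # one copy of a 13B model
both_13b = 2 * single_13b       # base model and delta held at the same time
single_7b = weights_gb(7e9)     # one copy of a 7B model

print(f"13B, one copy:   {single_13b:.0f} GB")
print(f"13B, two copies: {both_13b:.0f} GB")
print(f"7B, one copy:    {single_7b:.0f} GB")
```

One 13B copy at 16 bits comes out close to the ~26GB idle figure, and holding two copies at once already exceeds 32GB before any working memory is counted. That would be consistent with the ~60 GB conversion requirement, though the exact breakdown is a guess.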

UncleOxidant 1173 days ago

I just reached that step on my Linux laptop which has 32GB of RAM. I'm about to give it a try anyway, but I'm not hopeful based on that comment.

I'm wondering if anyone is torrenting these Vicuna-13B weights?

GaggiX 1173 days ago

Someone really needs to write a script that does not load both entire models into memory to do this.
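A minimal sketch of what such a script could look like, to show the idea rather than the actual conversion tool. It assumes each tensor can be read on its own and that base and delta tensors have matching names and sizes. The one-file-per-tensor layout and the helper names (`save_tensor`, `load_tensor`, `apply_delta_streaming`) are hypothetical; real checkpoints use other formats and may not allow per-tensor reads without extra work.

```python
import os
import tempfile
from array import array


def save_tensor(directory, name, values):
    """Write one tensor as raw float32 values to its own file (toy layout)."""
    with open(os.path.join(directory, name + ".bin"), "wb") as f:
        array("f", values).tofile(f)


def load_tensor(directory, name):
    """Read a single tensor back; nothing else is loaded."""
    values = array("f")
    with open(os.path.join(directory, name + ".bin"), "rb") as f:
        values.frombytes(f.read())
    return values


def apply_delta_streaming(base_dir, delta_dir, out_dir, names):
    """Compute base + delta one tensor at a time, writing each result at once.

    Peak memory is roughly three tensors (base, delta, merged),
    not two whole models.
    """
    for name in names:
        base = load_tensor(base_dir, name)
        delta = load_tensor(delta_dir, name)
        if len(base) != len(delta):
            raise ValueError(f"size mismatch for tensor {name!r}")
        merged = array("f", (b + d for b, d in zip(base, delta)))
        save_tensor(out_dir, name, merged)
        # base, delta and merged are rebound on the next iteration,
        # so the previous tensors can be garbage-collected.


def demo():
    """Tiny end-to-end run on two fake 'layers' in a temporary directory."""
    with tempfile.TemporaryDirectory() as root:
        dirs = {k: os.path.join(root, k) for k in ("base", "delta", "out")}
        for d in dirs.values():
            os.mkdir(d)
        names = ["layer0.weight", "layer1.weight"]
        save_tensor(dirs["base"], names[0], [1.0, 2.0])
        save_tensor(dirs["delta"], names[0], [0.5, -0.25])
        save_tensor(dirs["base"], names[1], [4.0])
        save_tensor(dirs["delta"], names[1], [-1.0])
        apply_delta_streaming(dirs["base"], dirs["delta"], dirs["out"], names)
        return {n: list(load_tensor(dirs["out"], n)) for n in names}


result = demo()
print(result)
```

Since the delta is applied as an elementwise sum, no tensor depends on any other, so in principle nothing forces both models to be resident at once. Whether that works in practice depends on how the checkpoint files are sharded and on whether the two models really share identical tensor shapes.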

MMMercy2 1173 days ago

You can try the smaller 7B version.
