by coder543
757 days ago
I had already read the comment I was responding to, and they actually mentioned both. Here's the exact quote for the 7B: "Even running a 7B will take 14GB if it's fp16." Since they called out a specific amount of memory that is entirely irrelevant to anyone actually running 7B models, I was responding to that. I'm certain that no one at Microsoft is talking about running 70B models on consumer devices. 7B models are actually a practical consideration for the hardware that exists today.
|
Which is correct: fp16 takes two bytes per weight, so it will be 7 billion * 2 bytes, which is exactly 14GB for the weights alone.
They are probably aware that you could run it with 4-bit quantization (which would use 1/4 of the RAM), but they explicitly mentioned fp16.
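
For anyone who wants to check the arithmetic, here is a rough sketch. It assumes a nominal 7 billion parameters and counts only the weights, in decimal GB; real 7B checkpoints have slightly different parameter counts, and activations, KV cache, and quantization metadata add overhead on top. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: int, bits_per_weight: int) -> float:
    """Approximate memory for the weights only, in decimal GB (1 GB = 1e9 bytes).

    Ignores activations, KV cache, and any per-block quantization metadata.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1e9


N_PARAMS = 7_000_000_000  # nominal "7B"; an assumption, not an exact model size

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # 2 bytes per weight
q4_gb = weight_memory_gb(N_PARAMS, 4)     # half a byte per weight

print(f"fp16:  {fp16_gb:.1f} GB")
print(f"4-bit: {q4_gb:.1f} GB ({q4_gb / fp16_gb:.2f} of fp16)")
```

So both numbers in the thread are consistent: 14 GB at fp16, and about a quarter of that at 4 bits.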