by washadjeffmad
1113 days ago
P40s are kind of a meme. Running GGML models on the CPU of a dual-channel DDR5 system gives roughly the same performance at a fraction of the wattage. I still use GPTQ for 30B, but even CPU-only generation is quick enough at q5_1 on modern hardware.
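
For a rough sense of the numbers, here is a back-of-envelope sketch. It assumes token generation is memory-bandwidth bound (every weight is read once per token), that the RAM is DDR5-4800 (an assumed speed, not one stated above), and it uses nominal parameter counts. The function names are made up for illustration, and the result is a theoretical ceiling, not a benchmark.

```python
# Back-of-envelope only: every input below is an assumption, not a measurement.

def q5_1_bits_per_weight() -> float:
    # ggml q5_1 block of 32 weights: 16 bytes of low 4-bit values,
    # 4 bytes of high bits, plus an fp16 scale and an fp16 min (2 bytes each).
    block_bytes = 16 + 4 + 2 + 2
    return block_bytes * 8 / 32

def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    # Weight storage only; ignores KV cache and runtime overhead.
    return n_params * bits_per_weight / 8 / 1e9

def dual_channel_bandwidth_gbs(mega_transfers_per_s: float) -> float:
    # Two 64-bit channels, 8 bytes per transfer each (theoretical peak).
    return 2 * 8 * mega_transfers_per_s * 1e6 / 1e9

def tokens_per_s_ceiling(bandwidth_gbs: float, size_gb: float) -> float:
    # Assumes all weights are streamed from RAM once per generated token.
    return bandwidth_gbs / size_gb

if __name__ == "__main__":
    bpw = q5_1_bits_per_weight()
    bw = dual_channel_bandwidth_gbs(4800)  # assumed DDR5-4800
    for name, n_params in [("13B", 13e9), ("30B", 30e9)]:  # nominal sizes
        size = model_size_gb(n_params, bpw)
        print(f"{name}: ~{size:.1f} GB, ceiling ~{tokens_per_s_ceiling(bw, size):.1f} tok/s")
```

Real throughput will land below this ceiling, since sustained bandwidth is lower than the theoretical peak and compute is not free.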