Hacker News
by esquire_900 1112 days ago
Some people use second-hand P40 GPUs, which go for around $200-300. Combine 3 of them with SLI and you've got 72GB of VRAM for less than $1000.
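A quick sanity check of that arithmetic, as a minimal sketch. The 24 GB per-card figure is the P40's published memory size; the price range is the one quoted above, and the function name `build_totals` is made up for illustration. Note that this only covers the cards themselves, not the rest of the build.

```python
# Back-of-envelope check of the figures quoted in the comment above.
P40_VRAM_GB = 24               # a Tesla P40 carries 24 GB of memory
PRICE_RANGE_USD = (200, 300)   # second-hand price range quoted in the comment
NUM_CARDS = 3


def build_totals(num_cards, vram_gb, price_range):
    """Return total VRAM and the low/high cost for a stack of identical cards."""
    low, high = price_range
    return {
        "vram_gb": num_cards * vram_gb,
        "cost_low_usd": num_cards * low,
        "cost_high_usd": num_cards * high,
    }


totals = build_totals(NUM_CARDS, P40_VRAM_GB, PRICE_RANGE_USD)
print(totals)
```

So the GPUs alone come to 72 GB for $600-900 at those prices.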
2 comments

I do use a P40 for my machine learning box, but I'm curious how you put three in the same system, given that each one needs a CPU power plug and a PCIe slot. Then, to cool them, you need to attach your own cooling system, which in turn needs its own power plugs to be available. What kind of chassis, motherboard, and power supply do you use for that? It will certainly cost more than $1000 anyway, especially since you also need a decent amount of RAM to preload the models before you move them to the GPUs.
http://nonint.com/ has some interesting posts about how he built a custom server to house 8 GPUs (3090s in this case). You're right that it will set you back more than $1000, though I was only referring to the GPUs themselves.
Woah, that's a cool direction. Thank you! I'll explore this.
P40s are kind of a meme. Running GGML models on the CPU gives roughly the same performance at a fraction of the wattage on a dual-channel DDR5 system.

I still use GPTQ for 30B, but even a CPU generates quickly enough at q5_1 on modern hardware.
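For a rough feel of what "quickly enough" can mean on CPU, here is a back-of-envelope sketch. Everything in it is an assumption rather than a measurement: a nominal 30e9 parameters, DDR5-4800 in dual channel, the q5_1 block layout as I understand it from ggml (32 weights stored as 16 bytes of low nibbles, 4 bytes of high bits, plus an fp16 scale and an fp16 minimum), and the common rule of thumb that generating one token streams the whole set of weights from memory once. It gives a theoretical ceiling, not a benchmark.

```python
# Rough upper bound on CPU token rate from memory bandwidth alone.
# All inputs are assumptions for illustration, not measured values.

def q5_1_bits_per_weight():
    """Bits per weight for a q5_1 block (assumed layout: 32 weights in 24 bytes)."""
    block_weights = 32
    block_bytes = 16 + 4 + 2 + 2  # low nibbles + high bits + fp16 scale + fp16 min
    return block_bytes * 8 / block_weights


def model_size_gb(n_params, bits_per_weight):
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def memory_bandwidth_gbs(mega_transfers_per_s, channels=2, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s."""
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


def max_tokens_per_s(bandwidth_gbs, size_gb):
    """Ceiling on tokens/s if every token reads all weights once."""
    return bandwidth_gbs / size_gb


bpw = q5_1_bits_per_weight()
size = model_size_gb(30e9, bpw)          # nominal "30B" model
bandwidth = memory_bandwidth_gbs(4800)   # assumed DDR5-4800, dual channel
ceiling = max_tokens_per_s(bandwidth, size)

print(f"{bpw:.1f} bits/weight, {size:.1f} GB, {bandwidth:.1f} GB/s, "
      f"at most {ceiling:.1f} tokens/s")
```

Under those assumptions the ceiling is a few tokens per second for 30B; real throughput will be lower, and smaller models scale up roughly in proportion to their size.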