by VladVladikoff 67 days ago
That's a pretty popular budget-friendly GPU for local AI; it actually seems like an excellent choice, IMHO.

Depends on your definition of budget-friendly, I suppose. I was looking around the other day and the cheapest working 24GB RTX 3090 on eBay was $1800 CAD after exchange rate, shipping and all the rest.

Hugely inflated from the $700 they were once going for. Maybe there are still deals around.

The actually budget-friendly option is the RTX 3060 12GB.

With one you can run 9B/12B models, which are fine for text tasks like chatting or summarisation, but not for precision work like tool calling or code.

With two of them you can run models up to Qwen 27B and 35B with a few-turn context window (8k-16k). The dense model runs at 14 t/s and the MoE at 68 t/s.

With three of them you can run 128k context, though you'll need a large-format case and the right motherboard or a PCIe riser.

I'm running three and even with a new case this setup cost me less than one 3090.
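
For anyone who wants to sanity-check the sizing above, here's a rough back-of-the-envelope sketch. It counts weight memory only. The 4-bit quantization (0.5 bytes per parameter) and the helper names are my own assumptions, not something stated above, and KV cache plus runtime overhead are ignored, so real usage will be higher, especially at long context.

```python
import math

def weight_gb(params_billion: float, bits_per_weight: float = 4.0) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    # billions of params * bits per weight / 8 bits per byte = GB
    return params_billion * bits_per_weight / 8

def cards_needed(params_billion: float, card_gb: float = 12.0,
                 bits_per_weight: float = 4.0) -> int:
    """Minimum number of cards whose combined VRAM holds the weights alone."""
    return math.ceil(weight_gb(params_billion, bits_per_weight) / card_gb)

if __name__ == "__main__":
    # Model sizes mentioned in the thread; 4-bit weights are an assumption.
    for size in (9, 12, 27, 35):
        print(f"{size}B: ~{weight_gb(size):.1f} GB weights, "
              f"at least {cards_needed(size)} x 12GB card(s)")
```

Under those assumptions a 12B model fits on one card and 27B/35B need two, which lines up with the numbers above. The third card is mostly headroom for context.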

This seems quite unlikely. What motherboard are you getting three x16 GPUs on? That alone, with the associated server processor, would be more than a used 3090, before even buying the three 3060s. Give the full BOM and costs.
I already had the PC. I just mean the extra purchase of the graphics cards.

The motherboard is an MSI Pro Z690-A.

The slots are physically x16. Electrically they are x16, x4 and x1, which doesn't harm anything at all.
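
To put rough numbers on the lane widths, here's a small sketch of theoretical PCIe link bandwidth. The per-lane transfer rates and 128b/130b encoding are from the PCIe spec; treating every slot as PCIe 3.0 and using an 8 GB shard of weights per card are assumptions for illustration only, since the actual generation of each slot isn't given here.

```python
# Transfer rate in GT/s per lane for PCIe generations that use 128b/130b encoding.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

def link_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical one-way bandwidth in GB/s, before protocol overhead."""
    # GT/s * encoding efficiency / 8 bits per byte * number of lanes
    return GT_PER_S[gen] * (128 / 130) / 8 * lanes

def load_seconds(size_gb: float, gen: int, lanes: int) -> float:
    """Best-case time to copy size_gb of weights to a card over the link."""
    return size_gb / link_gb_per_s(gen, lanes)

if __name__ == "__main__":
    shard_gb = 8.0  # hypothetical share of the weights held by one card
    for lanes in (16, 4, 1):
        bw = link_gb_per_s(3, lanes)  # PCIe 3.0 assumed for every slot
        print(f"x{lanes}: {bw:.2f} GB/s, {load_seconds(shard_gb, 3, lanes):.1f} s to load")
```

As far as I understand it, with a layer-wise split the weights cross the bus once at load time and only small activations move between cards per token, so a narrow link mostly costs startup time rather than generation speed.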

That's insane. I bought two in December for ARS 1.2M (a little less than USD 1000). Maybe OpenClaw raised the demand.
Wild. I paid $1000 CAD for mine 2 years ago; I guess things have changed.
Because they are hugely more useful now than they were for running some stupid game at 240 fps instead of 60 fps.
They're not a particularly fast card compared to something like a 5070; they just have lots of VRAM.

That's why they were cheap before.

Also, "some stupid game"? Who woke up and made you king of hobbies?

The only thing that compares to this is probably a Mac mini running MLX models.
Radeon 9700 Pro or Intel Arc B70 (both $1000-1400, 32GB, 650GB/s bandwidth), or Ryzen AI Max 390 (more VRAM, less bandwidth).
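
Since bandwidth is the number that matters most for generation speed, here's a sketch of the usual rule-of-thumb ceiling: each generated token has to read the active weights once, so tokens/s can't exceed bandwidth divided by active weight size. The weight sizes below are hypothetical (4-bit quantization assumed), and real throughput lands below this because of KV cache reads, compute and overhead.

```python
def max_tokens_per_s(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Upper bound on decode speed when generation is memory-bandwidth bound."""
    return bandwidth_gb_s / active_weights_gb

if __name__ == "__main__":
    bandwidth = 650.0   # GB/s, the figure quoted above
    dense_gb = 13.5     # hypothetical: 27B dense model at 4 bits per weight
    moe_active_gb = 1.5 # hypothetical: MoE with 3B active params at 4 bits
    print(f"dense ceiling: {max_tokens_per_s(bandwidth, dense_gb):.0f} t/s")
    print(f"MoE ceiling:   {max_tokens_per_s(bandwidth, moe_active_gb):.0f} t/s")
```

The same reasoning is why an MoE model generates much faster than a dense one of similar total size: only the active experts get read per token.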

The local inference space is pretty good nowadays.