Hacker News
by behohippy 310 days ago
Used 3090s have been getting expensive in some markets. Another option is dual 5060 Ti 16 GB cards. Mine are lower-powered models with a single 8-pin power connector, so they max out around 180W. With that I'm getting 80 t/s on the new Qwen 3 30B A3B models, and around 21 t/s on Gemma 27B with vision. Cheap and cheerful setup if you can find the cards at MSRP.

For comparison, at work we got a pair of Nvidia L4 GPUs: https://www.techpowerup.com/gpu-specs/l4.c4091

That gives us a total TDP of around 150W and 48 GB of VRAM, and we can run Qwen 3 Coder 30B A3B at 4-bit quantization with up to 32k context at around 60-70 t/s with Ollama. I also tried out vLLM, but surprisingly the performance wasn't much better (maybe it would be under bigger concurrent load). Felt like sharing the data point, since the setups are similar.
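For a rough sense of why a 30B model at 4-bit fits comfortably in 48 GB, here is a back-of-the-envelope sketch. The function name is made up for illustration, and it counts the weights only: real quantization formats carry some extra overhead, and the KV cache for a 32k context comes on top, so treat the result as a lower bound rather than a measurement.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    Hypothetical helper: ignores quantization overhead, KV cache and activations.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Figures taken from the comment above: 30B parameters, 4-bit, 48 GB of VRAM.
weights_gb = weight_memory_gb(30, 4)
headroom_gb = 48 - weights_gb

print(f"weights: ~{weights_gb:.0f} GB, left for context and overhead: ~{headroom_gb:.0f} GB")
```

That headroom is what makes the long context practical on this setup, though the actual usage will depend on the runtime and the exact quantization used.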

Honestly it's a really good model, even good enough for some basic agentic use (e.g. with Aider, RooCode and so on). MoE seems the way to go for somewhat limited hardware setups.

Obviously I'm not recommending L4 cards, because they have a pretty steep price tag. Most consumer cards feel a bit power hungry and you'll probably need more than one to fit decent models in there, though being able to game on the same hardware sounds pretty nice. But speaking of getting more VRAM, the Intel Arc Pro B60 can't come soon enough (if they don't insanely overprice it), especially the 48 GB variety: https://www.maxsun.com/products/intel-arc-pro-b60-dual-48g-t...

Yeah, 48 GB at sub-200W seems like a sweet spot for a single-card setup. Then you can stack as many cards as you need for the model size you want, limited only by what you're willing to pay on the power bill.
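To put a number on the power bill side of that trade-off, a minimal sketch follows. The function name, the $0.15/kWh price and the 24 hours/day duty cycle are all assumptions for illustration, and it treats the cards as running at their full rated power the whole time, which overstates real-world draw.

```python
def monthly_power_cost(watts_per_card: float, cards: int,
                       hours_per_day: float, price_per_kwh: float,
                       days: int = 30) -> float:
    """Estimated electricity cost for a stack of cards over `days` days.

    Hypothetical helper: assumes every card draws `watts_per_card` constantly.
    """
    kwh = watts_per_card * cards / 1000 * hours_per_day * days
    return kwh * price_per_kwh


# Assumed inputs: 200W cards, always on, $0.15 per kWh.
for cards in (1, 2, 4):
    cost = monthly_power_cost(200, cards, hours_per_day=24, price_per_kwh=0.15)
    print(f"{cards} card(s): ~${cost:.2f}/month")
```

The cost scales linearly with the number of cards, so the per-card wattage matters more the deeper you stack.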

I've hatched a plan to build a lightweight AI model on a $149 mini PC and host it from my bedroom.

I wonder if I could follow that up by buying a 3090 (jumping the price by $1000 plus whatever I plug it into) and comparing the two. Could be an eye-opening experiment for me.

Here's the write-up of my plan for the cheap machine if anyone is interested.

https://joeldare.com/my_plan_to_build_an_ai_chat_bot_in_my_b...