Hacker News new | ask | show | jobs
by wing-_-nuts 494 days ago
Honestly, the smart play in that case is to buy 2 3090's and connect them with nvlink. Or...and hear me out, at this point you could probably just invest your workstation build budget and use the dividends to pay for runpod instances when you actually want to spin up and do things.

I'm sure there are some use cases for 32gb of vram but most of the cutting edge models that people are using day to day on local hardware fit in 12 or even 8gb of vram. It's been a while since I've seen anything bigger than 24gb but less than 70gb.

2 comments

> most of the cutting edge models that people are using day to day on local hardware fit in 12 or even 8gb of vram.

I'm not sure what your idea of "day to day" use cases are, but models that fit in 12GB of VRAM tend to be good for like autocomplete and not much more. I can't even get those models to chose the right tool at the right time, even less be moderately useful. Qwen2.5-32B seems to be the lower boundary of a useful local model, it'll at least use tools correctly. But then for "novel" (for me) coding, basically anything below O1 is more counter-productive than productive.

Yes I was gonna mention that Qwen model from the deepseek folks as maybe an exception
There is tremendous quality difference between 14b and 32b versions.