HN Mirror



	by wing-_-nuts 494 days ago
	Honestly, the smart play in that case is to buy two 3090s and connect them with NVLink. Or, and hear me out, at this point you could probably just invest your workstation build budget and use the dividends to pay for RunPod instances when you actually want to spin up and do things. I'm sure there are some use cases for 32GB of VRAM, but most of the cutting-edge models that people are using day to day on local hardware fit in 12 or even 8GB of VRAM. It's been a while since I've seen anything bigger than 24GB but less than 70GB.

2 comments

diggan 494 days ago

> most of the cutting-edge models that people are using day to day on local hardware fit in 12 or even 8GB of VRAM.

I'm not sure what your idea of "day to day" use cases is, but models that fit in 12GB of VRAM tend to be good for autocomplete and not much more. I can't even get those models to choose the right tool at the right time, let alone be moderately useful. Qwen2.5-32B seems to be the lower bound for a useful local model; it will at least use tools correctly. But for "novel" (for me) coding, basically anything below o1 is more counterproductive than productive.
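
For a rough sense of why a 32B model doesn't land in the 12GB bucket, here is a back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight) and then pads the result with a flat overhead factor for KV cache and runtime buffers. The function names and the 20% `overhead_fraction` default are assumptions made up for illustration, not measured values; real usage depends on context length, runtime, and quantization format.

```python
def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, vram_gb: float,
         overhead_fraction: float = 0.2) -> bool:
    """Rough check: do the weights plus an assumed flat overhead fit in vram_gb?

    overhead_fraction is a hypothetical fudge factor for KV cache and buffers.
    """
    needed = weight_vram_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gb


if __name__ == "__main__":
    # Compare a 14B and a 32B model at 4-bit quantization against common card sizes.
    for size in (14, 32):
        for card in (8, 12, 24, 32):
            print(f"{size}B @ 4-bit on {card}GB card: fits={fits(size, 4, card)}")
```

Under these assumptions a 4-bit 32B model needs about 16GB for the weights alone, so it misses a 12GB card but fits on a 24GB one, while a 4-bit 14B model squeezes into 12GB.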

wing-_-nuts 494 days ago

Yes, I was going to mention that Qwen model from the DeepSeek folks as maybe an exception.

pshirshov 492 days ago

There is a tremendous quality difference between the 14B and 32B versions.