| HN Mirror

	by segmondy 89 days ago
	It's for fools. I bought 160GB of VRAM for $1000 last year. 96GB of P40 VRAM can be had for under $1000, and it will run gpt-oss-120b Q8 at probably 30 tokens/sec.


timschmidt 89 days ago

The P40 is a Tesla-branded card built on the Pascal architecture, which is no longer receiving driver or CUDA updates, and it is only available as used hardware. That's fine for hobbyists, startups, and home labs, but there is likely a growing market of businesses too large to depend on used gear from eBay yet too small for a full rack solution from Nvidia. Seems like that's who they're targeting.

segmondy 89 days ago

99% of interest is in inference. If you want to fine-tune a model, just rent the best GPU in the cloud. It's often cheaper and faster.

timschmidt 89 days ago

Great option if you don't mind sharing your data with the cloud. Some businesses want to own the hardware their data resides on.

cootsnuck 89 days ago

How many businesses have the capabilities and expertise to train their own models?

timschmidt 89 days ago

No idea. Probably more every day.

segmondy 89 days ago

How is renting a GPU sharing data with the cloud? You can rent a GPU from GCP or AWS.

timschmidt 89 days ago

I suppose if I rent a cloud GPU and just let it sit there dark and do nothing then I wouldn't have to move any data to it. Otherwise, I'm uploading some kind of work for it to do. And that usually involves some data to operate on. Even if it's just prompts.

segmondy 89 days ago

So you also believe when you rent a server you are sharing your data with the cloud? AWS and GCP are copying all private data on servers? Give me a break. There's a big difference between renting a server and using an API.
