Hacker News
by logicallee 1247 days ago
> if you have GPUs with > 330GB VRAM, it'll run fast

What kind of GPUs with that much VRAM are available to consumers, and roughly how much would such a kit cost?

2 comments

He means multiple GPUs in parallel with a combined VRAM of that size. So around 4 x NVIDIA A100 80GB, which you can get for around $8.4 / hour in the cloud, or 7 x NVIDIA A6000 or A40 48GB for $5.5 / hour.

So it's not exactly cheap or easy yet for the everyday user, but I believe the models will become smaller and more affordable to run. These are just the "first" big research models, focused on demonstrating some usefulness; after that, the focus can shift to size and speed optimizations. There are multiple methods and a lot of research into making them smaller: distilling them, converting to lower precision, pruning the less useful weights, sparsifying. Some achieve around a 40% size reduction and a 60% speed improvement with minimal accuracy loss; others achieve 90% sparsity. So there is hope of running them, or similar models, on a single but powerful computer.
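
For anyone who wants to redo the card-count arithmetic, here is a rough Python sketch. The 330GB target and the 80GB / 48GB card sizes come from the thread; the function name is made up for illustration. It also assumes that a 40% size reduction, or 90% sparsity, shrinks the VRAM footprint by the same fraction, which ignores activation memory and sparse-storage overhead. Note that strictly covering 330GB takes five 80GB cards, since four give 320GB, which is presumably why the estimate above says "around".

```python
import math

TARGET_GB = 330  # VRAM figure quoted at the top of the thread


def cards_needed(total_vram_gb: float, card_vram_gb: float) -> int:
    """Smallest number of cards whose combined VRAM covers the target."""
    return math.ceil(total_vram_gb / card_vram_gb)


# Full-size model on the two card sizes mentioned above.
a100_full = cards_needed(TARGET_GB, 80)   # four cards only reach 320GB
a6000_full = cards_needed(TARGET_GB, 48)

# Assumption: footprint scales linearly with the quoted reductions.
reduced_gb = TARGET_GB * 60 / 100   # after a 40% size reduction
sparse_gb = TARGET_GB * 10 / 100    # if 90% sparsity removed 90% of the footprint

a100_reduced = cards_needed(reduced_gb, 80)
a100_sparse = cards_needed(sparse_gb, 80)

print(a100_full, a6000_full, a100_reduced, a100_sparse)
```

Under those assumptions the 90% sparsity case is the one that would fit on a single card.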

You'd basically need a rack-mount server full of NVIDIA H100 cards (80GB VRAM; they cost about $40,000 USD each). So... good luck with that? On the relatively cheap end, used NVIDIA Tesla cards are kinda cheap, with 24GB ones from architectures a few years old going for ~$200. That's still nearly $3000 worth of cards, not counting the rest of the computer. This isn't really something you can run at home without having a whole "operation" going on.
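
The purchase numbers above can be checked with a quick Python sketch. Prices and card sizes are the ones quoted in this comment, the 330GB target is from the top of the thread, and the helper name is hypothetical. It only counts the cards, not the host machine, power, or cooling.

```python
import math

TARGET_GB = 330  # VRAM figure quoted at the top of the thread


def purchase_cost(card_vram_gb: int, price_usd: int) -> tuple[int, int]:
    """Return (card count, total card cost in USD) to cover TARGET_GB."""
    count = math.ceil(TARGET_GB / card_vram_gb)
    return count, count * price_usd


h100 = purchase_cost(80, 40_000)    # new H100 80GB at about $40,000 each
used_tesla = purchase_cost(24, 200)  # used 24GB Tesla cards at about $200 each

print(h100)        # (5, 200000)
print(used_tesla)  # (14, 2800)
```

Fourteen used 24GB cards at about $200 each is where the "nearly $3000" figure comes from.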
got it, thanks.