


	by aurareturn 475 days ago
	Around five Nvidia A100 80GB cards can fit a 671B model at Q4. That's about $50k just for the GPUs, and likely much more once you include cooling, power, motherboard, CPU, system RAM, etc.
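
A back-of-envelope sketch of the sizing above. The assumptions are mine, not the commenter's: Q4 is treated as exactly 4 bits per weight (real Q4 formats carry some extra bits for scales), and a hypothetical 10% headroom is added for KV cache and activations.

```python
import math

def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

def gpus_needed(params_billion: float, bits_per_weight: float,
                gpu_gb: float = 80.0, headroom: float = 0.10) -> int:
    """Number of GPUs whose combined memory holds weights plus headroom.

    `headroom` is an assumed fraction for KV cache and activations,
    not a measured value.
    """
    total_gb = weight_gb(params_billion, bits_per_weight) * (1 + headroom)
    return math.ceil(total_gb / gpu_gb)

if __name__ == "__main__":
    print(weight_gb(671, 4.0))    # weights only, in GB
    print(gpus_needed(671, 4.0))  # 80 GB cards required under these assumptions
```

Under these assumptions the weights come to 335.5 GB, which lands on five 80 GB cards; a heavier quantization format or a long context would push the count up.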

2 comments

sgt 475 days ago

So the M3 Ultra is amazing value then. And from what I could tell, an equivalent AMD Epyc would still be so constrained that we're talking 4-5 tokens/s. Is this a fair assumption?


adgjlsfhk1 475 days ago

No. The advantage of Epyc is that you get 12 channels of RAM, so it should be ~6x faster than a consumer CPU.
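
The ~6x figure follows from channel count alone if both systems run the same memory speed. A rough sketch, assuming DDR5-4800, a 64-bit (8-byte) channel, and a dual-channel consumer platform; these are illustrative assumptions and give theoretical peaks, not measured bandwidth.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Each channel moves `bytes_per_transfer` bytes per transfer
    (8 bytes for a 64-bit DDR channel).
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000

if __name__ == "__main__":
    epyc = peak_bandwidth_gbs(channels=12, mega_transfers_per_s=4800)
    consumer = peak_bandwidth_gbs(channels=2, mega_transfers_per_s=4800)
    print(f"12-channel: {epyc:.1f} GB/s")
    print(f" 2-channel: {consumer:.1f} GB/s")
    print(f"ratio: {epyc / consumer:.1f}x")
```

Consumer boards often run faster DIMMs than servers do, so the real gap can be smaller than the channel ratio suggests.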


sgt 475 days ago

I realize that, but apparently people are still getting very low tokens/sec on Epyc. Why is that? I don't get it; on paper it should be fast.


Aeolun 475 days ago

The Epyc would only set you back $2000 though, so it’s only a slightly worse price/return.


SkiFire13 475 days ago

How many tokens/s would that be though?


sgt 475 days ago

That's what I'm trying to get to. I'm looking to set up a rig, and AMD Epyc seems reasonable, but I'd rather go Mac if it gives many more tokens per second. It sounds like the Mac with the M3 Ultra will easily give 40 tokens/s, whereas the Epyc is internally constrained to 4-5 tokens/s. I'd like someone to confirm that before I buy the hardware and find out myself. :)
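
For what it's worth, a memory-bound ceiling can be estimated before buying anything: each generated token has to read the active weights once, so tokens/s cannot exceed bandwidth divided by bytes read per token. The sketch below rests on assumptions not stated in the thread: that the 671B model is a mixture-of-experts with about 37B parameters active per token (as in DeepSeek-V3/R1), that Q4 means 4 bits per weight, that the M3 Ultra's published 819 GB/s and a theoretical 460.8 GB/s for 12-channel DDR5-4800 are the right bandwidth figures, and that nothing else (compute, NUMA, cache misses) gets in the way.

```python
def max_tokens_per_s(bandwidth_gbs: float, active_params_billion: float,
                     bits_per_weight: float = 4.0) -> float:
    """Upper bound on decode speed when memory bandwidth is the only limit.

    Assumes every active weight is read exactly once per generated token.
    """
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gbs / gb_read_per_token

if __name__ == "__main__":
    # Bandwidth figures are spec-sheet or theoretical peaks (assumptions).
    systems = {"M3 Ultra": 819.0, "Epyc, 12ch DDR5-4800": 460.8}
    for name, bandwidth in systems.items():
        moe = max_tokens_per_s(bandwidth, active_params_billion=37)
        dense = max_tokens_per_s(bandwidth, active_params_billion=671)
        print(f"{name}: <= {moe:.1f} tok/s (37B active), "
              f"<= {dense:.1f} tok/s (all 671B read)")
```

This is only a ceiling. Measured throughput will be lower, and the model says nothing about why any particular Epyc build falls further short of it than a Mac does.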


aurareturn 474 days ago

Probably a lot more. Those are server-grade GPUs. We're talking prosumer-grade Macs.

I don't know how to calculate tokens/s for H100s linked together. ChatGPT might help you though. :)


SkiFire13 473 days ago

Well, ChatGPT quotes 25k-75k tokens/s with 5 H100s (so very, very far from the 40 tokens/s), but I doubt this is accurate (e.g. it completely ignored the fact that they are linked together and instead just multiplied the tokens/s estimate for one H100 by 5).

If this is remotely accurate though it's still at least an order of magnitude more convenient than the M3 Ultra, even after factoring in all the other costs associated with the infrastructure.
