


	by hackerlight 836 days ago
	What's the cost per inference relative to H100? Isn't that the number to care about?


hobofan 835 days ago

Based on some rough, conservative ballpark estimates (one server with 2 A100s at $50,000; 50 tokens/s on one of those servers; so 10 of those servers), the upfront cost with consumer hardware seems to be 1/10 to 1/20 of what the Groq hardware costs. I would guess that realistically cloud providers can probably achieve half to 1/3 of that price.

So unless you need the low latency of Groq, consumer hardware seems to be a lot cheaper for the same throughput.
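
For anyone who wants to redo the arithmetic, here is a rough sketch in Python. The server price, per-server throughput, and server count are the ballpark figures from the estimate above; the function names are made up for illustration, and the Groq range is only what the 1/10 to 1/20 ratio implies, not a quoted price.

```python
def gpu_cluster_estimate(server_cost_usd, tokens_per_s_per_server, num_servers):
    """Return (total upfront cost in USD, aggregate tokens/s) for a GPU cluster."""
    total_cost = server_cost_usd * num_servers
    total_throughput = tokens_per_s_per_server * num_servers
    return total_cost, total_throughput


def implied_cost_range(gpu_cost_usd, ratio_low=10, ratio_high=20):
    """Cost range implied if the GPU setup is 1/ratio_low to 1/ratio_high of it."""
    return gpu_cost_usd * ratio_low, gpu_cost_usd * ratio_high


# Ballpark inputs from the comment: $50,000 per 2x A100 server,
# 50 tokens/s per server, 10 servers.
cost, throughput = gpu_cluster_estimate(50_000, 50, 10)
low, high = implied_cost_range(cost)

print(f"GPU cluster: ${cost:,} upfront for {throughput} tokens/s aggregate")
print(f"Implied Groq hardware cost: ${low:,} to ${high:,}")
```

Note that this compares upfront hardware cost at equal aggregate throughput only; it ignores power, hosting, and utilization, and it says nothing about per-request latency.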


542458 836 days ago

If you believe the marketing material, it’s lower. Their API is the cheapest around, so either it’s true or they’re subsidizing.


hackerlight 836 days ago

Another consideration: even if it's slightly more expensive, that can be OK if you care about inference speed. I'd pay 50% more for GPT-4 if it could deliver results that quickly.
