HN Mirror

by usatie 431 days ago

Thanks for the links — I went through all of them (took me a while). The point about rack density differences between SRAM-based systems like Cerebras or Groq and GPU clusters is now clear to me.

What I’m still trying to understand is the economics.

From this benchmark: https://artificialanalysis.ai/models/llama-4-scout/providers...

Groq seems to offer near-lowest prices per million tokens and near-fastest end-to-end response times. That’s surprising because, as I understand it, speed (latency) and cost trade off against each other.

So I’m wondering: why can’t GPU-based providers offer cheaper but slower (higher-latency) APIs? Or do you think Groq/Cerebras are pricing well below cost (loss-leader style)?
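
To make the trade-off I have in mind concrete, here is a toy model. Everything in it is an assumption for illustration: the $2/hour accelerator price, the 100 tokens/s single-stream speed, and the 1% slowdown per extra concurrent stream are made-up numbers, not measurements from Groq, Cerebras, or any GPU provider. The function names are hypothetical too.

```python
def per_stream_speed(batch_size, single_stream_speed=100.0,
                     slowdown_per_extra_stream=0.01):
    """Tokens/s seen by one user (hypothetical model).

    Assumes every extra concurrent stream slows all streams a little.
    """
    return single_stream_speed / (1 + slowdown_per_extra_stream * (batch_size - 1))


def cost_per_million_tokens(hourly_cost_usd, batch_size, tokens_per_sec_per_stream):
    """USD to generate one million output tokens on one accelerator."""
    total_tokens_per_hour = batch_size * tokens_per_sec_per_stream * 3600
    return hourly_cost_usd / total_tokens_per_hour * 1_000_000


# Larger batches: each user waits longer, but the hourly cost is
# spread over more tokens, so the price per token falls.
for batch in (1, 8, 64):
    speed = per_stream_speed(batch)
    cost = cost_per_million_tokens(2.0, batch, speed)
    print(f"batch={batch:3d}  speed={speed:6.1f} tok/s/user  cost=${cost:.2f}/M tokens")
```

In this sketch, raising the batch size makes each user's stream slower but the per-token cost lower, which is the trade-off I would expect a provider to be able to price along.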

1 comment

latchkey 430 days ago

Loss leader. It is the Uber/Airbnb playbook. Book revenue regardless of the economics, then debt-finance against that. Hope one day to lock in customers, raise prices, or sell the company.