
	by maz1b 138 days ago
	AFAIK, they don't have any deals or partnerships with Groq, Cerebras, or any similar companies, so how did they do this?

3 comments

tcdent 138 days ago

Inference is run on shared hardware already, so they're not giving you the full bandwidth of the system by default. This most likely just allocates more resources to your request.

hendersoon 138 days ago

Could well be running on Google TPUs.

rvz 137 days ago

The models are running on Google TPUs.
