Hacker News
by
maz1b
138 days ago
AFAIK, they don't have any deals or partnerships with Groq, Cerebras, or any of those kinds of companies, so how did they do this?
3 comments
tcdent
138 days ago
Inference is run on shared hardware already, so they're not giving you the full bandwidth of the system by default. This most likely just allocates more resources to your request.
hendersoon
138 days ago
Could well be running on Google TPUs.
rvz
137 days ago
The models are running on Google TPUs.