| HN Mirror



	by tony_cannistra 835 days ago
	Yeah this makes sense. I do wonder though how it changes the dynamics around provisioned capacity, if at all.

1 comments

mike_hearn 835 days ago

It reduces the need. If they can get non-latency-sensitive users onto this API, then they only need to provision for their peak interactive query load (ChatGPT) rather than peak API load, which can be arbitrarily high (as fast as the program generating the load can run). The lower pricing should move users across quite fast, and the higher efficiency will free up hardware and slow the rate at which they need to grow it.
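
To put rough numbers on that argument, here is a minimal sketch. The hourly load figures are made up for illustration (not real data), and it assumes, simplistically, that deferred API work can be scheduled into any hour of the window. The function names are hypothetical too.

```python
# Hypothetical load per hour, in requests/sec. Illustrative numbers only.
interactive = [40, 30, 20, 20, 60, 100, 120, 90]   # latency-sensitive (chat) traffic
api = [10, 200, 5, 0, 150, 30, 10, 300]            # bursty programmatic traffic


def peak_without_batch(interactive, api):
    """Capacity needed when every request must be served as it arrives."""
    return max(i + a for i, a in zip(interactive, api))


def peak_with_batch(interactive, api):
    """Capacity needed if all API work can be deferred within the window.

    Simplifying assumption: deferred work can fill any idle hour, so the
    fleet must cover the interactive peak and the average total load.
    """
    total_work = sum(interactive) + sum(api)
    average_load = total_work / len(interactive)
    return max(max(interactive), average_load)


if __name__ == "__main__":
    print("serve everything live:", peak_without_batch(interactive, api))
    print("defer API work:       ", peak_with_batch(interactive, api))
```

With these made-up numbers the live-serving case is sized by the worst combined hour, while the deferred case is sized by whichever is larger: the interactive peak or the average total load.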


tony_cannistra 835 days ago

That's the way it seems to me as well. Curious too about the business implications. My guess is that they wanted to bite the bullet and commit to provisioned capacity but wanted to do so in a way that didn't require massive overprovisioning for API requests.


mike_hearn 835 days ago

They're well beyond that point now I guess. MS has been building whole datacenters just for OpenAI.
