| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by mike_hearn 793 days ago
	It reduces the need. If they can get non-latency sensitive users onto this API then they only need to be provisioned to support their max interactive query load (ChatGPT) rather than peak API load, which can be arbitrary high (however fast the program generating the load can run). The lower pricing should move users across quite fast, and the higher efficiency will free up hardware and reduce the rate at which they need to grow it.

1 comments

tony_cannistra 792 days ago

That's the way it seems to me as well. Curious too about the business implications. My guess is that they wanted to bite the bullet and commit to provisioned capacity but wanted to do so in a way that didn't require massive overprovisioning for API requests.

link

mike_hearn 792 days ago

They're well beyond that point now I guess. MS has been building whole datacenters just for OpenAI.

link