| HN Mirror

	by bt1a 173 days ago
	This is most likely an inference-serving problem, a matter of capacity and latency, given that Opus X has always responded quickly in the API while the latest GPT models available there have always responded slowly.