HN Mirror



	by itake 812 days ago
	Our p99 for gpt4 is 3s. Images take up to 50s.

1 comment

So how would you go about improving that?

Not using an LLM for it.

We only send 0.5-5% of traffic to gpt4, thanks to smaller, faster, cheaper models, so not all of our traffic is hit with 50s latencies :-/
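
For anyone wondering what that kind of tiered routing might look like, here is a rough sketch. It is not the poster's actual setup: the confidence-threshold rule, the `Router` class, and the `fake_small` / `fake_large` stand-ins are all assumptions made up for illustration. The idea is just that a cheap model answers first and only low-confidence requests get escalated to the slow, expensive one.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Router:
    """Hypothetical two-tier router: try the small model, escalate if unsure."""
    small_model: Callable[[str], Tuple[str, float]]  # returns (answer, confidence in [0, 1])
    large_model: Callable[[str], str]                # slow, expensive fallback
    threshold: float = 0.8                           # assumed cutoff, tune per workload
    total: int = 0
    escalated: int = 0

    def handle(self, request: str) -> str:
        self.total += 1
        answer, confidence = self.small_model(request)
        if confidence >= self.threshold:
            return answer            # fast path: the large model is never called
        self.escalated += 1
        return self.large_model(request)

    def escalation_rate(self) -> float:
        """Fraction of requests that fell through to the large model."""
        return self.escalated / self.total if self.total else 0.0


# Stand-in models so the sketch runs without any API calls (assumptions).
def fake_small(request: str) -> Tuple[str, float]:
    # Pretend short requests are easy and long ones are hard.
    confidence = 0.95 if len(request) < 20 else 0.3
    return "small:" + request, confidence


def fake_large(request: str) -> str:
    return "large:" + request


router = Router(small_model=fake_small, large_model=fake_large)
easy = router.handle("hi")
hard = router.handle("x" * 30)
```

Tracking `escalation_rate()` is what would let you check a figure like the 0.5-5% above against real traffic; only the escalated requests pay the large-model latency.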