Hacker News
by
itake
812 days ago
Our p99 latency for GPT-4 is 3s. Images take up to 50s.
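For readers unfamiliar with the term, p99 is the latency that 99% of requests finish at or under. A minimal sketch of one common way to compute it (the nearest-rank method) follows. The `percentile` helper and the sample latencies are made up for illustration and are not the poster's data or tooling.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based rank of the sample that covers pct% of the data.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in seconds: 98 fast requests and 2 slow ones.
latencies = [0.8] * 98 + [3.0, 50.0]
p99 = percentile(latencies, 99)
worst = percentile(latencies, 100)
```

With this made-up sample the p99 sits far below the worst case, which is why a single slow outlier can coexist with a low p99.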
spxneo
812 days ago
So how would you go about improving that?
freediver
812 days ago
Not using an LLM for it.
itake
812 days ago
We only send 0.5-5% of traffic to GPT-4, thanks to smaller, faster, cheaper models, so not all of our traffic is hit with 50s latencies :-/
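The comment above does not say how requests are routed, so the following is only a sketch of one plausible approach: try a small model first and escalate to the large model when the small model's confidence is low. The `Router` class, the confidence score, the 0.9 threshold, and the stub models are all assumptions for illustration, not the poster's actual setup.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Router:
    # Hypothetical interfaces: the small model returns (answer, confidence in [0, 1]).
    small_model: Callable[[str], Tuple[str, float]]
    large_model: Callable[[str], str]
    threshold: float = 0.9  # assumed cutoff; would need tuning on real traffic
    total: int = 0
    escalated: int = 0

    def handle(self, prompt: str) -> str:
        """Answer with the small model unless its confidence falls below the threshold."""
        self.total += 1
        answer, confidence = self.small_model(prompt)
        if confidence >= self.threshold:
            return answer
        self.escalated += 1
        return self.large_model(prompt)

    def escalation_rate(self) -> float:
        """Fraction of requests that were sent to the large model."""
        return self.escalated / self.total if self.total else 0.0


def fake_small(prompt: str) -> Tuple[str, float]:
    # Stand-in model: confident unless the prompt is marked "hard".
    return ("small:" + prompt, 0.5 if "hard" in prompt else 0.99)


def fake_large(prompt: str) -> str:
    # Stand-in for the slow, expensive model.
    return "large:" + prompt


router = Router(fake_small, fake_large)
replies = [router.handle(p) for p in ["easy 1", "easy 2", "hard 3", "easy 4"]]
rate = router.escalation_rate()
```

Tracking the escalation rate makes it possible to check whether the share of traffic reaching the large model stays in a target band such as the 0.5-5% mentioned above.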