Hacker News new | ask | show | jobs
by com2kid 357 days ago
Does OpenAI on azure still have that insane latency for content filtering? Last time I checked it added a huge # to time to first token, making azure hosting for real time scenarios impractical.
1 comments

Yes.

Unless you convince MS to let you at the "Provisioned Throughput" model. Which also requires being big enough for sales to listen to you.