Hacker News new | ask | show | jobs
by N_A_T_E 618 days ago
I just need their API to be faster. 15-30 seconds per request using 4o-mini isn't good enough for responsive applications.
4 comments

You should try Azure: it comes with dedicated capacity which is typically a very expensive "call our sales team" feature with OpenAI
The new Realtime Websocket API appears to send back responses within less than a second. It might be just what you want.
yes and you can use it in text-text mode if you want. a key benefit is for turn-based usages (where you have running back and forth between user and assistant) you only need to send the incremental new input message for each generation. this is better than "prompt caching" on the chat completions API, which is basically a pricing optimization, as it's actually a technical advantage that uses less upstream bandwidth.
That is odd. Longest I’ve experienced in my use of it is a few seconds.
That doesn’t match my experience using it a lot at all