Hacker News new | ask | show | jobs
by simonw 628 days ago
The new Realtime Websocket API appears to send back responses within less than a second. It might be just what you want.
1 comments

yes and you can use it in text-text mode if you want. a key benefit is for turn-based usages (where you have running back and forth between user and assistant) you only need to send the incremental new input message for each generation. this is better than "prompt caching" on the chat completions API, which is basically a pricing optimization, as it's actually a technical advantage that uses less upstream bandwidth.