| The issue I have with SSE, and with what this article proposes (which is very similar), is the very long-lived connection. OpenAI uses SSE for callbacks. That works fine for chat and other "medium" duration interactions, but when it comes to fine-tuning (which can take a very long time), SSE always breaks and requires client side retries to get it to work. So why not instead use something like long polling + HTTP streaming (a slight tweak on SSE)? Here is the idea: 1) Make a standard GET call to /api/v1/events (using standard auth, etc.). 2) If anything is in the buffer / queue, return it immediately. 3) Stream any new events for up to 60s. Each event has a sequence id (similar to the article). Include keep-alive messages at 10s intervals if there are no messages. 4) After 60s, close the connection, gracefully ending the interaction on the client. 5) The client makes another GET request using the last received sequence id. What I like about this is that it is very simple to understand (like SSE - it basically is SSE), has low latency, is just a standard GET with standard auth, and works regardless of how load balancers, etc., are configured. Of course, there will be errors from time to time, but dealing with timeouts / errors will not be the norm. |
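For what it's worth, the client side of the five steps above could be sketched roughly like this in Python. Everything here is an assumption for illustration: `poll_events`, the `(sequence_id, payload)` event shape, and the convention that a `None` payload is a keep-alive are made-up names, and `fetch` stands in for one GET to /api/v1/events that streams for up to 60s and then closes.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

# Hypothetical event shape: (sequence_id, payload).
# A payload of None stands for a keep-alive message.
Event = Tuple[int, Optional[str]]


def poll_events(
    fetch: Callable[[int], Iterable[Event]],
    start_seq: int = 0,
    max_requests: int = 3,
) -> Iterator[Event]:
    """Yield events by issuing repeated bounded requests.

    Each fetch(last_seq) call models one GET that returns buffered events
    newer than last_seq, streams new ones for a while, then closes.
    """
    last_seq = start_seq
    for _ in range(max_requests):
        try:
            for seq, payload in fetch(last_seq):
                if payload is None:
                    continue  # keep-alive: nothing to deliver
                if seq <= last_seq:
                    continue  # already seen: skip duplicates
                last_seq = seq
                yield seq, payload
        except ConnectionError:
            # The odd dropped connection: just ask again from last_seq.
            continue
```

The point of the sketch is that the graceful close after 60s and an unexpected drop end up on the same code path: either way the next request resumes from the last sequence id the client actually received.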
> SSE always breaks and requires client side retries to get it to work
Yeah, but these are automatic (the browser handles it). SSE is really easy to get started with.
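To make the "automatic" part concrete: when a browser's `EventSource` reconnects, it sends the id of the last event it received in a `Last-Event-ID` request header, so the server can replay whatever was missed. A rough server-side sketch in Python follows; `sse_frame`, `replay_after`, and the in-memory `log` of `(id, data)` pairs are hypothetical names for illustration, not any particular framework's API.

```python
from typing import Iterable, Optional, Tuple


def sse_frame(event_id: int, data: str) -> str:
    """Format one event in the text/event-stream wire format."""
    lines = [f"id: {event_id}"]
    # Multi-line payloads need one "data:" field per line.
    lines += [f"data: {line}" for line in data.split("\n")]
    return "\n".join(lines) + "\n\n"  # a blank line ends the event


def replay_after(
    log: Iterable[Tuple[int, str]],
    last_event_id: Optional[str],
) -> str:
    """Return frames for every logged event newer than Last-Event-ID.

    last_event_id is the raw header value, or None on a first connection.
    """
    after = int(last_event_id) if last_event_id else 0
    return "".join(sse_frame(i, d) for i, d in log if i > after)
```

This only works if the server tags each event with an `id:` field and keeps enough history to replay from, which is the same requirement as the sequence id in the long polling variant.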