|
|
|
|
|
by mrtksn
828 days ago
|
|
Assistant API is too much of a beta still. I was about to release an app based on the new Assistant API but just a day before the release the response times increased to 8s flat. When I have function calls, that meant up to a minute to get a response. I had to dismantle everything Assistant API and implement it with Chat API. Which turned out to be great because in Assistant API the context management was very bad and after a few back and forth messages the cost ballooned to over 10K tokens per message. When I looked closely at the Assistant API and Chat API, I noticed that Assistant API is just a wrapper over Chat API and acts as a web service that stores the previous messages(so slow response problem was probably due to the web server which keeps track of the context). So I went ahead and implemented my own Assistant API which has more control. For example, I set max token cost per message and if the context balloons over that, I make a request with the context and ask OpenAI to create a summary with all the facts so far, add that summary as a system prompt and my context gets compressed back into reasonable territory. |
|