by solaxun
679 days ago
On the latency of the first request: how is the CFG cached? Is it done at the API key + schema level, meaning that for a given API key the latency penalty for a new schema is paid only once, regardless of how far apart the requests are? Or is it cached for a shorter duration, e.g. per session or per conversation thread?