by Gcam
888 days ago
Thanks for the feedback, and glad it is useful! Yes, agreed, that might be more representative of future use.
I think a view of variance would be a good idea. Currently it is only shown in the over-time views. Maybe a histogram of response times or a box-and-whisker plot.
We have a newsletter subscribe form on the website, or you can follow us on Twitter (https://twitter.com/ArtificialAnlys) if you want future updates.
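For anyone curious what those variance views would be built from, here is a rough sketch in Python. It is not how the site computes anything. The function names, the millisecond unit, and the fixed bin width are all assumptions made for illustration.

```python
import statistics
from collections import Counter


def box_summary(samples_ms):
    """Five-number summary that a box-and-whisker plot is drawn from."""
    # "inclusive" treats the samples as the whole population, so the
    # quartiles never fall outside the observed min/max.
    q1, median, q3 = statistics.quantiles(samples_ms, n=4, method="inclusive")
    return {
        "min": min(samples_ms),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(samples_ms),
    }


def histogram(samples_ms, bin_width_ms):
    """Count samples per fixed-width bin, keyed by each bin's lower edge."""
    counts = Counter(int(s // bin_width_ms) * bin_width_ms for s in samples_ms)
    return dict(sorted(counts.items()))
```

For example, `histogram([120, 130, 250, 480], 100)` groups the four samples into the bins starting at 100, 200, and 400 ms.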
|
It would be interesting to see request latency and throughput when API calls occur cold (first data point), and at rates of once per hour, once per minute, and once per second with the first N samples dropped.
Also, at least with Azure OpenAI, the AI safety features (filtering & annotations) make a significant difference in time to first token.
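A minimal sketch of the cold versus warm measurement suggested above, in Python. Everything here is hypothetical: `call` stands in for whatever function issues the real API request, `interval_s` would be 3600, 60, or 1 for the hourly, per-minute, and per-second cases, and the `clock` and `sleep` parameters exist only so the loop can be exercised without waiting. It records end-to-end latency only, not throughput or time to first token.

```python
import statistics
import time


def measure(call, samples, drop_first=0, interval_s=0.0,
            clock=time.perf_counter, sleep=time.sleep):
    """Time `call` repeatedly; report the cold sample and warm statistics."""
    if samples < 1:
        raise ValueError("samples must be at least 1")
    latencies = []
    for i in range(samples):
        # Wait between requests, but not before the first (cold) one.
        if i > 0 and interval_s > 0:
            sleep(interval_s)
        start = clock()
        call()
        latencies.append(clock() - start)
    # The first data point is the cold call; the warm set drops the
    # first `drop_first` samples, as proposed in the comment.
    warm = latencies[drop_first:]
    return {
        "cold_s": latencies[0],
        "warm_median_s": statistics.median(warm) if warm else None,
        "warm_samples": len(warm),
    }
```

Running the same loop once per interval setting would give one cold number and one warm median for each request rate, which could then be compared side by side.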