| HN Mirror



	by Gcam 888 days ago
	Thanks for the feedback, and glad it is useful! Yes, agreed, that might be more representative of future use. I think a view of variance would be a good idea. Currently it is only shown in the over-time views; maybe a histogram of response times or a box-and-whisker plot would work. We have a newsletter subscribe form on the website, or there is Twitter (https://twitter.com/ArtificialAnlys), if you want to follow future updates.

1 comment

AaronFriel 888 days ago

Variance would be good, and I've also seen significant variance on "cold" request patterns, which may correspond to resources scaling up on the backend of providers.

Would be interesting to see request latency and throughput when API calls occur cold (first data point), and once per hour, minute, and per second with the first N samples dropped.
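
A rough sketch of that measurement, for illustration only: time a series of calls at a fixed interval, report the first (cold) sample on its own, then summarize the remaining samples after dropping the first N as box-and-whisker statistics. The names `measure_latencies`, `summarize`, and `drop_first` are hypothetical, and `call` stands in for whatever provider request is being benchmarked; none of this reflects how any existing benchmark is implemented.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(
    call: Callable[[], object],
    samples: int,
    interval_s: float,
    clock: Callable[[], float] = time.perf_counter,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Time `samples` sequential calls, pausing `interval_s` between them."""
    latencies = []
    for i in range(samples):
        start = clock()
        call()  # hypothetical stand-in for one API request
        latencies.append(clock() - start)
        if i < samples - 1:
            sleep(interval_s)
    return latencies


def summarize(latencies: List[float], drop_first: int = 0) -> Dict[str, float]:
    """Return the cold (first) latency plus a five-number summary of the
    samples that remain after dropping the first `drop_first`."""
    warm = latencies[drop_first:]
    if len(warm) < 2:
        raise ValueError("need at least two samples after dropping warm-up")
    # Quartiles for a box-and-whisker view of the warm samples.
    q1, median, q3 = statistics.quantiles(warm, n=4, method="inclusive")
    return {
        "cold": latencies[0],
        "min": min(warm),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(warm),
    }
```

Running `measure_latencies` with `interval_s` set to 3600, 60, and 1 would give the once-per-hour, once-per-minute, and once-per-second patterns; comparing the `cold` value against the warm quartiles shows how much of the variance comes from the first request.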

Also, at least with Azure OpenAI, the AI safety features (filtering & annotations) make a significant difference in time to first token.
