Hacker News
by YetAnotherNick 146 days ago
Umm. I run multiple benchmarks using APIs for my work, and the inference-time compute allotted clearly correlates with the metrics. Time of day certainly doesn't. If it were that straightforward, people could prove it very easily rather than relying on anecdotes.

They either overprovision servers during low demand, or they might provision servers dynamically based on load.

1 comment

Yes, every time I see some variant of this come up (and believe me, this has been coming up since before the GPT-3.5 days), there’s never any actual data demonstrating that it’s the case. As you say, it should be completely trivial to run the exact same prompt multiple times per day and capture the output to demonstrate this.

But no one ever seems to do that. They are content to “feel” that this is the case instead.
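For what it's worth, here is a minimal sketch of what such a check could look like. Everything in it is an assumption for illustration: `call_model` stands in for whatever API client you use, `score` is whatever metric you trust, and `record_run` and `mean_score_by_hour` are hypothetical helper names, not part of any library.

```python
import statistics
from collections import defaultdict
from datetime import datetime, timezone


def record_run(log, call_model, score, prompt, now=None):
    """Send one fixed prompt and append (UTC hour, score) to the log.

    call_model and score are placeholders supplied by the caller:
    call_model(prompt) returns the model output, score(output) returns a number.
    """
    now = now or datetime.now(timezone.utc)
    output = call_model(prompt)
    log.append((now.hour, score(output)))


def mean_score_by_hour(log):
    """Group logged scores by hour of day and return the mean for each hour."""
    buckets = defaultdict(list)
    for hour, value in log:
        buckets[hour].append(value)
    return {hour: statistics.mean(values) for hour, values in sorted(buckets.items())}
```

Call `record_run` from a scheduler several times a day for a week or two with the same prompts and the same sampling settings, then look at `mean_score_by_hour(log)`. If quality really varied with time of day, the per-hour means should differ by more than the run-to-run noise within a single hour.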