by YetAnotherNick
146 days ago
Umm. I run multiple benchmarks through APIs for my work, and the inference-time compute allotted clearly correlates with the metrics. Time of day certainly doesn't. If it were that straightforward, people could prove it very easily instead of relying on anecdotes. The providers either overprovision servers during low demand or they might provision them dynamically based on load.

But no one ever seems to do that; they are rather content to “feel” that this is the case instead.
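
For what it's worth, here is a rough sketch of the kind of check I mean, assuming you have already logged benchmark scores along with the time each run happened. The function name `permutation_p_value`, the peak/off-peak split, and the numbers in the usage example are all made up for illustration; this is one simple way to test the claim, not the only one.

```python
import random
from statistics import mean


def permutation_p_value(peak, off_peak, n_iter=10_000, seed=0):
    """Two-sided permutation test on the difference of mean scores.

    peak / off_peak are lists of benchmark scores from runs made during
    busy and quiet hours (how you define those is up to you).
    """
    rng = random.Random(seed)
    observed = abs(mean(peak) - mean(off_peak))
    pooled = list(peak) + list(off_peak)
    n_peak = len(peak)
    hits = 0
    for _ in range(n_iter):
        # Shuffle the labels: if time of day does not matter, a random
        # split should produce a gap this large fairly often.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_peak]) - mean(pooled[n_peak:]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


if __name__ == "__main__":
    # Hypothetical scores, not real measurements.
    peak_scores = [0.71, 0.69, 0.73, 0.70, 0.72, 0.68]
    off_peak_scores = [0.72, 0.70, 0.71, 0.69, 0.73, 0.70]
    print(permutation_p_value(peak_scores, off_peak_scores))
```

A small p-value would suggest the scores really do differ by time of day; a large one means the data gives no evidence for it. You would want a decent number of runs per bucket, with the same prompts and settings in both, before reading much into either result.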