by aetherson
4876 days ago
TL;DR: An average of anything is a terrible way to track it (and median and mode are bad, too). Any single scalar that compresses information best expressed as a graph (or multiple graphs!) is so lossy that it arguably obscures more than it clarifies. Back when we had to live with printing-press methods of displaying information (i.e., where anything that wasn't pure text was very difficult to display), mean/median/mode numbers were a necessary evil. But if you're looking at a computer screen, there's really no reason to subject yourself to an abstraction that throws out 90% of your data.
|
Showing the full histogram isn't a complete solution either, though. Average latency obscures the issue by boiling it down to a single scalar, but even the full histogram of latencies loses the information on note-to-note consistency! That's because a latency histogram discards sequencing information, so it can't distinguish between the case where you had a lot of 20ms latencies in a row followed by a lot of 50ms latencies in a row, and the case where every other message oscillated between 20ms and 50ms latencies (much worse). One way to capture some of that information is a histogram of adjacent-latency deltas. Another view is to plot latency vs. time and look for spikes (but that can obscure less-obvious trends, and it's unwieldy as a data representation if you're trying to summarize a system's behavior over a period of hours).
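To make the sequencing point concrete, here's a rough sketch in Python. The latency values and the function names (`histogram`, `adjacent_delta_histogram`) are made up for illustration, not taken from the paper. The two sequences have identical latency histograms, and only the histogram of adjacent deltas tells them apart.

```python
from collections import Counter


def histogram(values):
    """Count how often each value occurs; ordering is ignored."""
    return Counter(values)


def adjacent_delta_histogram(values):
    """Count the absolute differences between consecutive values."""
    return Counter(abs(b - a) for a, b in zip(values, values[1:]))


# Hypothetical latencies in milliseconds (illustrative only).
blocked = [20] * 4 + [50] * 4   # a run of 20ms, then a run of 50ms
alternating = [20, 50] * 4      # every other message flips 20ms/50ms

# The plain histograms cannot tell the two sequences apart.
print(histogram(blocked) == histogram(alternating))

# The adjacent-delta histograms can: one 30ms jump vs. seven of them.
print(adjacent_delta_histogram(blocked))
print(adjacent_delta_histogram(alternating))
```

Taking the absolute value of each delta is a choice made here for simplicity; keeping signed deltas would also show the direction of each jump.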
The paper is here, though the actual numbers are 9 years old at this point, so probably not that useful: http://www.cs.hmc.edu/~bthom/res/midi_timing/publications/IC...