by tanelpoder 705 days ago

This reminds me of the Linux/Unix disk busy "%util" metric in tools like sar and iostat. People sometimes interpret %util at 100% as a physical ceiling on the disk's I/O capacity, the way they would for a CPU ("we need more disks to get disk I/O utilization down!").

It is a correct metric when your block device is a single physical spinning disk that can only accept one request at a time (dispatch queue depth = 1). But the moment you deal with SSDs (capable of highly concurrent NAND I/O), SAN block devices striped over many physical disks, or even a single spinning disk that can internally queue and reorder I/Os for more efficient seeking, hitting %util = 100% at the host block device level doesn't mean that you've hit an IOPS ceiling.
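To make that concrete, here is a rough Python sketch, assuming %util is simply "the fraction of wall-clock time during which at least one request was in flight". The function names and the toy workload (1 ms requests issued back to back on a fixed number of parallel lanes) are made up for illustration; this is not how iostat is implemented.

```python
def busy_fraction(requests, window_ms):
    """Fraction of the window with at least one request in flight.

    `requests` is a list of (start_ms, end_ms) pairs; overlapping
    requests are merged so concurrent time is only counted once.
    """
    busy = 0
    cur_start = cur_end = None
    for start, end in sorted(requests):
        if cur_end is None or start > cur_end:
            # Gap before this request: close the previous busy span.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps (or touches) the current busy span: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return busy / window_ms


def summarize(requests, window_ms):
    """Return (util_percent, iops, average_requests_in_flight)."""
    util = 100.0 * busy_fraction(requests, window_ms)
    iops = len(requests) * 1000 / window_ms
    avg_in_flight = sum(end - start for start, end in requests) / window_ms
    return util, iops, avg_in_flight


def back_to_back(depth, service_ms, window_ms):
    """Hypothetical workload: `depth` lanes, each always busy."""
    n = window_ms // service_ms
    return [(i * service_ms, (i + 1) * service_ms)
            for i in range(n) for _ in range(depth)]


for depth in (1, 8):
    util, iops, in_flight = summarize(back_to_back(depth, 1, 1000), 1000)
    print(f"depth={depth}: util={util:.0f}% iops={iops:.0f} in_flight={in_flight:.1f}")
```

Both runs report 100% util, but the depth-8 run completes eight times as many requests per second. The busy-time number alone cannot tell the two apart; the average number of requests in flight can.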

So it looks like the GPU "SM efficiency" analysis is somewhat like logging in to the storage array itself and checking how busy each physical disk (or at least each disk controller) inside that array is.