by dmytroi
1981 days ago
I did some research on high-bandwidth/high-IOPS file access. Some of my conclusions could be wrong, but as far as I can tell, modern NVMe drives need some queue pressure to perform at advertised speeds, because at the hardware level they are essentially a separate CPU running in the background with its own command queue(s). They also need requests aligned with the flash memory hierarchy. That puts a fairly finicky limitation on your access patterns: 64-256 KB aligned blocks, with 8+ accesses in parallel. To see this, run CrystalDiskMark with the queue depth set to 1-2 and/or a small block size such as 4 KB, and watch your random speed plummet.

Given those limits, if you just mmap your file and memcpy from the pointer, you get roughly one access request in flight, if I understand correctly. And since the default page size is 4 KB, that will also be the request size. On top of that, mmap relies on IRQs for completion notifications (instead of polling the device state), which somewhat limits your IOPS. Prefetching helps, of course, but it relies on a lot of heuristic machinery to guess the correct access pattern, and that sometimes fails.

As 7+ GB/s drives and 10+ GbE networks become more mainstream, the main place people will run into these requirements is file copying. For example, Windows Explorer struggles to copy files at 10-25+ GbE rates simply because of how its file access architecture is designed. Hopefully by then we will be better equipped to reason about "mmap" vs "read" (which should really be pread here, to avoid the lock on the file offset in the kernel).
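
To make the access pattern concrete, here is a rough sketch in Python of "aligned blocks, several preads in flight". The function name `read_file_parallel`, the 128 KB block size, and the queue depth of 8 are my own illustrative choices, not measured optima, and a thread pool is only a stand-in for a real async I/O interface. It also goes through the page cache (no O_DIRECT), so treat it as a picture of the request pattern rather than a benchmark.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 128 * 1024  # assumption: somewhere inside the 64-256 KB range
QUEUE_DEPTH = 8          # assumption: number of requests kept in flight


def read_file_parallel(path, block_size=BLOCK_SIZE, queue_depth=QUEUE_DEPTH):
    """Read a whole file with block-aligned pread calls issued from several threads."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size

        def read_block(offset):
            # pread takes an explicit offset, so the threads never share
            # or update the file position.
            want = min(block_size, size - offset)
            chunks = []
            while want > 0:
                chunk = os.pread(fd, want, offset)
                if not chunk:  # unexpected EOF, e.g. the file was truncated
                    break
                chunks.append(chunk)
                offset += len(chunk)
                want -= len(chunk)
            return b"".join(chunks)

        with ThreadPoolExecutor(max_workers=queue_depth) as pool:
            # Offsets are multiples of block_size; map() yields results in
            # offset order even if the reads complete out of order.
            blocks = list(pool.map(read_block, range(0, size, block_size)))
    finally:
        os.close(fd)
    return b"".join(blocks)
```

Calling `read_file_parallel("some_large_file.bin")` (a hypothetical path) returns the file contents as bytes. With `queue_depth=1` it degenerates to the one-request-at-a-time pattern described above.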