|
|
|
|
|
by throwaway_pdp09
2080 days ago
|
|
I thought queue depth related to supporting outstanding/pending reads. For a serial access such as csv parsing, what would you do other than having a readahead - somehow, see my other question - which would presumably maintain the queue depth at about 1. Put another way, what would you do to read in the CSV serially to increase speed that would push the queue depth above 1? |
|
However, optimal utilization of the drive(s) will always require a queue depth of more than one request, because you don't want the drive to be idle after signalling completion of its only queued command and waiting for the CPU to produce a new read request. In a RAID0 setup like the author describes, you need to also ensure that you're generating enough IO to keep both drives busy, and the minimum prefetch window size that can accomplish this will usually be at least one full stripe.
As for how you accomplish the prefetching: the madvise system call sounds like a good choice, with the MADV_SEQUENTIAL or MADV_WILLNEED options. But how much prefetching that actually causes is up to the OS and the local system's settings. On my system, /sys/block/$DISK/queue/read_ahead_kb defaults to 128, which is definitely insufficient for at least some drives but might only apply to read-ahead triggered by the filesystem's heuristics rather than more explicitly requested by a madvise. So manually touching pages from a userspace thread is probably the safer way to guarantee the OS pages in data ahead of timeāas long as it doesn't run so far ahead of the actual use of the data that it creates memory pressure that might get unused pages evicted.