io_uring dumps data directly into a ring buffer mapped into the user-level address space. User code is notified by (at least) an updated atomic counter. The user process must be finished with the data before the kernel comes around again to overwrite it. Often that demands that the user process or thread be bound to a core on which the OS has been forbidden to run anything else, and that the thread do a carefully circumscribed amount of work. That work rarely includes memory allocation, i/o, or even system calls, any of which may cause the thread to be "lapped" by subsequent writes.
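To make "lapped" concrete, here is a toy model in Python. It is not real io_uring code: `LappingRing`, `Reader`, and the plain integer `head` standing in for the atomic counter are all invented for illustration, and everything runs in a single thread.

```python
class LappingRing:
    """Toy ring: the writer never waits for the reader (hypothetical model)."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.head = 0  # total items ever written; stands in for the atomic counter

    def write(self, item):
        # Overwrites whatever is in the slot, read or not.
        self.slots[self.head % self.size] = item
        self.head += 1


class Reader:
    def __init__(self, ring):
        self.ring = ring
        self.tail = 0  # total items this reader has consumed or skipped

    def poll(self):
        """Return (item, lost); lost counts items overwritten before being read."""
        head = self.ring.head
        if self.tail == head:
            return None, 0  # nothing new
        lost = 0
        if head - self.tail > self.ring.size:
            # The writer has lapped us: the oldest unread items are gone.
            lost = head - self.tail - self.ring.size
            self.tail = head - self.ring.size  # jump to the oldest surviving item
        item = self.ring.slots[self.tail % self.ring.size]
        self.tail += 1
        return item, lost
```

With a ring of 4 slots, a reader that has consumed one item and then stalls while the writer reaches item 9 gets back item 6 and a loss count of 5. A real reader running concurrently with the writer would also have to re-check the counter after copying a slot out, since the writer can overtake it in the middle of a read.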
The idea is that the average time to process a packet absolutely must not exceed the average time between arrivals, and any burst in arrivals must be absorbed by the ring buffer: the backlog that builds up while packets arrive faster than they are processed must never exceed the size of the ring.
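A back-of-the-envelope check of that condition can be sketched in Python. The discrete-tick model and the names `max_backlog` and `survives` are assumptions made for illustration, not anything io_uring provides.

```python
def max_backlog(arrivals, service_per_tick):
    """Peak number of unprocessed packets sitting in the ring.

    arrivals[i] is the number of packets arriving in tick i;
    service_per_tick is how many the reader can process per tick.
    """
    backlog = 0
    peak = 0
    for arrived in arrivals:
        backlog += arrived
        peak = max(peak, backlog)
        backlog = max(0, backlog - service_per_tick)
    return peak


def survives(arrivals, service_per_tick, ring_size):
    """True if the reader is never lapped under this simplified model."""
    return max_backlog(arrivals, service_per_tick) <= ring_size


# Average arrival rate is 2 per tick, below the service rate of 3,
# but the burst of 10 still needs enough ring to sit in.
trace = [1, 1, 10, 1, 0, 0, 1]
print(max_backlog(trace, 3))
print(survives(trace, 3, ring_size=16), survives(trace, 3, ring_size=8))
```

The point of the example trace is that a healthy average is not enough: the same reader keeps up with a 16-slot ring and loses data with an 8-slot one.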
The hamster process pulling from the ring may just be load balancing to a herd of other threads operating under less stringent conditions, so they might be permitted i/o.
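A rough Python sketch of that split, with the hot loop doing nothing but hand-off. `run_dispatcher` and `handle` are hypothetical names, and `queue.Queue` is only a stand-in: it locks and allocates, so a real hot path would more likely hand off through preallocated single-producer rings.

```python
import queue
import threading


def run_dispatcher(packets, n_workers, handle):
    """Round-robin packets to worker threads; workers may block or do i/o."""
    queues = [queue.Queue() for _ in range(n_workers)]
    results = []
    results_lock = threading.Lock()

    def worker(q):
        while True:
            pkt = q.get()
            if pkt is None:  # sentinel: no more work
                break
            out = handle(pkt)  # slow work is allowed here
            with results_lock:
                results.append(out)

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()

    # The "hamster" loop: pull a packet, pick a worker, hand it off, repeat.
    for i, pkt in enumerate(packets):
        queues[i % n_workers].put(pkt)

    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    return results


doubled = run_dispatcher(range(10), n_workers=3, handle=lambda p: p * 2)
print(sorted(doubled))
```

Results come back in whatever order the workers finish, hence the sort.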