by rayiner 2947 days ago

Generally, the way it works is that there are multiple hardware-managed ring buffers allocated in host memory, whose entries (descriptors) point to packet data that is also allocated in host memory. When a packet is received, the NIC DMAs the packet into host memory, updates the descriptors in the ring buffers, and triggers an interrupt.[1] That is the point at which the kernel driver can access the packet (in host memory). See: https://www.intel.com/content/dam/www/public/us/en/documents... (section 8.3.3). By "in place" I mean the packet can be accessed from the location where the NIC put it, without first being copied elsewhere.
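To make the receive path concrete, here is a toy Python model of a single RX ring. It is only a sketch under loud assumptions: `RxRing`, `RxDescriptor`, `nic_receive`, `driver_poll`, the ring size, and the buffer size are all made-up names and values, the `done` flag merely stands in for a descriptor status bit, and real hardware works with DMA addresses rather than list indices.

```python
from dataclasses import dataclass

RING_SIZE = 4    # hypothetical; real rings are much larger
BUF_SIZE = 2048  # hypothetical per-packet buffer size


@dataclass
class RxDescriptor:
    buf_index: int      # stands in for the buffer's DMA address
    length: int = 0     # filled in by the "NIC" on receive
    done: bool = False  # stands in for a descriptor-done status bit


class RxRing:
    def __init__(self):
        # Packet buffers and descriptors both live in "host memory".
        self.buffers = [bytearray(BUF_SIZE) for _ in range(RING_SIZE)]
        self.descs = [RxDescriptor(buf_index=i) for i in range(RING_SIZE)]
        self.nic_next = 0     # next descriptor the NIC will fill
        self.driver_next = 0  # next descriptor the driver will examine

    def nic_receive(self, packet: bytes) -> bool:
        """Model the NIC: 'DMA' the packet, then update the descriptor."""
        d = self.descs[self.nic_next]
        if d.done or len(packet) > BUF_SIZE:
            return False  # no free slot (or oversized packet): drop it
        self.buffers[d.buf_index][:len(packet)] = packet
        d.length = len(packet)
        d.done = True
        self.nic_next = (self.nic_next + 1) % RING_SIZE
        return True

    def driver_poll(self):
        """Model the interrupt handler: drain every completed descriptor."""
        packets = []
        while self.descs[self.driver_next].done:
            d = self.descs[self.driver_next]
            # The packet is read in place, through a view on the buffer
            # the NIC wrote into.
            view = memoryview(self.buffers[d.buf_index])[:d.length]
            packets.append(bytes(view))  # copied only to return it safely
            d.done = False               # hand the slot back to the NIC
            self.driver_next = (self.driver_next + 1) % RING_SIZE
        return packets
```

Calling `nic_receive` a few times and then `driver_poll` once returns all pending packets in arrival order, which is also roughly what happens when one interrupt covers several packets.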

In terms of scaling across multiple CPUs, NICs these days can generally manage multiple separate ring buffers. Ring buffers can be dedicated to specific CPUs, and the NIC can distribute packets among them (in hardware) by hashing on the packet's IP 5-tuple.[2] As long as that mechanism distributes your packets evenly (which it does not when, e.g., the packets all belong to the same connection), you can scale across multiple CPUs. (Presumably the NIC can perform the distribution in hardware at line rate.)
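A rough illustration of the distribution step, again as a toy sketch rather than what any particular NIC does: `pick_queue` and `NUM_QUEUES` are hypothetical, and `zlib.crc32` is used purely as a stand-in for whatever hash function the hardware actually implements. The point is only that the queue is a pure function of the 5-tuple, so one connection always lands on one ring.

```python
import zlib
from collections import Counter

NUM_QUEUES = 4  # hypothetical: one RX ring per CPU


def pick_queue(src_ip, dst_ip, src_port, dst_port, proto):
    """Map a 5-tuple to a ring index (CRC32 is a stand-in hash)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % NUM_QUEUES


# Many connections: the source port varies, so the hash input varies.
many_flows = Counter(
    pick_queue("10.0.0.1", "10.0.0.2", 1024 + i, 80, "tcp")
    for i in range(1000)
)

# One connection: every packet carries the same 5-tuple.
one_flow = Counter(
    pick_queue("10.0.0.1", "10.0.0.2", 5000, 80, "tcp")
    for _ in range(1000)
)

print("many flows:", dict(many_flows))
print("one flow:  ", dict(one_flow))
```

With a single connection, `one_flow` has exactly one key: all 1000 packets go to the same ring, so only one CPU does the work no matter how many rings exist.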

[1] At high packet rates, interrupts may be raised less often than once per packet.

[2] See section 7.1.8 of the linked manual.