Hacker News
by farazbabar 2740 days ago
In traditional I/O, a hardware interrupt is triggered whenever data arrives at the hardware boundary, and the interrupt can be serviced by any core that is available to the scheduler. One can imagine how much overhead is involved: context switching away from whatever that core was doing, setting up the registers, moving the data, and then relinquishing the core back to the OS. In this model, dedicated cores serve I/O through a memory-mapped, ring-buffer-like data structure sized to your application's needs. There is no allocation/deallocation overhead, no management beyond moving a pointer, and no context switching. If you can spare the cores, this can significantly improve performance.
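
To make the "no management beyond moving a pointer" point concrete, here is a minimal sketch of a fixed-size ring buffer. It is an illustration only, not the code from the setup described above: the `RingBuffer` class and its method names are hypothetical, it assumes one producer and one consumer, and it uses a plain in-process `bytearray` rather than memory shared with a NIC.

```python
class RingBuffer:
    """Hypothetical single-producer/single-consumer ring over preallocated slots."""

    def __init__(self, slots: int, slot_size: int) -> None:
        # A power-of-two slot count lets us wrap indices with a bit mask.
        if slots <= 0 or slots & (slots - 1):
            raise ValueError("slots must be a power of two")
        self._slots = slots
        self._mask = slots - 1
        self._slot_size = slot_size
        self._data = bytearray(slots * slot_size)  # allocated once, up front
        self._lengths = [0] * slots                # payload length per slot
        self._head = 0  # count of items written (advanced by the producer)
        self._tail = 0  # count of items read (advanced by the consumer)

    def push(self, payload: bytes) -> bool:
        """Copy payload into the next free slot; return False if the ring is full."""
        if len(payload) > self._slot_size:
            raise ValueError("payload larger than slot")
        if self._head - self._tail == self._slots:
            return False  # full: the caller decides whether to drop or retry
        i = self._head & self._mask
        start = i * self._slot_size
        self._data[start:start + len(payload)] = payload  # same-size slice, no resize
        self._lengths[i] = len(payload)
        self._head += 1  # publishing an item is just advancing an index
        return True

    def pop(self):
        """Return the oldest payload, or None if the ring is empty."""
        if self._tail == self._head:
            return None
        i = self._tail & self._mask
        start = i * self._slot_size
        # Copied out here for simplicity; a real implementation would more
        # likely hand the consumer a view into the slot instead.
        payload = bytes(self._data[start:start + self._lengths[i]])
        self._tail += 1  # releasing a slot is also just advancing an index
        return payload
```

The storage is sized once at construction, so steady-state `push` and `pop` never grow or shrink the buffer. In a real kernel-bypass setup the two indices would live in memory visible to both the device (or driver) and the polling core, and would need appropriate memory ordering, which this sketch does not attempt.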

In one use case, I was able to quadruple performance on a 32-core Xeon by installing four 10 Gbps Ethernet cards and dedicating the first eight cores to I/O (two per interface). This is all about latency, but with proper care it also improves throughput.
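
The core layout described above (first eight cores, two per interface) can be sketched as a simple mapping. This is an assumption-laden illustration: the interface names `eth0` to `eth3` and the helpers `assign_io_cores` and `pin_current_process` are hypothetical, and the comment does not say how the cores were actually reserved. `os.sched_setaffinity` is a real call, but it is only available on some platforms such as Linux.

```python
import os


def assign_io_cores(interfaces, cores_per_interface=2, first_core=0):
    """Map each interface name to a contiguous block of core IDs."""
    plan = {}
    core = first_core
    for name in interfaces:
        plan[name] = list(range(core, core + cores_per_interface))
        core += cores_per_interface
    return plan


def pin_current_process(cores):
    """Restrict the calling process to the given cores, where supported.

    Returns False on platforms without os.sched_setaffinity. May raise
    OSError if the requested cores do not exist on this machine.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, set(cores))  # pid 0 means the calling process
    return True


# Hypothetical interface names; four NICs, two cores each, starting at core 0.
plan = assign_io_cores(["eth0", "eth1", "eth2", "eth3"])
```

A worker polling `eth1` would call `pin_current_process(plan["eth1"])` at startup. Note that setting affinity only restricts where that process may run; keeping other tasks off those cores is a separate step, typically done through OS configuration.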

1 comment

Do you have to write your own software to do this, or can it be accomplished through OS configuration?