| If you are using C/C++ for any new app, there is a possibility you are writing code that has a performance requirement. mmap, io_uring, drivers, and other "zero-copy" implementations require you to think about byte order. Filesystems, databases, and network applications can be high-throughput and will certainly benefit from being zero-copy (with performance gains anywhere from +1% to +2000%). This is absolutely not "premature optimization." If you're a C/C++ engineer, you should know off the top of your head how many cycles syscalls and memcpys cost. (Spoiler: they're slow.) You should evaluate your performance requirements and decide whether you need to eliminate that overhead. For certain applications, if you do not meet the performance requirements, you cannot ship. |
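To make the byte-order point concrete, here is a minimal sketch of reading a record in place from a mapped or received buffer instead of memcpy-ing it into a struct first. The record layout (a big-endian length, then a big-endian id, then the payload) and the names `load_be32`, `load_le32`, and `record_payload` are all made up for illustration; they are not from any particular library or from the comment above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout (an assumption for this sketch):
 *   bytes 0..3  payload length, big-endian
 *   bytes 4..7  record id, big-endian
 *   bytes 8..   payload
 */
enum { HDR_LEN_OFF = 0, HDR_ID_OFF = 4, HDR_SIZE = 8 };

/* Decode a big-endian u32 straight from the buffer. Reading byte by byte
 * makes no assumption about host endianness or pointer alignment. */
static inline uint32_t load_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Same idea for a little-endian field. */
static inline uint32_t load_le32(const uint8_t *p) {
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Return a pointer to the payload inside the same buffer (no copy),
 * or NULL if the buffer is too short for the header or the payload. */
static const uint8_t *record_payload(const uint8_t *buf, size_t buf_len,
                                     uint32_t *payload_len) {
    if (buf_len < HDR_SIZE) {
        return NULL;
    }
    uint32_t len = load_be32(buf + HDR_LEN_OFF);
    if (len > buf_len - HDR_SIZE) {
        return NULL;
    }
    *payload_len = len;
    return buf + HDR_SIZE;
}
```

The caller gets a view into the original bytes, so the only per-record work is decoding the header fields. Casting the buffer to a struct pointer would look shorter, but it silently depends on the host's endianness, alignment, and padding, which is exactly the consideration the comment is pointing at.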
People were understandably concerned that we had fucked up in the feasibility phase of the project. Lots of people get themselves in trouble this way, and this was a nine-figure piece of hardware that would sit idle while our app picked its nose crunching data if we didn't finish our work on time during maintenance windows.
But I was on the longest hot streak of accurate perf estimates in my career, and this one was not going to be my Icarus moment. It ended up needing tweaks from the compiler writer and from Wind River (a DMA problem). I had to spend a lot of social capital on all of this, especially the Wind River conference call. After months and months of begging for that call, it took them ten minutes to come around to my suggestion for a fix, which they shipped to us within a week.