by jalons 4018 days ago
Tools. Compatibility. Resemblance to systems already in place (e.g., back when BSD kept failing to deliver SMP support "soon", there wasn't another option if your workload was CPU bound). Numbers (it's popular in HPC/HFT because it's popular).

We're not using the built-in network stacks anyway; we're using the OS as a content delivery system. Get us something that lets us get to the cores: we're going to pin our applications directly to those cores, keeping the kernel relegated to scheduling tasks on whichever NUMA node is farther from the PCIe bus running into the CPU. We'll have the CPUs we're using pegged in a constant spinlock anyway, which would make the scheduler think really, really hard about running tasks there. We don't use realtime kernels, as it's better for us to pay the price on the occasional outlier spike than to raise the baseline latency.
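To make the pinning part concrete, here is a minimal sketch, assuming Linux and Python's `os.sched_setaffinity`. The names `pin_to_core` and `spin_poll` are made up for illustration; a real low-latency application would do this in C/C++ on cores isolated at boot, and would poll a kernel-bypass receive queue rather than a Python callable.

```python
import os


def pin_to_core(core: int) -> set:
    """Pin the calling process to a single core and return the new affinity set.

    Assumption: Linux. os.sched_setaffinity is not available on other platforms.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available here")
    allowed = os.sched_getaffinity(0)  # pid 0 means the calling process
    if core not in allowed:
        raise ValueError(f"core {core} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)


def spin_poll(poll, max_spins: int):
    """Busy-poll without ever sleeping or yielding, so the core stays pegged.

    `poll` is a hypothetical callable that returns None when nothing is ready.
    Returns (item, spins_used), or (None, max_spins) if nothing arrived.
    """
    for spins in range(1, max_spins + 1):
        item = poll()
        if item is not None:
            return item, spins
    return None, max_spins
```

The spin loop is why the scheduler sees those cores as 100% busy all the time: the application never blocks, so it never pays a wakeup on the hot path.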

Due to my own unfamiliarity, I don't know what BSD's equivalent to isolcpus is. I don't know how to taskset on a BSD. I don't know if the InfiniBand/Ethernet controller's firmware/bypass software works. I don't know how BSD's scheduler works (not that we usually care, but there are times when one can't avoid work needing to be scheduled: things like RPC calls to shut down an app, SSH if you can spare the clock cycles for key verification, etc.).
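For the taskset question specifically, the closest thing on FreeBSD appears to be `cpuset(1)`, whose `-l` flag takes a CPU list much like `taskset -c` does on Linux. A small sketch of the mapping follows; `pinned_command` is a hypothetical helper, and it only covers per-process pinning, not the boot-time isolation that isolcpus gives you.

```python
import sys


def pinned_command(cpu_list: str, command: list, platform: str = sys.platform) -> list:
    """Build an argv that launches `command` restricted to `cpu_list`.

    Linux:   taskset -c <cpu_list> <command...>
    FreeBSD: cpuset -l <cpu_list> <command...>
    Other platforms are left unimplemented rather than guessed at.
    """
    if platform.startswith("linux"):
        return ["taskset", "-c", cpu_list, *command]
    if platform.startswith("freebsd"):
        return ["cpuset", "-l", cpu_list, *command]
    raise NotImplementedError(f"no pinning wrapper known for {platform!r}")
```

The resulting list can be handed to `subprocess.run`; whether the NIC's bypass software behaves on the other side is a separate question this does not answer.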

Would dtrace come in handy? Most definitely. Is that enough for us to abandon what we know works? Not yet.

1 comments

This is probably the best argument: Familiarity.

However, we're talking about very specific needs: super high performance networking. If you have that specific of a need, wouldn't you want something unfamiliar if it solves the problem best?

If it's truly better, and the only difference is removing Linux and installing BSD, then what is BSD doing that is better/different/messed up such that packets can flow faster on BSD than on Linux?

Talking about unfamiliarity and specific needs: FPGAs are much better suited than CPUs for processing minimum-sized frames at wire speed. They can still forward all unhandled frames to a CPU. Yes, it's a lot of development effort compared to a CPU-only solution, but considering all the kernel-optimizing multicore cleverness from OP, I would say we are approaching the break-even point.
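To put rough numbers on "minimum-sized frames at wire speed", here is a back-of-the-envelope sketch. The 10 Gb/s link and 3 GHz core are my assumptions, not figures from the thread; the 20 bytes of per-frame overhead are the standard Ethernet preamble, start-of-frame delimiter, and inter-frame gap.

```python
def min_frame_budget(link_bps: float = 10e9,
                     frame_bytes: int = 64,
                     overhead_bytes: int = 20,
                     cpu_hz: float = 3e9):
    """Return (frames per second, ns per frame, CPU cycles per frame).

    Assumed defaults: 10 Gb/s Ethernet, 64-byte minimum frame, 20 bytes of
    preamble + SFD + inter-frame gap, and a single 3 GHz core.
    """
    bits_on_wire = (frame_bytes + overhead_bytes) * 8  # 672 bits per frame
    frames_per_sec = link_bps / bits_on_wire
    ns_per_frame = 1e9 / frames_per_sec
    cycles_per_frame = cpu_hz / frames_per_sec
    return frames_per_sec, ns_per_frame, cycles_per_frame


fps, ns, cycles = min_frame_budget()
print(f"{fps / 1e6:.2f} Mpps, {ns:.1f} ns per frame, {cycles:.0f} cycles per frame")
```

Under those assumptions a single core gets on the order of 200 cycles per frame, which is the kind of budget where a fixed pipeline in an FPGA starts to look attractive.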