by Moral_ 1401 days ago
A lot of the reason they've had to build most of this stuff themselves is that they decided, for some reason, to use FreeBSD.

About the NUMA work they did: I remember being in a meeting with them as a Linux developer at Intel at the time. They bought NVMe drives, or were saying they were going to buy NVMe drives, from Intel, which got them access to "on the ground" kernel developers and CPU people from Intel. Instead of talking about NVMe, they spent the entire meeting asking us how the Linux kernel handles NUMA and corner cases around memory and scheduling. If I recall correctly, they asked if we could help them upstream BSD code for NVMe and NUMA. I think in that meeting there was even some L9 or super-high-up NUMA CPU guy from Hillsboro they somehow convinced to join.

The conversation and technical discussion was quite fun, but it was sort of funny to us at the time that they were having to do all this work on the BSD kernel that was solved years ago for Linux.

Technical debt I guess.


Netflix tried Linux. FreeBSD worked better.
It's hard to believe in 2022. Google, Amazon, FB, etc. all use Linux, all CDNs use Linux as well, and some services (YouTube) serve even more traffic than Netflix. "BSD is faster than Linux" is a myth: the fact that 99% of those run on Linux means more people have worked on those problems, which means Linux is most likely always faster.

The funny thing is the rest of Netflix runs on Ubuntu; only the edge CDN runs on BSD.

Disclaimer: SRE at Google on a team vaguely related to video CDN stuff, but I have no inside knowledge.

I don’t think you can dismiss "BSD is faster than Linux" (or make any claims about the relative speed of different OSes) just because big companies run Linux. There are other costs involved, and optimisations that can be shared if your edge serving stack is as similar as possible to the non-edge serving stack (that you have many more engineers developing for).

All you can conclude is that with enough optimisation, Linux can be made to perform well enough for it to not be worth replacing (yet), because replacing Linux would require replicating all the custom software and optimisations made to it on whatever other platform you pick.

You'd be surprised how many businesses run FreeBSD and keep it a secret as a competitive advantage.
*At the time when they created the OCA project.

If someone were to do a similar comparison now, the results could be different.

By some definition of better.
It worked faster. It's a common misconception among newbies that "Linux has NUMA" automatically means it will use NUMA properly in a given workload. What it actually means is you _should_ be able to use existing functionality. Sometimes you'll only need to configure it, sometimes you'll need to reimplement it from scratch, and doing that in FreeBSD is easier because there's less bloat.
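To make "configure it" a bit more concrete, here is a minimal Linux-side sketch. It reads the CPU list that sysfs exposes per node (`/sys/devices/system/node/nodeN/cpulist`) and pins the current process to that node's CPUs. The helper names (`parse_cpulist`, `node_cpus`, `pin_to_node`) are made up for this example, and this is not a description of what Netflix does; FreeBSD has its own tooling for the same job, such as cpuset(1).

```python
import os


def parse_cpulist(text):
    """Expand a Linux cpulist string such as "0-3,8-11" into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only node
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def node_cpus(node, sysfs_root="/sys/devices/system/node"):
    """Return the CPU ids that sysfs reports as local to a NUMA node."""
    path = os.path.join(sysfs_root, f"node{node}", "cpulist")
    with open(path) as f:
        return parse_cpulist(f.read())


def pin_to_node(node):
    """Restrict the calling process to one node's CPUs (Linux only)."""
    cpus = node_cpus(node)
    if not cpus:
        raise ValueError(f"node {node} has no CPUs")
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return cpus
```

Pinning CPUs is only half the story: memory placement and which node the NIC or NVMe device hangs off matter just as much, and none of that happens automatically just because the kernel "has NUMA".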
I still don't get the NUMA obsession here. It seems like they could have saved a lot of effort and a huge number of PowerPoint slides by building a box with half of these resources and no NUMA: one CPU socket with all the memory, and one PCIe root complex with all the disks and NICs attached to it. It would be half the size, draw half the power, and be way easier to program.
This is a testbed to see what breaks at higher speed. Our normal production platforms are indeed single socket and run at 1/2 this speed. I've identified all kinds of unexpected bottlenecks on this testbed, so it has been worth it.

We invested in NUMA back when Intel was the only game in town, and they refused to give enough IO and memory bandwidth per socket to scale to 200Gb/s. Then AMD EPYC came along. And even with single-socket Naples, you had to treat it as NUMA to get performance out of it. With Rome and Milan, you can run them in NPS1 mode and still get good performance, so NUMA is used mainly for forward-looking performance testbeds.
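For a feel of why per-socket memory bandwidth becomes the limit at these speeds, here is a back-of-envelope sketch. The function name and the `memory_passes` parameter are assumptions for illustration: it is a guess at how many times each payload byte crosses the memory bus (for example storage DMA in, CPU read, CPU write of an encrypted copy, NIC DMA out would be 4), not a measured figure from this thread.

```python
def memory_bandwidth_gbytes_per_s(network_gbits_per_s, memory_passes):
    """Rough memory traffic (GB/s) needed to feed a NIC at a given line rate.

    memory_passes is an ASSUMED count of how often each payload byte
    crosses the memory bus on its way from storage to the network.
    """
    payload_gbytes_per_s = network_gbits_per_s / 8  # bits -> bytes
    return payload_gbytes_per_s * memory_passes


if __name__ == "__main__":
    # 200 Gb/s of payload under two assumed pass counts.
    for passes in (2, 4):
        print(passes, memory_bandwidth_gbytes_per_s(200, passes))
```

The point is only that the memory system has to move a multiple of the wire rate, so a socket that is fine at 100Gb/s can fall over well before 200Gb/s.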

Modern CPUs like the AMD EPYC server processor are "always NUMA", even in single-socket configurations!

They have 9 chips on what is essentially a tiny, high-density motherboard. Effectively they are 8-socket server boards that fit in the palm of your hand.

The dual-socket version is effectively a 16-socket motherboard with a complex topology configured in a hierarchy.

Take a look at some "core-to-core" latency diagrams. They're quite complex because of the various paths possible: https://www.anandtech.com/show/16214/amd-zen-3-ryzen-deep-di...

Intel is not immune to this either. Their higher core-count server processors have two internal ring-bus networks, with some cores "closer" to PCIe devices or certain memory buses: https://semiaccurate.com/2017/06/15/intel-talks-skylakes-mes...
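A toy way to think about those latency diagrams is to bucket each core pair by how far apart the two cores sit. The sketch below assumes linear core numbering, 8 cores per compute die (CCD) and 8 compute dies per socket; those numbers and the `locality` function are illustrative assumptions, and real enumeration depends on the CPU model and BIOS settings.

```python
def locality(core_a, core_b, cores_per_ccd=8, ccds_per_socket=8):
    """Classify how "far apart" two cores are under an ASSUMED linear layout."""
    if core_a == core_b:
        return "same core"
    ccd_a = core_a // cores_per_ccd
    ccd_b = core_b // cores_per_ccd
    if ccd_a == ccd_b:
        return "same CCD"  # shares a die, the cheapest hop
    if ccd_a // ccds_per_socket == ccd_b // ccds_per_socket:
        return "same socket"  # crosses the I/O die
    return "remote socket"  # crosses the socket interconnect


if __name__ == "__main__":
    for pair in [(0, 7), (0, 8), (0, 64)]:
        print(pair, locality(*pair))
```

Each bucket corresponds to a different band in a core-to-core latency heatmap, which is why a single package can already behave like a small NUMA system.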

If you are buying servers at scale, the costs will certainly add up vs. buying dual-processor servers. If you buy single-processor servers, that is double the number of chassis, rail kits, power supplies, power cables, drives, iLO/iDRAC licenses, etc.
You can build motherboards with two or more completely isolated sets of CPU and memory, that are physically compatible with standard racks etc.
Good point, I forgot about those. It would be interesting to see if 1x PowerEdge C6525 with four single processor nodes is cheaper than 2x Dell R7525 servers. The C6525 does support dual processor, so it does seem a bit wasteful to me.
Can you buy non-NUMA mainstream CPUs though? Honest question, because I’d love to be rid of that BS too.
NUMA is an outcome of system configuration. You can make a non-NUMA platform using any CPU. You just limit yourself to 1 CPU socket.
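If you want to check what a given Linux box actually presents, counting the `nodeN` entries under `/sys/devices/system/node` is a reasonable first look. This is a minimal sketch; `numa_node_ids` and `is_numa` are names invented for the example. A single-socket machine usually reports one node, though some CPUs and BIOS settings expose several nodes per socket.

```python
import os
import re

_NODE_RE = re.compile(r"^node(\d+)$")


def numa_node_ids(entries):
    """Pick the NUMA node ids out of a sysfs directory listing."""
    ids = []
    for name in entries:
        match = _NODE_RE.match(name)
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)


def is_numa(sysfs_root="/sys/devices/system/node"):
    """True if the OS exposes more than one NUMA node (Linux sysfs layout)."""
    try:
        entries = os.listdir(sysfs_root)
    except FileNotFoundError:
        return False  # no NUMA information exposed at this path
    return len(numa_node_ids(entries)) > 1
```

What matters to software is the node count the OS reports, not the number of physical sockets.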

Here's a Facebook engineering blog post about how they left NUMA behind. https://engineering.fb.com/2016/03/09/data-center-engineerin...

> You can make a non-NUMA platform using any CPU. You just limit yourself to 1 CPU socket.

Well, not on EPYC generation 1. Those have four NUMA segments in each socket.

Also those Xeon Platinum 9200 processors Intel made as an attention grab.

EPYC Naples wasn't good for much of anything though, so I am trying to forget it.
Is NUMA a solved issue on Linux? Correct me if I am wrong, but I was under the impression that it may be handled better under certain conditions, while NUMA as a problem in itself is hardly solved.
Maybe Brendan Gregg can further enlighten his new coworkers at Intel why Netflix chose both AMD & FreeBSD.