by tatersolid 3140 days ago
Netflix uses BSD for OpenConnect because asynchronous disk I/O, which is critical for a CDN, remains a tire fire on Linux after more than 20 years.

On Linux you basically have to use blocking threads to emulate async disk I/O, which means tons of threads and overhead when you’re handling 10k-100k concurrent connections per box.


This is incorrect. Linux has had proper direct async disk I/O for a decade or more, used ubiquitously in database engines (among other things). It is not emulated with threads.

Last I looked (~4.4), Linux AIO implied DIO. Conflating AIO and DIO is the problem, not a feature. On FreeBSD, AIO works with the page cache for read and write, read-ahead works, sendfile works, and IO, cache, read-ahead and size hints all work. Linux has half of those, and with DIO none. As I recall.

Bingo. Async disk IO on Linux has to be unbuffered and block aligned, which makes it useful only for databases that manage their own caching, and useless for file systems.

A video CDN needs lots of concurrent access to a file system.
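To make the "block aligned" constraint concrete, here is a minimal sketch of the arithmetic a caller ends up doing before issuing a direct (unbuffered) read. It assumes the usual O_DIRECT rule that offset and length must be multiples of the device's logical block size. The helper name `align_request` and the 4096-byte default are illustrative assumptions, not taken from the thread or from any library.

```python
from typing import Tuple


def align_request(offset: int, length: int, block_size: int = 4096) -> Tuple[int, int]:
    """Widen (offset, length) to the smallest enclosing block-aligned range.

    Hypothetical helper: block_size=4096 is an assumed logical block size;
    a real caller would query the device or filesystem for it.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a positive power of two")
    mask = block_size - 1
    start = offset & ~mask                  # round the start down to a block boundary
    end = (offset + length + mask) & ~mask  # round the end up to a block boundary
    return start, end - start
```

A 100-byte read at offset 5000 becomes a full 4096-byte read at offset 4096, and the caller then has to slice out the bytes it actually wanted. With the page cache in the path, the kernel does this rounding (and the caching of the extra bytes) for you.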

That's not really true anymore. Linux AIO works fine on XFS.

http://man7.org/linux/man-pages/man2/io_submit.2.html

Isn’t that still experimental?

From the current aio man page:

“Work has been in progress for some time on a kernel state-machine-based implementation of asynchronous I/O (see io_submit(2), io_setup(2), io_cancel(2), io_destroy(2), io_getevents(2)), but this implementation hasn't yet matured to the point where the POSIX AIO implementation can be completely reimplemented using the kernel system calls.”

Not true. I've regularly demonstrated very high throughput/connectivity with lots of little connections. The problem I have seen (not only with Linux) has been over-aggressive congestion controls, usually configured or set wrong.

On high performance async IO, this works quite well in Linux, and there are no blocking threads that I am aware of in that stack. The kernel uses bio dispatches to perform the actual block IO. If you are complaining about using bio to perform the actual IO, and that Linux includes this in its load calculation, sure, that is a conscious decision, as I understand it, on the part of the block layer folks. Is it wrong or bad? I don't think so, though others have different opinions.

FWIW ... I work at a place now using SmartOS as its primary OS. There are many people I know preferring BSD. Many people preferring linux. I have a different view, one that is not as popular as I hope it would be.

Specifically, I look at operating systems now, largely, as an implementation detail for your stack. You have a mission in many cases, unless you are an OS developer, that consumes the OS services layers to help you perform your mission. In many cases, specifics of the OS don't matter, as long as they don't get in your way. Sometimes the specifics of the OS help you.

From my view as an HPC guy, a hardware guy, a storage/compute/ML/GPU guy, I generally can work in Linux and BSD without pain. Minor config differences, but I am comfortable in both.

I am not, and have not been, comfortable in AIX, HP-UX, and UNICOS. I used to enjoy IRIX until I started playing with Linux. I used Solaris and SunOS in the past, and SmartOS/illumos today.

As long as the OS has the tools I need, the libraries I need, or a way for me to build them, and doesn't constrain me or force me to contort to vagaries of the OS itself, I am fine with it.

A problem arises when people get caught up in "my OS > your OS", which this overall question brings in, at least under the covers. This usually comes around from various esoteric aspects of little relevance for the vast unwashed masses of users (like me). On the OS dev side, when this happens, it is usually defensive because something needed is missing, or some OS dev/manager (mis)believes that users don't actually need the features they are requesting.

That is actually a major problem, and it tends to drive people from your platform. Users aren't dumb, and there are many sophisticated people who have a deeper appreciation for the issues than "my OS > your OS".

Why *BSD isn't used might be for historical reasons, momentum, etc. It is perfectly fine as an OS, and quite usable for HPC. Similar for illumos/SmartOS (not simply saying that as I work for a company using SmartOS). There are missing things in both of these, and I am working (on the side) to try to help SmartOS get some of these things (user space stuff). FreeBSD in particular has most of what is needed.

Basically pick the system that works for you and your users. The OS, as I noted, can be viewed as a detail of the implementation. Or not.

But it's not a reason to create friction/tension between groups claiming OS1 > OS2 ...

The VI/Emacs wars are so 80s/90s ...

I think the issue tatersolid has with Linux AIO is the implicit DIO. That's really painful if you're working with HDDs or highly concurrent read scenarios. See my sibling comment for why.

That leads to people implementing “async io” threadpools in userland. Those threads then do “regular” blocking IO, which is able to use the page cache etc. Having hundreds or thousands of blocking IO threads then causes lots of other perf/scheduling issues.
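For anyone who hasn't seen the pattern, here is a rough sketch of what such a userland "async IO" threadpool looks like. It uses Python's standard `concurrent.futures.ThreadPoolExecutor` and `os.pread` (Unix only). The names `read_chunk`, `read_many` and `demo`, and the pool size of 8, are made up for illustration and are not how any particular server does it.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_chunk(path, offset, length):
    """Ordinary blocking, buffered read; it goes through the page cache."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


def read_many(path, requests, workers=8):
    """Submit each (offset, length) read to a worker thread.

    The caller only waits on futures, but every in-flight read still
    occupies one blocked thread, which is the overhead described above.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_chunk, path, off, n) for off, n in requests]
        return [f.result() for f in futures]


def demo():
    """Write a small temp file and read three ranges from it concurrently."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0123456789abcdef")
        path = tmp.name
    try:
        return read_many(path, [(0, 4), (10, 6), (4, 2)])
    finally:
        os.unlink(path)
```

The reads need no alignment and benefit from read-ahead, but the number of concurrent reads is capped by the number of threads, so scaling to tens of thousands of connections means either a very large pool or queueing behind it.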

It’s not just “my OS > your OS”: SmartOS is bulletproof when it comes to correctness of operation, data integrity and superior ease of system administration. That most prominently manifests itself in less breakage, the absence of problems caused on Linux by technology concepts from the ’80s, and nights slept through instead of being in conference calls with clueless managers screaming at one at 01:13 in the morning. These were all issues I have and have had with Linux which I don’t have with SmartOS. That’s a big difference!

An OS is a priori better if I get to sleep through the night without an incident.