by nkurz 4862 days ago
He suggests an interesting approach.

1) Tell the kernel it only has a limited set of cores to work with.

The way to fix Snort’s jitter issues is to change the Linux boot parameters. For example, set “maxcpus=2”. This will cause Linux to use only the first two CPUs of the system. Sure, it knows other CPU cores exist, it just will never by default schedule a thread to run on them.

2) Manually schedule your high priority process onto a reserved core.

Then what you do in your code is call the “pthread_setaffinity_np()” function to put your thread on one of the inactive CPUs (there is a Snort configuration option to do this per process). As long as you manually put only one thread per CPU, it will NEVER be interrupted by the Linux kernel.

3) Turn off interrupts to keep things as real time as possible.

You can still get hardware interrupts, though. Interrupt handlers are really short, so they probably won’t exceed your jitter budget, but if they do, you can tweak that as well. Go into “/proc/irq/smp_affinity” and turn off the interrupts in your Snort processing threads.

4) Profit?

At this point, I’m a little hazy at what precisely happens. What I think will happen is that your thread won’t be interrupted, not even for a clock cycle.

Can anyone remove the haziness? I'm more interested in this for benchmarking than performance, and wonder how it compares to other ways of increasing priority like "chrt". Is booting with a low "maxcpus" necessary, or can the same be done at runtime?
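
For reference, step 2 can be tried from a script before touching any C code. This is a minimal sketch, assuming Linux and Python 3. `os.sched_setaffinity` wraps the `sched_setaffinity(2)` syscall, which is swapped in here for the `pthread_setaffinity_np()` call the post names; `pin_to_cpu` is a hypothetical helper name. The target CPU has to be online and in the caller's allowed set, otherwise the call raises `OSError`.

```python
import os


def pin_to_cpu(cpu):
    """Pin the calling thread to a single CPU and return the resulting affinity set."""
    # pid 0 means "the calling thread" for sched_setaffinity(2).
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)
    # Illustrative choice only: take the highest-numbered CPU we are allowed to use.
    target = max(allowed)
    print("pinned to", pin_to_cpu(target))
```

Pinning by itself only stops this thread from migrating. Whether anything else gets scheduled onto the same CPU depends on how the rest of the system is set up.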

3 comments

I think he has some iffy premises. I'm not certain, but I guess pthread mutexes probably use userspace mutexes under the covers. If they don't, you could select a library that does. Grabbing uncontended mutexes should be basically free. It is definitely not a good idea to write all your own abstractions, as he suggests, unless you have a really great understanding of the issues involved. You'd probably still get it wrong sometimes even with that.

Affinity is definitely valuable, but I don't think you should need to disable interrupts for most applications. I'm not even sure if it is generally possible, since interrupts are sometimes used to swap pages or notify the kernel that a page isn't mapped or that a blocked resource is now available. The reason affinity is valuable is not because of kernel interactions. It's because of NUMA and cpu cache swapping. Affinity can prevent thread migration, which is expensive mainly because data also has to be migrated or else accessed in a less efficient manner. Likewise, make sure that if you dispatch an asynchronous call, the handler runs on the same core you sent the call from.

Finally, it's a common fallacy in these kinds of posts to act as if threads can't be used to do shared-little or shared-nothing-style multi-programming. They often aren't, but there's nothing that prevents it.

> I'm not certain, but I guess pthread mutexes probably use userspace mutexes under the covers.

Correct. man 7 pthreads states:

> In NPTL, thread synchronization primitives (mutexes, thread joining, and so on) are implemented using the Linux futex(2) system call.

A futex is a mutex that only does a system call if there is contention; you can google it.

"Fuss, Futexes and Furwocks: Fast Userlevel Locking in Linux": http://kernel.org/doc/ols/2002/ols2002-pages-479-495.pdf
My blog post was quite clear that I'm talking about futexes, and that when a thread fails to get a lock it goes into the kernel. Seriously, that's how everyone did it, from Solaris to Windows, before Linux "invented" the concept, AND it's been in Linux for a decade. When somebody says "mutex", you have to assume they already mean "futex".
If you fail to get the lock, then you have to wait on the lock, so you might as well go into the kernel anyway. That's the reason futexes do that, instead of just letting the thread spin.

True, there are scenarios when you might prefer to spin, but they're pretty specialized.

> My blog post was quite clear that I'm talking about futexes

No, you never mention the word "futex."

> When somebody says "mutex", you have to assume they already mean "futex".

I don't think that's a valid generalization at all.

The entire point of the post was to talk about "lock-free" algorithms where two cores can make forward progress without either having to wait or spin.

What systems have mutexes that aren't built like futexes?

Kernels, and probably any kind of parallel user-level runtime like Erlang.
> I don't think that's a valid generalization at all.

Especially given he was complaining about syscall costs . . . which are nonexistent with futexes . . .

They are NOT non-existent, they happen a lot under contention. That's why code fails to scale: as you add CPUs, contention happens a lot more often, and the number of syscalls shoots through the roof.
If there is contention, I think you mean, right?
Yeah, edited to fix it. Just a typo.
Setting a thread (or process) to a higher real-time priority than everything else (i.e., with chrt) will accomplish the same thing he's describing with maxcpus (though I wasn't aware maxcpus worked the way he said, so I can't vouch for it myself).

It's also just better, because something lower priority can run whenever the thread with real-time priority suspends (e.g., waiting for disk or network I/O).
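
What `chrt -f` does to a process can also be requested from inside the process. A hedged sketch, assuming Linux and Python 3: `try_realtime` and the priority value 50 are illustrative choices, not anything from the article. The call normally needs root or CAP_SYS_NICE, so the sketch reports failure instead of crashing.

```python
import os


def try_realtime(priority=50):
    """Try to move the calling thread to SCHED_FIFO; return True on success."""
    try:
        # pid 0 means "the caller". SCHED_FIFO priorities run from 1 to 99 on Linux.
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Typically EPERM when the caller lacks CAP_SYS_NICE.
        return False
    return True


if __name__ == "__main__":
    if try_realtime():
        print("now running under SCHED_FIFO")
    else:
        print("kernel refused; try as root or via chrt")
```

Note that this only changes the scheduling policy. It says nothing about which CPU the thread runs on, so affinity still has to be set separately if that matters.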

I don't know the process for making sure interrupts don't happen on a particular CPU in Linux, but if you do that, you can only be interrupted by SMIs (system management interrupts), which are done by the BIOS, not the OS. Some machines don't have SMIs, though. It's BIOS specific.

If you do have them, you can't necessarily tell by using the TSC (as the author wanted to do), because poorly-behaved BIOSes will attempt to hide that the SMI happened by fiddling with the TSC.
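
On the question of keeping device interrupts off a particular CPU: on Linux each IRQ has a file `/proc/irq/<N>/smp_affinity` holding a hex bitmask of the CPUs allowed to service it. Below is a small sketch, assuming Python 3, that only computes the mask string; `irq_affinity_mask` is a hypothetical helper, and actually writing the value needs root and is left as a comment.

```python
def irq_affinity_mask(total_cpus, reserved):
    """Return a hex CPU mask covering every CPU except the reserved ones.

    The result uses comma-separated 32-bit groups, most significant first.
    """
    mask = 0
    for cpu in range(total_cpus):
        if cpu not in reserved:
            mask |= 1 << cpu
    groups = []
    for _ in range(max(1, (total_cpus + 31) // 32)):
        groups.append(mask & 0xFFFFFFFF)
        mask >>= 32
    groups.reverse()
    head = format(groups[0], "x")
    tail = [format(g, "08x") for g in groups[1:]]
    return ",".join([head] + tail)


if __name__ == "__main__":
    # Example: 4 CPUs, with CPUs 2 and 3 reserved for the pinned threads.
    print(irq_affinity_mask(4, {2, 3}))
    # As root, the string would then be written to /proc/irq/<N>/smp_affinity
    # for each IRQ number N that should stay off the reserved CPUs.
```

Some interrupts cannot be moved this way, and the write fails for those, so treat the mask as a request and check what the file reads back.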

> will accomplish the same thing

I haven't worked with chrt this way, but I have done the maxcpus approach mentioned in the article - with the explicit affinity combined with maxcpus you can be sure that a process/thread won't jump around to another free CPU if it goes out to disk and is then replaced with another process. From my cursory reading of chrt, I don't see anything about CPU affinity, so this is still a possibility, right? Am I missing a constraint in the Linux scheduler?

I suppose they're the same in that you can use both to basically tell the kernel to "not schedule anything that would interact with a given process in any way that would affect its execution timeline".

maxcpus needs to be set at boot time. You can turn CPUs on/off logically at runtime but that's not equivalent to "take this CPU out of the default scheduler rotation", it's equivalent to "power it off, I may unplug the entire CPU and swap it out". The latter would seem fancier than the former, so not sure why there isn't the means to set maxcpus at runtime, but there doesn't appear to be.
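
The runtime on/off switch mentioned here is the per-CPU `online` file under `/sys/devices/system/cpu/`. A minimal sketch for inspecting it, assuming Linux and Python 3; `cpu_online_state` is a hypothetical helper, and the `root` parameter exists only so the function can be pointed at a test directory.

```python
from pathlib import Path


def cpu_online_state(cpu, root="/sys/devices/system/cpu"):
    """Return True/False from cpu<N>/online, or None if that file does not exist."""
    state_file = Path(root) / f"cpu{cpu}" / "online"
    if not state_file.exists():
        # Some CPUs (often cpu0) expose no online file and cannot be switched off.
        return None
    return state_file.read_text().strip() == "1"


if __name__ == "__main__":
    for cpu_dir in sorted(Path("/sys/devices/system/cpu").glob("cpu[0-9]*")):
        number = int(cpu_dir.name[3:])
        print(cpu_dir.name, cpu_online_state(number))
```

Writing `0` or `1` to the same file (as root) is what takes a CPU offline or brings it back, which is the hotplug behaviour described above and not a way of reserving the CPU.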

The author does gloss over the fact that if you have a hyperthreading-enabled processor, you want to make sure you either turn it off or don't schedule a pair of processes on different logical cores of the same physical core (a hyperthread pair), because you will see a slowdown: the two logical cores are actually sharing boatloads of state.
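
To see which logical CPUs share a physical core, Linux exposes `topology/thread_siblings_list` for each CPU in sysfs. A hedged sketch, assuming Linux and Python 3; `parse_cpu_list` and `thread_siblings` are hypothetical helper names. The file uses the usual kernel CPU-list syntax such as `0,4` or `0-1`.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list like '0-3,8' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def thread_siblings(cpu):
    """Return the logical CPUs sharing a physical core with `cpu`, itself included."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    return parse_cpu_list(path.read_text())


if __name__ == "__main__":
    probe = Path("/sys/devices/system/cpu/cpu0/topology/thread_siblings_list")
    if probe.exists():
        print("cpu0 shares a core with:", sorted(thread_siblings(0)))
```

If two pinned threads land on CPUs that appear in the same sibling set, they are on one hyperthread pair and will contend for that core's resources.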