by rleigh 1179 days ago
No, the consistency of the timing is terrible on Linux.

Seriously, stick a scope or logic analyser on e.g. an I2C line and look at the timing consistency. Even on specialised kernels for realtime use, you can have variable timing delays between each transaction on the bus. And this is all in-kernel stuff that's inconsistent--it looks like it's getting pre-empted during a single I2C_RDWR transaction between receipt of one response and sending of the next message. The actual transmission timing under control of the hardware peripheral is really tight, but the inter-transmission delays are all over the place. Compare it with an MCU where the timing is consistent and accurate, and it's night and day.


The parent comment says

> control a mechanism and reliably react on a deadline of a few ms

I actually did measure this with an oscilloscope on embedded Linux (not a raspberry pi). A PPS signal was fed into Linux, and in response to the interrupt Linux sent a tune command to a radio. Tuning the radio itself had some unknown latency.

End-to-end, including the unknown latency of tuning the radio, I never observed a latency that would even round to 1 ms. That's unpatched and untuned Linux, no PREEMPT_RT. I didn't dig any further because it met our definition of "reliable" and was well, well within our timing budget.

I'll be the first to admit it wasn't some kind of rigorous test, just a casual characterization. I would not suggest anyone use Linux for a pacemaker, airplane flight controller, etc.

This is making me itch to buy an oscilloscope and run some more thorough tests. I'd like to see how PREEMPT_RT, system load, etc. change things.
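Short of a scope, a rough software-only look at scheduling jitter is possible. The sketch below is not the test described above (that used a PPS interrupt and a radio); it only measures how late a userspace process wakes up from a periodic sleep, in the spirit of what cyclictest does. The function name and the default period are made up for illustration.

```python
import time


def measure_wakeup_overshoot(period_s=0.001, samples=200):
    """Return how late each periodic wakeup was, in microseconds."""
    overshoots_us = []
    next_deadline = time.perf_counter() + period_s
    for _ in range(samples):
        # Sleep until the next absolute deadline (skip if already past it).
        delay = next_deadline - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
        now = time.perf_counter()
        overshoots_us.append((now - next_deadline) * 1e6)
        # Advance by a fixed period so lateness does not accumulate as drift.
        next_deadline += period_s
    return overshoots_us


if __name__ == "__main__":
    late = measure_wakeup_overshoot()
    print(f"max overshoot: {max(late):.1f} us over {len(late)} wakeups")
```

This only sees the scheduler and timer path, not interrupt-to-pin latency, so it complements a scope rather than replacing one. Running it idle and then under heavy load would show how much the tail moves.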

My profiling was on an NXP i.MX8 MPU, which is an A-profile quad-core SoC very similar to an RPi. I think it was with a PREEMPT_RT kernel, though I can't guarantee that. Either way, I was fairly shocked at the lack of consistency in I2C timing when doing fairly trivial tasks (e.g. a readout of an EEPROM in a single I2C_RDWR request). You wouldn't see this when doing the equivalent on an M-profile MCU with a bare metal application or an RTOS.

What is acceptable does of course depend upon the requirements of your application, and for many applications Linux is perfectly acceptable. However, for stricter requirements Linux can be a completely inappropriate choice, as can A-profile cores. They are not designed or intended for this type of use.

Profiling this stuff is a really interesting challenge, particularly statistical analysis of all of the collected data to compare different systems or scenarios. I've seen some really interesting behaviours on Linux when it comes to the worst-case timings, and they can occasionally be shockingly bad.
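As a minimal sketch of that kind of analysis, assuming the latencies have already been captured as a list of microsecond values (the function name and the choice of percentiles are mine, not from any particular tool):

```python
import math
import statistics


def summarize_latencies(samples_us):
    """Summarize latency samples with the median, tail percentiles and max."""
    if not samples_us:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_us)

    def percentile(p):
        # Nearest-rank percentile: smallest sample with at least p% at or below it.
        rank = max(1, math.ceil(len(ordered) * p / 100))
        return ordered[rank - 1]

    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p99": percentile(99),
        "p99.9": percentile(99.9),
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Hypothetical data: 999 well-behaved samples and one bad outlier.
    print(summarize_latencies([100] * 999 + [5000]))
```

The point of reporting the max next to the median is that a system can look excellent on average while a single rare outlier blows the deadline. Comparing two systems on the median alone would hide exactly the behaviour described above.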

I was referring to that, yes. Even if Linux performs well in the ideal case, it's not necessarily reliable, and the possible problems are hard to compensate for.

E.g., your process can randomly get stuck because something in the background is checking for updates and IO is much slower than usual, or because the system ran out of RAM and everything got bogged down by swap.

On a microcontroller you just don't have anything else running, so those risks don't exist. E.g., a 3D printer controls a MOSFET to enable/disable the heaters. The system can overheat and actually catch on fire if something makes the software get bogged down badly enough. On a Linux system there's a whole bunch of stuff that can go wrong, most of which is completely outside the software you actually wanted to run.

I guess I feel like things are a bit tangled up here.

Sure, a single purpose MCU controlling a heater MOSFET has a lot fewer failure modes than a Linux device doing the same.

I don't dispute there are a lot fewer ways it's even possible for that system to misbehave.

The original comment was recommending ESP32s over Raspberry Pis for DIY projects like opening your curtains or flashing LEDs. The ESP IDF runs on FreeRTOS, so we're already moving away from the bulletproof single task MCU. People will almost certainly be adding some custom-rolled HTTP webserver on top. They might be leaking memory all over the place, and there are probably all kinds of interrupts they have no idea about firing off in the background. I wouldn't trust an ESP32 curtain-bot not to strangle me any more than I'd trust a Raspberry Pi based one.

Your example about running out of RAM seems just as relevant to MCUs. You can leak memory and crash an MCU. You can overload an MCU with tasks and degrade performance. You can use cgroups or ulimit to help prevent a bad process from bringing Linux down.
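To make the ulimit point concrete: `ulimit -v` is a front end for `setrlimit(RLIMIT_AS, ...)`, and Python exposes the same call through its `resource` module. A rough sketch, assuming Linux, with a made-up helper name and arbitrary limits:

```python
import resource
import subprocess
import sys


def run_with_memory_limit(code, limit_bytes):
    """Run a Python snippet in a child process with a capped address space."""

    def apply_limit():
        # Runs in the child just before exec; same effect as `ulimit -v`.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    # preexec_fn is not safe in multi-threaded parents; fine for a sketch.
    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limit,
        capture_output=True,
    )


if __name__ == "__main__":
    limit = 512 * 1024 * 1024  # 512 MiB, arbitrary
    greedy = run_with_memory_limit("x = bytearray(2 * 1024**3)", limit)
    print("greedy child exit code:", greedy.returncode)
```

The runaway child fails its own allocation instead of pushing the whole box into swap. cgroups give finer control (they limit resident memory per group rather than per-process address space), but this is the cheapest version of the idea.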

I agree that Linux is not going to be as reliable as going bare metal, and I'm not recommending you use it as a motor controller. But even the most reliable MCU can fail. An MCU can get hit by cosmic rays or ESD. People might spill water on the 3D printer or physically damage it. It's not even a binary "works right or dies" thing. I've voltage glitched MCUs to get them to skip instructions and get into an unanticipated state.

In any case, the best path to safety is to imagine that the computer might be taken over by Skynet and will do everything in its power to kill you. Or worse, ruin your print. If safety is the goal, it's probably best achieved by requiring the computer system to take some positive action to keep the heater on. Or even better, by a feedback safety mechanism like a thermal fuse.
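The "positive action" idea is basically a dead-man's switch: the heater is only allowed on while the controller keeps sending heartbeats. A toy model of the logic, with a hypothetical class name and an injectable clock so it can be exercised without waiting:

```python
import time


class HeaterDeadman:
    """Allow the heater only while heartbeats keep arriving in time."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_kick = None  # no heartbeat yet, so the heater stays off

    def kick(self):
        # The control software calls this every loop to prove it is alive.
        self._last_kick = self._clock()

    def heater_allowed(self):
        if self._last_kick is None:
            return False
        return (self._clock() - self._last_kick) < self._timeout_s


if __name__ == "__main__":
    now = [0.0]  # fake clock for demonstration
    deadman = HeaterDeadman(timeout_s=2.0, clock=lambda: now[0])
    deadman.kick()
    now[0] = 2.5  # controller hung for 2.5 s
    print("heater allowed:", deadman.heater_allowed())
```

In a real printer this logic would need to live outside the software it is guarding, e.g. in a hardware watchdog or a retriggerable timer circuit driving the MOSFET gate. Running it in the same process only models the behaviour, and it doesn't replace the thermal fuse.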