by petrosagg 2445 days ago
balena founder here. balenaOS does use the hardware watchdog available on the Raspberry Pi to detect CPU lockups and automatically reset the board. On top of that we're also running software watchdogs that check the health of key system components and restart them if they become unhealthy.
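
For readers unfamiliar with the pattern, here is a minimal sketch of how a health-gated watchdog feeder can work on Linux. It is not balenaOS's implementation: the function name `pet_if_healthy` and the health checks are hypothetical. It relies only on the standard Linux watchdog device behaviour, where writing to `/dev/watchdog` resets the countdown and the board is reset if no write arrives before the timeout.

```python
import io
from typing import BinaryIO, Callable, Iterable


def pet_if_healthy(wd: BinaryIO, checks: Iterable[Callable[[], bool]]) -> bool:
    """Write one keepalive byte to the watchdog only if every check passes.

    `wd` is any writable binary file object. On a real device it would be
    the watchdog character device; here it is a parameter so the logic can
    be exercised without hardware. (Hypothetical helper, for illustration.)
    """
    if all(check() for check in checks):
        wd.write(b"\0")  # any write counts as a keepalive
        wd.flush()
        return True
    # Skip the keepalive: if this keeps happening, the timer expires
    # and the hardware resets the board.
    return False


if __name__ == "__main__":
    # Demo against an in-memory buffer instead of /dev/watchdog.
    fake_wd = io.BytesIO()
    print(pet_if_healthy(fake_wd, [lambda: True]))
    print(pet_if_healthy(fake_wd, [lambda: True, lambda: False]))
```

On real hardware the file object would come from something like `open("/dev/watchdog", "wb", buffering=0)`, and the function would be called in a loop with a sleep shorter than the watchdog timeout. Note that opening the device arms the timer, so this should only be tried on a board that is allowed to reboot.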

It's true that SD cards are known for getting corrupted. balenaOS separates the partitions of the device and keeps the userspace in a read-only one while keeping mutable OS state in a separate partition. We are very conscious of writing as infrequently as possible to the SD card for this reason. The partition that accepts the most writes is the one holding the user container, which will get written to during an update, and also in case the user container stores any data on the device.

I'm aware that SD cards will internally swap blocks and don't really care about partition boundaries, but assuming you're using an SD card with well-designed firmware, it shouldn't lose a block during wear leveling.

That said, the SD card problem is one of the reasons we designed balenaFin :)

1 comments

I've personally seen many dozens of SD cards fail such that they don't even know their own sector count anymore. My suspicion is that in-progress wear leveling operations aren't robust to sudden power loss (or to ejecting the card without unmounting it). Most likely firmware specific...
Yes, and this problem is exacerbated by the Raspberry Pi's microUSB power input. We've seen countless cases where a Pi exhibiting SD card corruption was also experiencing undervoltage. If you're using Raspberry Pis in production, it's paramount to have a good microUSB cable and power supply.
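
One way to check a Pi for the undervoltage condition mentioned above is to decode the output of `vcgencmd get_throttled`, which prints a line such as `throttled=0x50005`. The sketch below is only an illustration: the helper name `decode_throttled` and the label wording are made up here, while the bit positions follow the Raspberry Pi documentation for that command.

```python
from typing import List

# Bit positions as documented for `vcgencmd get_throttled`.
# Low bits describe the current state; bits 16+ are sticky since boot.
FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output: str) -> List[str]:
    """Turn a 'throttled=0x...' line into a list of human-readable flags."""
    # Take the hex value after '=' and parse it (the 0x prefix is accepted).
    value = int(output.strip().split("=", 1)[1], 16)
    return [label for bit, label in FLAGS.items() if value & (1 << bit)]


if __name__ == "__main__":
    # Sample string in the command's output format, not a real measurement.
    for flag in decode_throttled("throttled=0x50005"):
        print(flag)
```

On a device the input string would come from running the command, for example with `subprocess.run(["vcgencmd", "get_throttled"], capture_output=True, text=True).stdout`. A set bit 16 on a board that has shown SD card corruption is a strong hint to look at the cable and power supply first.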