by AlotOfReading
455 days ago
Most companies don't do this because reliable updates aren't one of their organizational priorities. The infrastructure is usually custom-built and maintained by a couple of folks who have a dozen other responsibilities they're told are more important. Testing is usually limited by hardware availability and release velocity. "One of every board revision we've ever produced" simply isn't available, and waiting two days to run through every firmware version before you release an update is a non-starter with the PMs. There are commercial offerings (like mender.io, which I've never used) that basically specialize in providing rock-solid update infrastructure, but that again takes investment and organizational priority that doesn't exist for non-feature code.

I'm trying to buck the trend, though: on the new embedded system I'm working on, I've specifically designed the upgrade system to be as reliable as I can make it. It goes something like this:
- The new firmware is downloaded to the secondary application slot.
- Just prior to rebooting, the entire state data of the system is serialized as a document and stored on a flash partition.
- The upgrade flag is set, the system reboots and MCUboot does its thing.
- The new firmware finds out an upgrade happened, clears out all the data partitions, restores from the document and then clears out its partition.
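A toy model of those four steps, sketched in Python for brevity rather than in firmware C. Everything here is an assumption for illustration: the `Device` class, its field names, JSON as the document format, and reading "its partition" as the partition that holds the document. The slot swap in `reboot` is only a stand-in for what MCUboot does.

```python
import json


class Device:
    """Hypothetical in-memory model of the flash layout described above."""

    def __init__(self, firmware, state):
        self.primary = firmware      # running application slot
        self.secondary = None        # download slot for the new image
        self.upgrade_flag = False
        self.data = dict(state)      # live data partitions
        self.snapshot = None         # partition holding the serialized document

    def stage_upgrade(self, new_firmware):
        self.secondary = new_firmware           # 1. download to the secondary slot
        self.snapshot = json.dumps(self.data)   # 2. serialize state just before reboot
        self.upgrade_flag = True                # 3. request the swap on next boot

    def reboot(self):
        if self.upgrade_flag and self.secondary is not None:
            # Stand-in for the bootloader swapping the two slots.
            self.primary, self.secondary = self.secondary, self.primary
            self.upgrade_flag = False
            self.restore_after_upgrade()

    def restore_after_upgrade(self):
        document = json.loads(self.snapshot)
        self.data.clear()            # 4a. wipe every data partition
        self.data.update(document)   # 4b. rebuild state from the document only
        self.snapshot = None         # 4c. erase the document's partition


dev = Device("v1", {"hostname": "card-7"})
dev.stage_upgrade("v2")
dev.data["stale_cache"] = "junk"     # anything not in the document is discarded
dev.reboot()
```

Because the state is rebuilt only from the document, leftovers such as `stale_cache` do not survive the upgrade.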
The system is basically sanitized and restored after each upgrade. The same code path also handles saving and restoring the system's configuration by the end user, as well as settings management. If the document schema is for an older version, the N-to-N+1 schema upgraders are run on it before it is applied, instead of trying to patch the system in place. If something goes horribly wrong, flip a jumper to trigger the heavy-duty sanitization that nukes the entire external flash (internal flash only contains the bootloader, primary application slot and factory parameters, so it's essentially read-only once the application boots).
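The N-to-N+1 chaining could look roughly like this, again in Python rather than firmware C. The `schema` field, the two upgraders and the fields they touch are all made up for the example; the only idea taken from the description above is that each upgrader moves a document forward by exactly one version and that they are applied in order before the restore.

```python
def v1_to_v2(doc):
    # Hypothetical change: "name" was renamed to "hostname".
    out = dict(doc)
    out["hostname"] = out.pop("name")
    out["schema"] = 2
    return out


def v2_to_v3(doc):
    # Hypothetical change: a new setting gets a default value.
    out = dict(doc)
    out.setdefault("ntp_enabled", False)
    out["schema"] = 3
    return out


# One upgrader per source version; each bumps the schema by exactly one.
UPGRADERS = {1: v1_to_v2, 2: v2_to_v3}
CURRENT_SCHEMA = 3


def migrate(doc):
    """Run each N-to-N+1 upgrader in order until the document is current."""
    version = doc["schema"]
    if version > CURRENT_SCHEMA:
        raise ValueError(
            f"document schema {version} is newer than firmware schema {CURRENT_SCHEMA}"
        )
    while version < CURRENT_SCHEMA:
        doc = UPGRADERS[version](doc)
        version = doc["schema"]
    return doc


old_document = {"schema": 1, "name": "card-7"}
restored = migrate(old_document)
```

Each upgrader only has to know about one step, so a document from any older firmware takes the same path as one from the previous release, just with more hops.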
It might be hubris, but I hope it's good enough that I'll never see a bricked card that can't be resurrected by a factory reset with this project (assuming no hardware damage, no internal flash corruption and no bricking firmware slipping through the cracks and getting signed with production keys despite all the checks in place).