It’s very easy to forget field upgrades of device firmware when you are designing embedded systems. As a developer you are probably flashing test firmware directly into a device, and it is awfully easy to slip into the mindset that once you have finished development the firmware will never need to be changed again. This is, of course, not even remotely true.
Before we look at the methods of getting new firmware into your device once it is out in the world, we should look at how that firmware runs in the first place. Typically the available permanent memory in the device (for simplicity let’s suppose it is flash) is split up into three or more areas or “partitions”. The first of these, the one executed at the start of time from the reset vector, is the bootloader. Often this will be the excellent U-Boot (https://u-boot.org), particularly when the firmware is a fully-fledged Linux system, but some manufacturers’ build systems and support libraries provide (and insist on using) their own bootloader. It’s not hard to “roll your own”, and this can be a good idea when you don’t need the flexibility or security of something like U-Boot. Some processors even have a small bootloader in ROM, though these are typically extremely limited in what they can do.
The bootloader’s job is to find the appropriate real firmware image to run, which is where the other partitions come in. Each partition may contain an image, prefixed with versioning and validation information. Absent any other instructions, the bootloader will check through the partitions, select the firmware with the highest version number that passes all the validation checks, and start running it as if that partition were what had been called from the reset vector. Ideally the real firmware never needs to know that the bootloader intervened.
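As a rough illustration of that selection step, here is a minimal sketch in C. The header layout, the `IMAGE_MAGIC` value, and the additive checksum are all assumptions made for the example; a real bootloader would more likely use a CRC or a cryptographic signature, and would read the headers from flash rather than from ordinary structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMAGE_MAGIC 0x46574D47u /* hypothetical "header present" value */

/* Hypothetical header stored in front of each firmware image. */
typedef struct {
    uint32_t magic;    /* IMAGE_MAGIC when a header has been written */
    uint32_t version;  /* higher number means newer firmware */
    uint32_t length;   /* payload length in bytes */
    uint32_t checksum; /* additive checksum of the payload (illustrative only) */
} image_header_t;

typedef struct {
    const image_header_t *header;
    const uint8_t *payload;
    size_t capacity; /* bytes available for the payload in this partition */
} partition_t;

static uint32_t payload_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}

/* A partition is bootable only if every validation check passes. */
static int partition_is_valid(const partition_t *p)
{
    if (p->header->magic != IMAGE_MAGIC) {
        return 0;
    }
    if (p->header->length == 0 || p->header->length > p->capacity) {
        return 0;
    }
    return payload_checksum(p->payload, p->header->length) == p->header->checksum;
}

/* Returns the index of the valid partition with the highest version,
 * or -1 if no partition passes validation. */
int select_boot_partition(const partition_t *parts, size_t count)
{
    assert(parts != NULL);
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!partition_is_valid(&parts[i])) {
            continue;
        }
        if (best < 0 || parts[i].header->version > parts[best].header->version) {
            best = (int)i;
        }
    }
    return best;
}
```

Note that a corrupt image never wins just by carrying a higher version number, since validation is checked before versions are compared. What the bootloader does when the function returns -1 (wait for a serial download, for instance) is a separate design decision.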
For Linux-based systems, partitions often come in pairs; one for the kernel and one for the file system. For our purposes we can treat them as a single unit.
This process can be pre-empted if you want to boot from an external source, usually via a serial or USB connection. This is convenient for development, particularly when you are likely to have a serial debug connector in place anyway, though programming via JTAG will usually be faster. However, neither of these is a method you are likely to want to use for field upgrades.
The basic mechanism of upgrading is simply one of taking a new image that has somehow been delivered to your device and putting it into a currently unused partition. There are several ways of doing this. Simplest is to suspend the entire running firmware and invoke the bootloader. This is the most practical solution if you or your customers have direct access to the device and can copy the new image to the bootloader somehow. U-Boot supports DFU (Device Firmware Upgrade) through USB access for exactly this purpose, and we have written custom serial downloaders to do the same thing. Typically this approach works best for products that are not network connected and which will not have their firmware upgraded often. Engine control units in cars, for example, have traditionally been upgraded only by a service engineer.
If your product is part of the Internet of Things or is otherwise network connected, the obvious way to proceed is to use that network connection to push a new image to the device. This absolutely must be secured properly. Network security for embedded devices is a complex subject, and the pros and cons of various approaches are constantly changing. For example, using TLS certificates with very long lifetimes has been a popular approach in the past, but the recent decision of the CA/Browser Forum to slowly reduce the maximum lifetime of a certificate to 47 days makes it a lot less attractive.
Assuming you have a secure mechanism for downloading new firmware, you then have to decide whether to wait until you have a full valid image before writing it to flash or to write it piece by piece as it arrives. Despite the obvious potential problems with faulty downloads, the latter approach is often the best bet, because firmware images are frequently larger than the amount of RAM that can be set aside to download them into. The trick here is to ensure that your partitions have a “valid” marker that you guarantee will not indicate the partition is valid until you explicitly set it. Exactly how this is most easily done will depend on how the flash memory in your device behaves.
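One way the “valid” marker idea might look is sketched below, using a RAM structure to stand in for a flash partition. It assumes NOR-style behaviour, where an erase leaves every byte as 0xFF and the marker word is the very last thing programmed. The names, the `MARKER_VALID` value, and the additive checksum are all hypothetical, and real code would go through your flash driver rather than `memset` and `memcpy`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PARTITION_SIZE 256u
#define MARKER_VALID   0x600DF00Du /* hypothetical "image is good" value */

/* RAM stand-in for a flash partition; erased flash reads as 0xFF. */
typedef struct {
    uint32_t valid_marker; /* stays 0xFFFFFFFF until explicitly set */
    uint32_t length;
    uint8_t data[PARTITION_SIZE];
} sim_partition_t;

typedef struct {
    sim_partition_t *part;
    size_t written;
} upgrade_t;

/* Erase the target partition, which also clears the valid marker. */
void upgrade_begin(upgrade_t *u, sim_partition_t *part)
{
    assert(u != NULL && part != NULL);
    memset(part, 0xFF, sizeof *part);
    u->part = part;
    u->written = 0;
}

/* Write one downloaded chunk straight to "flash". Returns 0 on success. */
int upgrade_write_chunk(upgrade_t *u, const uint8_t *chunk, size_t len)
{
    if (len > PARTITION_SIZE - u->written) {
        return -1; /* image would overflow the partition */
    }
    memcpy(&u->part->data[u->written], chunk, len);
    u->written += len;
    return 0;
}

/* Verify what was actually stored, then set the marker as the final write.
 * On any failure the marker is left erased, so the image is never booted. */
int upgrade_finish(upgrade_t *u, uint32_t expected_sum)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < u->written; i++) {
        sum += u->part->data[i];
    }
    if (u->written == 0 || sum != expected_sum) {
        return -1;
    }
    u->part->length = (uint32_t)u->written;
    u->part->valid_marker = MARKER_VALID;
    return 0;
}

int partition_marked_valid(const sim_partition_t *p)
{
    return p->valid_marker == MARKER_VALID;
}
```

If power is lost or the download fails at any point before `upgrade_finish` succeeds, the marker still reads as erased and the bootloader simply ignores the partition.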
There is a strong argument for keeping this upgrade code in the bootloader, or at least that part of it that writes to flash and validates the images. The bootloader will need to know the validation method anyway in order to select a partition to boot from. The bootloader should also be some of the best tested code in the system, being relatively small and self-contained, making it less likely that bugs will creep into the upgrade system.
That’s a brief summary of how you can get downloaded firmware into your device, but how does it know there is firmware to download? There are a variety of methods depending on how smart your device is and how much user interaction it has. The basic mechanism usually involves querying some central resource, for example a field of an AWS device shadow, to see if something new is available. The device can then retrieve (“pull”) the new firmware from a known (nominally) safe location and apply it as appropriate. Security is obviously a primary concern, and as previously mentioned that is something of a moving target in the embedded world.
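The “is something new available?” decision at the heart of a pull system can be very small. The sketch below assumes that versions are exchanged as dotted `major.minor.patch` strings, which is purely an assumption for the example. How the advertised string actually reaches the device (a device shadow field, an HTTPS request, or something else) is deliberately left out.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    unsigned major;
    unsigned minor;
    unsigned patch;
} fw_version_t;

/* Parse "major.minor.patch". Returns 0 on success, -1 on malformed input. */
int parse_version(const char *text, fw_version_t *out)
{
    assert(text != NULL && out != NULL);
    char trailing;
    /* A fourth conversion succeeding means there was junk after the patch. */
    int n = sscanf(text, "%u.%u.%u%c",
                   &out->major, &out->minor, &out->patch, &trailing);
    return n == 3 ? 0 : -1;
}

/* Returns 1 only if the advertised version is strictly newer than the
 * running one. Anything unparseable is treated as "no upgrade". */
int upgrade_available(const char *running, const char *advertised)
{
    fw_version_t r, a;
    if (parse_version(running, &r) != 0 || parse_version(advertised, &a) != 0) {
        return 0;
    }
    if (a.major != r.major) {
        return a.major > r.major;
    }
    if (a.minor != r.minor) {
        return a.minor > r.minor;
    }
    return a.patch > r.patch;
}
```

Comparing the fields numerically rather than as strings matters: `upgrade_available("1.4.2", "1.10.0")` reports an upgrade, which a plain string comparison would get wrong. A newer version number is only a reason to start a download; the image itself still has to pass validation before it is trusted.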
An alternative approach involves “pushing” the upgrade, actively sending it to the device rather than passively waiting for the device to ask. This approach makes sense in many circumstances, particularly where a device has no real user interface, but has its own security risks. The device needs to be absolutely certain that upgrade information is coming from a trusted source. Again this is a complex subject with many disagreements about the best method of securing the connection, and we don’t propose to go into any detail here.
We have tended to favour “pull” upgrade systems for connected devices, largely because the process is straightforward to manage and gives the embedded device the flexibility to upgrade when it is not otherwise busy. Whatever method you choose, you should plan out roughly how it is going to work early in the design stage. Every mechanism will have its own costs and benefits, some more obvious than others, and the sooner you have a specification the sooner you can evaluate those costs and benefits. It may even help to bring home the point that these are ongoing costs, ones that will have to be met for the lifetime of the device.
In-field upgrades are a subject that often gets very little consideration at the start of the project. Don’t push them aside like that; give the system some proper thought and you will have far fewer nasty surprises as development goes on.
