
...

the bootloader code
two identical bootloader config pages
the application code
(also: a section for the calib flash page, should be in the normal linker script too)

The two config pages are for redundancy. We will use the persist module to manage storing the config blob in those pages. A CRC (cyclic redundancy check, a fast checksum) will be stored along with the blob to verify its integrity; if one page holds an invalid blob, we overwrite it with the blob from the other page. This way we always have a valid config page as long as both pages aren't corrupted at once, which matters because fixing an invalid config would otherwise require manually reflashing the bootloader onto the board.
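The repair rule can be modelled off-target in a few lines of Python. This is only a sketch under assumptions: the on-page layout (2-byte length, 4-byte CRC32, then the blob), the use of `zlib.crc32`, and the function names are all hypothetical, and the real firmware would go through the persist module rather than this code.

```python
import struct
import zlib

HEADER_SIZE = 6  # assumed layout: 2-byte length + 4-byte CRC32, little-endian


def pack_page(blob: bytes) -> bytes:
    """Serialize a config blob with its length and CRC32 in front."""
    return struct.pack("<HI", len(blob), zlib.crc32(blob)) + blob


def unpack_page(page: bytes):
    """Return the stored blob if its CRC matches, otherwise None."""
    if len(page) < HEADER_SIZE:
        return None
    length, crc = struct.unpack_from("<HI", page)
    blob = page[HEADER_SIZE:HEADER_SIZE + length]
    # An empty or truncated blob is treated as invalid (assumption).
    if length == 0 or len(blob) != length or zlib.crc32(blob) != crc:
        return None
    return blob


def repair_pages(page_a: bytes, page_b: bytes):
    """Overwrite an invalid page with the valid one and return both pages."""
    a_ok = unpack_page(page_a) is not None
    b_ok = unpack_page(page_b) is not None
    if a_ok and not b_ok:
        return page_a, page_a
    if b_ok and not a_ok:
        return page_b, page_b
    # Both valid or both invalid: this sketch leaves the pages untouched.
    return page_a, page_b
```

An erased page (all `0xFF`) fails the check in this model because its claimed length runs past the end of the page. What to do when both pages are valid but differ is not covered here.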

...


ID   Name      Current Project     Info   Git Version
5    newton    centre_console             f8df7d2-clean
2    galileo   bms_carrier                23daff3-dirty
8    maxwell   power_distribution  front  6a4a7bb-dirty
11   einstein  steering                   bc869ef-clean
7    curie     power_distribution  rear   c6e8925-clean
16   hawking   mci                        798fe65-dirty
4    faraday   pedal_board                5131f78-clean
9    turing    charger                    9e6987b-dirty

(This might be even more useful if we added branch names. The client can also use git to look up the branch name from the commit hash and display it automatically.)

This command can also be used to implement pattern-matching for all of the following commands. To implement pattern-matching, all that’s technically required is a list of the IDs that match a pattern, but the extra information can be used to display a warning before potentially dangerous commands like flash, or just to list the boards that a command applies to, like this:

...

Datagram protocol version (1 byte) - a constant, initially 0x00. Versions that don’t match should be silently ignored. Useful for backwards compatibility in the future.
CRC32 of the whole datagram after this point (4 bytes)
Datagram type ID (1 byte) - an ID specifying what the datagram is and how the data field is formatted, like a command ID. Sort of like a babydriver ID.
Number of node IDs / controller board IDs addressed (n) (1 byte) - the special value 0 means every controller board / node on the network should receive the datagram.
List of node IDs (n bytes, 1 byte per node ID)
Data size in bytes (m) (2 bytes) - this value could physically go up to 65535, but the STM32F072 only has 16KiB of RAM, which has to hold all of the data plus the stack, global variables, etc. So, the data size MUST be less than or equal to 2048 bytes (2KiB). This is an arbitrary limit which is subject to change, but this value lets us transfer a whole 2KiB flash page in one datagram.
Data (m bytes)

Note: all multi-byte integers (i.e. the CRC32 and data size in bytes) are in little-endian order, with the least significant byte first. (This is the default on STM32.)
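The field layout above can be sketched on the client side with Python's `struct` and `zlib` modules. This is a minimal illustration, not the final implementation: the function names are hypothetical, and using `zlib.crc32` (the standard CRC-32) is an assumption, since the exact CRC32 variant isn't specified here. Dropping datagrams with a bad CRC is likewise assumed; the spec above only says that mismatched versions are silently ignored.

```python
import struct
import zlib

PROTOCOL_VERSION = 0x00
MAX_DATA_SIZE = 2048  # limit from the field description above


def pack_datagram(type_id: int, node_ids: list, data: bytes) -> bytes:
    """Serialize a datagram; an empty node_ids list addresses every node."""
    if len(node_ids) > 255:
        raise ValueError("at most 255 node IDs fit in the 1-byte count")
    if len(data) > MAX_DATA_SIZE:
        raise ValueError("data size must be <= 2048 bytes")
    # Everything after the CRC field, in little-endian order.
    body = (
        struct.pack("<BB", type_id, len(node_ids))
        + bytes(node_ids)
        + struct.pack("<H", len(data))
        + data
    )
    return struct.pack("<BI", PROTOCOL_VERSION, zlib.crc32(body)) + body


def unpack_datagram(raw: bytes):
    """Return (type_id, node_ids, data), or None if the datagram is ignored."""
    if len(raw) < 9:  # version + CRC32 + type + count + data size
        return None
    version, crc = struct.unpack_from("<BI", raw)
    body = raw[5:]
    if version != PROTOCOL_VERSION or zlib.crc32(body) != crc:
        return None
    type_id, n = struct.unpack_from("<BB", body)
    if len(body) < 4 + n:
        return None
    node_ids = list(body[2:2 + n])
    (size,) = struct.unpack_from("<H", body, 2 + n)
    if size > MAX_DATA_SIZE or len(body) != 4 + n + size:
        return None
    return type_id, node_ids, body[4 + n:]
```

For example, `pack_datagram(3, [5, 2], b"hello")` produces a 16-byte datagram: 9 bytes of fixed fields, 2 node IDs, and 5 data bytes.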

Datagram messages will have the node ID (controller board ID) of the source node as part of the message’s arbitration ID, so the source of each message is identifiable. Thus multiple datagrams from different sources may be sent at the same time. Since the bootloader protocol is operating under a master-slave command-based architecture, the controller boards need only store and take action on messages from the client (with special ID 0), while the client must store datagrams from every controller board.

...

This scheme has the disadvantage that the node ID is at the beginning, so bootloader datagrams from controller boards aren’t given a higher priority in general. However, the client (which is the only party broadcasting extremely long and important datagrams like flashing content) has the special node ID 0, so the client’s non-starting datagram messages have the highest priority on the bus (all zeros) while the client’s starting messages have close to it - in our setup, bested only by the BPS heartbeat.

We will have to work around an x86 thing: see line 262 in x86/can_hw.c.

Under this scheme, code flashed via the bootloader can coexist with code flashed the traditional way, but the scheme does require that we reflash all boards so that every node on the network is aware of and at least ignores bootloader messages.

...

One other topic: we should have a way to jump from the application code back to the bootloader via a CAN message to run more commands. This is a peripheral feature since we can just power cycle the system. We can either do it upon receipt of any bootloader datagram start message (and pass the start message back to the bootloader), or we can do it with a normal CAN message with a handler pre-registered. In any case, we’d have to initialize CAN in smoke tests and small projects in order for them to be accessible via this method.

A way to do this without starting up into the bootloader on initialization is to use two boot banks: https://www.st.com/resource/en/application_note/dm00230416-onthefly-firmware-update-for-dual-bank-stm32-microcontrollers-stmicroelectronics.pdf

Client and API

The core of the client should be implemented as a modular Python script so that it can be deployed not just on x86 but also on e.g. a Raspberry Pi in the car to enable over-the-air firmware updates. The make interface to the bootloader client can then simply shell out to the Python script.

...

Version	Old Version 2	New Version Current
Changes made by	Ryan Dancy	Ryan Dancy
Saved on	Jan 19, 2021	Jun 12, 2021
