2021-11-17 Firmware design meeting
Let’s build firmware that works.
Goals:
Make it easy to test and validate.
Build safe and working firmware.
Topics for today: FreeRTOS, redesigning CAN + build system + testing
FreeRTOS
Core library status with FreeRTOS.
Library | Redesigns needed? |
---|---|
GPIO, bcd/chip_id/other atomic subroutines | Not at all |
Soft timers, delay (maybe), wait, FIFO | FreeRTOS handles it for us |
GPIO interrupts | Minimal |
SPI, I2C, ADC (easier), PWM, watchdog | Change to interrupt-driven operation and use FreeRTOS locking |
CAN | Extensive, complete redesign |
Initial testability improvements: SPI, I2C should be updated to not use mocking.
We want to scrap the event model and use tasks instead. (Could have a different form of event model where tasks subscribe to be woken on events → FreeRTOS has an event model, look into this!)
FSMs:
Mapping states to tasks, or mapping FSMs to tasks. One FSM could correspond to one task. (Could this be opt-in?)
How to signal a task to change the current state? (Queue-based, wrapped up in an FSM library method?)
Dedicated senders and receivers for each task.
We should support more data buffering → tasks should buffer their own data rather than reading shared data, which can get corrupted. (FreeRTOS queues copy items by value, so this comes for free unless the queued item is itself a pointer.)
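A minimal sketch of the queue-based state change idea, written in plain C so it runs on a host. The states, events, and function names (`fsm_post`, `fsm_process`) are hypothetical and not from the current FSM library. The small FIFO stands in for a FreeRTOS queue; on target, the post/process pair would presumably map onto `xQueueSend()` and `xQueueReceive()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical states and events, for illustration only.
typedef enum { STATE_IDLE, STATE_PRECHARGE, STATE_DRIVE, NUM_STATES } State;
typedef enum { EVENT_START, EVENT_PRECHARGE_DONE, EVENT_FAULT, NUM_EVENTS } Event;

#define EVENT_QUEUE_LEN 8

// Fixed-size FIFO that stores events by value (stand-in for a FreeRTOS queue).
typedef struct {
  Event items[EVENT_QUEUE_LEN];
  size_t head;
  size_t count;
} EventQueue;

typedef struct {
  State state;
  EventQueue inbox;  // only the owning task reads from this
} Fsm;

// Transition table: s_next[state][event] gives the next state.
static const State s_next[NUM_STATES][NUM_EVENTS] = {
  [STATE_IDLE]      = { STATE_PRECHARGE, STATE_IDLE,  STATE_IDLE },
  [STATE_PRECHARGE] = { STATE_PRECHARGE, STATE_DRIVE, STATE_IDLE },
  [STATE_DRIVE]     = { STATE_DRIVE,     STATE_DRIVE, STATE_IDLE },
};

void fsm_init(Fsm *fsm) {
  assert(fsm != NULL);
  fsm->state = STATE_IDLE;
  fsm->inbox.head = 0;
  fsm->inbox.count = 0;
}

// Called by sender tasks. Copies the event into the inbox; false if it is full.
bool fsm_post(Fsm *fsm, Event event) {
  assert(fsm != NULL && event < NUM_EVENTS);
  EventQueue *q = &fsm->inbox;
  if (q->count == EVENT_QUEUE_LEN) {
    return false;
  }
  q->items[(q->head + q->count) % EVENT_QUEUE_LEN] = event;
  q->count++;
  return true;
}

// Called by the owning task. Applies every queued event in FIFO order.
void fsm_process(Fsm *fsm) {
  assert(fsm != NULL);
  EventQueue *q = &fsm->inbox;
  while (q->count > 0) {
    Event event = q->items[q->head];
    q->head = (q->head + 1) % EVENT_QUEUE_LEN;
    q->count--;
    fsm->state = s_next[fsm->state][event];
  }
}
```

Only the owning task ever writes `state`, so other tasks cannot corrupt it. They can only request a change by posting an event. This host version has no locking, so it is only safe single-threaded.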
Static memory allocation
Each task has a pre-allocated stack (allocated by FreeRTOS). Allocate everything at the start.
We can support dynamic allocation, but static allocation lets us know what our footprint is, and is also MISRA-compliant.
FreeRTOS can detect if a task exceeds its static memory allotment. Dynamic allocation with FreeRTOS is a bit safer than C malloc(); there are several heap schemes to choose from (based on object pools).
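For reference, the static-only setup would probably come down to a few `FreeRTOSConfig.h` options. The macro names are standard FreeRTOS settings; the values are illustrative and were not decided in this meeting.

```c
// FreeRTOSConfig.h (fragment). Values are illustrative, not agreed settings.

// Enable the *Static() creation functions, e.g. xTaskCreateStatic().
#define configSUPPORT_STATIC_ALLOCATION   1

// Compile out the heap-backed creation functions so nothing allocates at runtime.
#define configSUPPORT_DYNAMIC_ALLOCATION  0

// Check for stack overflow on context switch. Method 2 checks a known
// fill pattern at the end of each task's stack.
#define configCHECK_FOR_STACK_OVERFLOW    2
```

With overflow checking enabled, the application has to supply `vApplicationStackOverflowHook()`. With static allocation, it typically also has to supply the idle task's memory through `vApplicationGetIdleTaskMemory()`.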
CAN
Current implementation: x86 CAN attempts to mimic hardware and uses RX and TX threads communicating with sockets.
CAN_TRANSMIT_* emits a TX event; CAN puts it on the TX queue for the TX thread, which transmits it. The RX thread does the same thing in reverse.
Will have to redesign almost entirely from scratch (other than hardware).
CAN FD → a feature the new implementation might support.
We should make sure to support extended CAN IDs.
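A sketch of what supporting extended IDs might look like at the type level. Standard CAN identifiers are 11 bits and extended identifiers are 29 bits. The `CanId` and `CanMessage` types and the helper names are hypothetical, not the existing CAN library's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CAN_STD_ID_MAX 0x7FFu       // 11-bit standard identifier
#define CAN_EXT_ID_MAX 0x1FFFFFFFu  // 29-bit extended identifier

// Hypothetical ID type that carries its own format flag.
typedef struct {
  uint32_t id;
  bool extended;
} CanId;

// Hypothetical message type.
typedef struct {
  CanId id;
  uint8_t dlc;
  uint8_t data[8];  // classic CAN payload; CAN FD would need up to 64 bytes
} CanMessage;

// True if the raw value fits in the identifier format it claims to use.
bool can_id_is_valid(CanId id) {
  return id.id <= (id.extended ? CAN_EXT_ID_MAX : CAN_STD_ID_MAX);
}

// Builds an ID and rejects out-of-range values early in debug builds.
CanId can_id_make(uint32_t raw, bool extended) {
  CanId id = { .id = raw, .extended = extended };
  assert(can_id_is_valid(id));
  return id;
}
```

Keeping the flag next to the ID means a value like 0x123 is never ambiguous between the two formats.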
Design:
Might consider having CAN messages sent and received on fixed intervals, rather than in real time and with interrupts.
Don’t want low-priority CAN messages constantly interrupting.
Might want to unpack CAN messages all at once and not right when receiving.
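One possible shape for the fixed-interval idea: the receive interrupt only copies raw frames into a buffer, and a periodic task drains and unpacks everything at once. This is a host-runnable sketch with hypothetical names (`RxBuffer`, `rx_buffer_push`, `rx_buffer_drain`). On target, the buffer would need ISR-safe access, for example a FreeRTOS queue filled with `xQueueSendFromISR()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUFFER_LEN 16

typedef struct {
  uint32_t id;
  uint8_t dlc;
  uint8_t data[8];
} RawFrame;

typedef struct {
  RawFrame frames[RX_BUFFER_LEN];
  size_t head;
  size_t count;
  uint32_t dropped;  // frames lost because the buffer was full
} RxBuffer;

// Would run in the RX interrupt: copy the frame and return, no unpacking.
void rx_buffer_push(RxBuffer *buf, const RawFrame *frame) {
  assert(buf != NULL && frame != NULL);
  if (buf->count == RX_BUFFER_LEN) {
    buf->dropped++;
    return;
  }
  buf->frames[(buf->head + buf->count) % RX_BUFFER_LEN] = *frame;
  buf->count++;
}

typedef void (*FrameHandler)(const RawFrame *frame, void *context);

// Would run from a periodic task: handle every frame received since the
// last cycle, oldest first. Returns the number of frames handled.
size_t rx_buffer_drain(RxBuffer *buf, FrameHandler handler, void *context) {
  assert(buf != NULL && handler != NULL);
  size_t handled = 0;
  while (buf->count > 0) {
    handler(&buf->frames[buf->head], context);
    buf->head = (buf->head + 1) % RX_BUFFER_LEN;
    buf->count--;
    handled++;
  }
  return handled;
}

// Example handler: counts frames. A real one would unpack signals by ID.
void count_frame(const RawFrame *frame, void *context) {
  (void)frame;
  (*(size_t *)context)++;
}
```

The `dropped` counter makes the tradeoff visible: a longer interval means fewer interruptions but needs a deeper buffer to avoid losing frames.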
Testability (of all libraries)
Mocking: should go. For example, it doesn’t simulate SPI/I2C/etc. correctly.
More strict unit test structure (coverage?).
Simulation: properly model the SPI/I2C protocols on x86 for testing. Could use Proteus/Simulink/etc, or use the embedded firmware more directly.
Validation: the DRI (person responsible) for a project should be the one validating it, so they keep validation in mind, as much as possible. (Will be more possible with COVID.)
We should have a model of how validation and unit testing should work. Makes peer review easier, so we can do more of that.
E.g.: FSM-like structure. Given, when, then.
Definition of done includes a validation plan and a smoke test plan.
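To make the simulation and given/when/then points concrete, here is a rough sketch of a test written against a simulated SPI device instead of a mock. The device protocol (command byte with a read flag, then data bytes) and all names are made up for illustration; a real simulation would follow the actual part's datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_NUM_REGS 8
#define SIM_READ_FLAG 0x80u

// Hypothetical SPI device: the first byte after chip select is a command
// (bit 7 = read, low bits = register address), following bytes are data.
typedef struct {
  uint8_t regs[SIM_NUM_REGS];
  uint8_t command;
  size_t bytes_seen;  // bytes exchanged since chip select was asserted
} SimSpiDevice;

// Models chip select being asserted: starts a new transaction.
void sim_spi_select(SimSpiDevice *dev) {
  assert(dev != NULL);
  dev->bytes_seen = 0;
}

// Full-duplex exchange of one byte: every byte clocked out by the
// controller clocks one byte back in.
uint8_t sim_spi_exchange(SimSpiDevice *dev, uint8_t tx) {
  assert(dev != NULL);
  uint8_t rx = 0xFF;  // returned when the device has nothing to send
  if (dev->bytes_seen == 0) {
    dev->command = tx;
  } else {
    // The register address auto-increments across a burst and wraps.
    size_t reg = ((dev->command & 0x7Fu) + dev->bytes_seen - 1) % SIM_NUM_REGS;
    if (dev->command & SIM_READ_FLAG) {
      rx = dev->regs[reg];
    } else {
      dev->regs[reg] = tx;
    }
  }
  dev->bytes_seen++;
  return rx;
}

// Test in given/when/then form.
void test_register_write_then_read(void) {
  // Given a freshly reset device
  SimSpiDevice dev = { 0 };

  // When the driver writes 0x5A to register 3 and reads it back
  sim_spi_select(&dev);
  sim_spi_exchange(&dev, 0x03);
  sim_spi_exchange(&dev, 0x5A);
  sim_spi_select(&dev);
  sim_spi_exchange(&dev, SIM_READ_FLAG | 0x03);
  uint8_t value = sim_spi_exchange(&dev, 0x00);

  // Then the read returns the value that was written
  assert(value == 0x5A);
}
```

Unlike a mock, the simulated device keeps state across transactions, so a test exercises the byte-level protocol rather than just checking that a function was called.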
Timelines (and other considerations)
We can replace the build system in parallel with other things.
For the dev environment, look into Docker → if we’re changing the CAN system then the vcan issue might not be as much of an issue.
Change the “x86” terminology…
Timelines
Working backwards: validation complete November 2023. Based on MSXIV pace, full validation would probably take 8 months, but it could be done in 3-4 months with more people, assuming hardware is available. (Removing the project/validation separation should reduce timelines.)
We should focus on system design for a year or so to create a robust system, then new members can write projects. The goal is to have a complete system in Sept 2022.
Hopeful to finish bootloader/etc by the new year.
Allocated teams of people: one for the CAN overhaul, one for core libraries.
This isn’t a lot of time!