Introduction to ARM Architecture

We use ARM for all of our microcontrollers as of 15/12/2024. ARM chips are becoming very prominent in our laptops, phones, and other devices. This page will cover a basic introduction to ARM chips with a heavy focus on the underlying chip design. It will not cover multi-core systems as we only use single-core systems.

What is ARM? ARM is a family of processor architectures that chip vendors license. Cortex processors, such as the Cortex-M33 and Cortex-M0, are cores that implement those architectures.

Why know this? A good firmware engineer knows their hardware. This means knowing how our microcontroller operates at a fundamental level. I’ve also been asked what APB is by new members working on drivers.

 

WARNING: I don’t do chip design; this is ARM architecture from the perspective of a firmware dude who reads ARM documentation.

ARMv7

ARMv7: 32-bit processor architecture designed for low-power electronics. Our STM32L433CCU6s use a Cortex-M4 core, which implements ARMv7E-M, a variant of the ARMv7-M profile designed specifically for embedded systems. ARMv7-A and ARMv7-R run the 32-bit ARM and Thumb instruction sets (what ARMv8 later named AArch32); ARMv7-M runs Thumb only.

Split into the following:

ARMv7-A: Application profile used for complex systems, for example ones with an OS, virtual memory, multi-core processors, etc. It has a 32-bit virtual address space, so a program can address at most 4GB at a time. Runs in 32-bit mode.

ARMv7-R: Real-time profile used in controls/industrial settings. Very deterministic and low-latency. Strictly 32-bit.

ARMv7-M: Microcontroller profile for low-power embedded applications. Uses a simplified instruction set and focuses on efficiency. Strictly 32-bit.

ARMv8 (Optional)

ARMv8: 64-bit processor architecture that supports both 32-bit and 64-bit modes. A core operates in either AArch32 or AArch64, and which one is configurable, typically during startup. Used in the Raspberry Pi 3 and later. These chips also have security features like TrustZone, which is new to the microcontroller profile in ARMv8-M. AArch64 also roughly doubles the general-purpose registers: 31 64-bit registers versus 16 32-bit registers in AArch32!

Split into the following:

ARMv8-A: 64-bit application profile that supports both 32-bit and 64-bit execution. This is what is used in the Raspberry Pi 3/4/5. Supports OS, virtual memory, multi-core processors, etc. Also backward compatible with ARMv7 software. It operates in a 64-bit address space, so in theory we have access to 2^64 unique addresses (about 18.4 exabytes, where 1 exabyte is 1 billion gigabytes). In practice, hardware limitations and software limitations (OS, MMU) reduce the usable memory.

ARMv8-M: Microcontroller profile using ARMv8. It has TrustZone security and supports only 32-bit environments, so it does not have 64-bit instructions or a 64-bit memory space.
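The 4GB and 18.4 exabyte figures above both come from raising 2 to the address width. Here is a minimal sketch of that arithmetic in Python, which this page uses purely as a calculator. The function and constant names are made up for illustration.

```python
def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits

GIB = 2 ** 30  # binary gigabyte, the unit behind the usual "4GB" figure
EB = 10 ** 18  # decimal exabyte, i.e. 1 billion gigabytes of 10^9 bytes

# 32-bit address space, as in ARMv7 and ARMv8-M
print(address_space_bytes(32) // GIB, "GiB")

# 64-bit address space, the theoretical ARMv8-A limit
print(round(address_space_bytes(64) / EB, 1), "EB")
```

This is only the theoretical ceiling. How many address bits a given chip really wires up is an implementation detail this sketch does not model.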

AArch64

ARM Assembly

Advanced Microcontroller Bus Architecture (AMBA)

This is a set of specifications provided by ARM for on-chip communication protocols. These standards are used in System-on-Chips and in our STM32s. The AMBA protocols are implemented in hardware, designed in RTL languages such as Verilog or VHDL. When working at the register level, it is good to be aware of these buses, as it helps us navigate the documentation.

There are 3 main buses: APB, AHB, and AXI (described in the next sections). The key differences between them are bandwidth, performance, features, power consumption, and complexity. Here is a simple table comparing all of them:

 

|  | APB | AHB | AXI |
|---|---|---|---|
| Bandwidth/Data rates | Up to 32-bit. Clock frequency ~40-100MHz depending on the selected IP. Our STM32L433CCU6 is configured for 80MHz. (LOWEST) | Typically 32-bit on microcontrollers; the specification allows wider data buses, up to 1024-bit. Clock frequency ~100-250MHz depending on the selected IP. | Wide data buses, up to 1024-bit. Clock frequency ~100-400MHz depending on the selected IP. (HIGHEST) |
| Performance | No pipelining. Every transfer takes at least 2 cycles. Synchronous. Worst performance. | Pipelined. Middle child for performance. | Pipelined. Best performance. |
| Power consumption (hard to estimate wattage for varying applications) | Optimized for low power consumption. | Middle child for power consumption. | Highest power consumption. |
| Complexity | Most simple | Middle child | Most complex |
| Typical applications | Peripheral interfaces (I2C, SPI, UART), sensors, audio codecs. We encounter this bus the most in driver development. | Microcontrollers, memory access, DMA control. This is used in all STM32s. | Video rendering, audio processing, networking, graphics. This bus is for heavy data loads. |
| Channels | 2 channels: address channel, data channel | 2 channels: address/data channel, control channel | 5 channels: write address, write data, write response, read address, read data |

Documentation – Arm Developer
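As an example of how bus awareness helps when reading documentation, here is a rough sketch that maps a peripheral register address to the bus it sits on. The base addresses are assumptions recalled from the STM32L4 CMSIS device header and must be checked against the reference manual for our exact part. The lookup is deliberately coarse: it ignores the upper bound of each bus window. `bus_for_address` and the table names are made up for this page.

```python
# ASSUMPTION: bus base addresses as recalled from the STM32L4 CMSIS header.
# Verify them against the reference manual before relying on this.
PERIPH_START = 0x40000000  # start of the Cortex-M peripheral region
PERIPH_END = 0x60000000    # first address past the peripheral region

BUS_BASES = [  # ordered from highest base address to lowest
    ("AHB2", 0x48000000),
    ("AHB1", 0x40020000),
    ("APB2", 0x40010000),
    ("APB1", 0x40000000),
]

def bus_for_address(address: int):
    """Return the name of the bus whose window starts at or below the address.

    Coarse by design: gaps between bus windows are not detected.
    Returns None for addresses outside the peripheral region.
    """
    if not PERIPH_START <= address < PERIPH_END:
        return None
    for name, base in BUS_BASES:
        if address >= base:
            return name
    return None

# 0x40004400 is USART2 in the assumed memory map
print(bus_for_address(0x40004400))
```

The same idea applies when reading a driver: the peripheral's base address tells you which bus clock enable bit and which clock frequency apply to it.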

Advanced Peripheral Bus (APB)

APB uses 2 channels (groups of signals) for all data transfers. They are as follows:

  • Address channel - Transmits the memory address for the data transfer (PADDR)

  • Data channel - Transmits the actual data (PWDATA for writes, PRDATA for reads)
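The "at least 2 cycles" figure for APB comes from its two phases: a setup cycle followed by an access cycle. Below is a small model of one transfer, assuming the IDLE/SETUP/ACCESS states from the AMBA APB specification and the PREADY wait-state mechanism that APB3 and later provide. `apb_transfer_states` is a name invented for this sketch, not part of any library.

```python
from enum import Enum

class ApbState(Enum):
    IDLE = "IDLE"      # no transfer in progress
    SETUP = "SETUP"    # PSEL high, PENABLE low, address and write data driven
    ACCESS = "ACCESS"  # PENABLE high, transfer completes when PREADY is high

def apb_transfer_states(wait_states: int = 0) -> list:
    """Return the bus state on each clock cycle of a single APB transfer.

    wait_states is the number of extra ACCESS cycles during which the
    peripheral holds PREADY low (an assumption of this model).
    """
    if wait_states < 0:
        raise ValueError("wait_states must be >= 0")
    # One SETUP cycle, then one ACCESS cycle plus any wait states
    return [ApbState.SETUP] + [ApbState.ACCESS] * (1 + wait_states)

# A transfer with no wait states occupies exactly two cycles
print(len(apb_transfer_states()))
# A slow peripheral that inserts 3 wait states stretches it to five
print(len(apb_transfer_states(wait_states=3)))
```

Because the next transfer cannot start its setup cycle until the current access finishes, nothing overlaps. That is what "no pipelining" means here.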

Advanced High-Performance Bus (AHB)

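AHB uses 2 channels (groups of signals) for all data transfers. They are as follows:

  • Address/Data channel - Transmits the memory address (HADDR) and the data (HWDATA for writes, HRDATA for reads). The address and data phases are pipelined: the address of the next transfer goes out while the data of the current transfer is still moving

  • Control channel - Transmits control signals such as read/write (HWRITE), burst info (HBURST), etc.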
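To see what pipelining buys AHB, here is a sketch of the overlap between address and data phases, assuming no wait states. Each cycle shows which transfer is in its address phase and which is in its data phase. `ahb_timeline` is a hypothetical helper, and the labels are placeholders.

```python
def ahb_timeline(transfers: list) -> list:
    """Return one (address_phase, data_phase) pair per clock cycle.

    Models a zero-wait-state pipeline: the data phase of a transfer
    happens one cycle after its address phase, overlapping the address
    phase of the following transfer.
    """
    if not transfers:
        return []
    timeline = []
    for cycle in range(len(transfers) + 1):
        addr = transfers[cycle] if cycle < len(transfers) else None
        data = transfers[cycle - 1] if cycle >= 1 else None
        timeline.append((addr, data))
    return timeline

for cycle, (addr, data) in enumerate(ahb_timeline(["A", "B", "C"])):
    print(cycle, "address:", addr, "data:", data)
```

In this model, N back-to-back transfers finish in N + 1 cycles. An unpipelined bus that needs two cycles per transfer, like APB, needs 2N.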

Advanced eXtensible Interface (AXI)

AXI uses 5 channels for all data transfer. They are as follows:

  • Write address channel - Transmits the memory address for write operations

  • Write data channel - Transmits the data for write operations

  • Write response channel - Transmits write response signals (Acknowledgements)

  • Read address channel - Transmits the memory address for read operations

  • Read data channel - Transmits the read data (slave → master)
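All five channels share the same handshake: the sender raises VALID, the receiver raises READY, and a transfer happens on a clock cycle where both are high. Here is a minimal sketch of that rule using per-cycle lists of 0/1 values. The helper name and the sample waveforms are invented for illustration.

```python
# Standard short names for the five AXI channels
WRITE_CHANNELS = ("AW", "W", "B")  # write address, write data, write response
READ_CHANNELS = ("AR", "R")        # read address, read data

def handshake_cycles(valid: list, ready: list) -> list:
    """Return the cycle indices where a transfer completes on one channel.

    A transfer completes on every cycle where VALID and READY are both high.
    """
    return [i for i, (v, r) in enumerate(zip(valid, ready)) if v and r]

# Sample waveform: the receiver is not ready on cycle 0,
# and the sender has nothing to send on cycle 3.
valid = [1, 1, 1, 0, 1]
ready = [0, 1, 1, 1, 1]
print(handshake_cycles(valid, ready))
```

Since each channel handshakes independently, a write address can be accepted while read data from an earlier request is still streaming back. That independence is a big part of why AXI outperforms the other two buses.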

Bursts, Transfers and Transactions

Before going into more detail, it is crucial to understand these three terms.

A transfer is a single unit of data exchange. This consists of a handshake between the two devices.

A burst is a sequence of transfers counted as 1 unit. For example, a burst can be 4 transfers long, which gives a burst length of 4.

A transaction is some number of bursts. If you have a transaction of 256 bytes, a burst length of 4 transfers, and a transfer size of 4 bytes, this means you have:

256 bytes / 4 bytes per transfer = 64 total transfers

256 bytes / (4 bytes per transfer * 4 transfers per burst) = 16 bursts
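The same arithmetic as a small helper, in case you want to plug in other sizes. This is a sketch using the definitions on this page. The function name is made up, and it rounds up when the sizes do not divide evenly, which is an assumption about how a partial final burst would be counted.

```python
def split_transaction(total_bytes: int, bytes_per_transfer: int,
                      transfers_per_burst: int) -> tuple:
    """Return (total_transfers, total_bursts) for one transaction.

    Uses ceiling division so a partial final transfer or burst still counts.
    """
    if min(total_bytes, bytes_per_transfer, transfers_per_burst) <= 0:
        raise ValueError("all sizes must be positive")
    transfers = -(-total_bytes // bytes_per_transfer)  # ceiling division
    bursts = -(-transfers // transfers_per_burst)
    return transfers, bursts

# The example from the text: 256 bytes, 4-byte transfers, 4 transfers per burst
transfers, bursts = split_transaction(256, 4, 4)
print(transfers, "transfers in", bursts, "bursts")
```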

Here is my crazy drawing

image-20241218-004336.png

Direct Memory Access (DMA)

Memory Barriers

Cache

Cache Coherency (More important in multi-core systems)

Debugging with JTAG/SWD

 

 

Resources:
AMBA Protocol: APB/AHB/AXI