Linker scripts

As we move into MS16, we may begin to use different STM32s, so understanding linker scripts is essential for setting up flashing.

Introduction

Linking is the final step in building a program. The linker takes the object files produced by the compiler and assembler, merges them into a single program, and fills in all addresses so that everything ends up in the right place.

Before linking, the compiler leaves placeholder addresses because it does not know where the instructions will sit in the final program. It also does not know the addresses of external symbols (e.g. functions and variables from dependencies and libraries).

The linker relies on the linker script, which acts as a blueprint for organizing the program.

Basic Structure

A linker script consists of four main parts:

  1. Memory layout (where and what type of memory is available, e.g. flash and RAM)

  2. Section definitions (which parts of a program should go where)

  3. Options (commands specifying the architecture and entry point)

  4. Symbols (variables defined during linking that the program can reference)
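To show how the four parts fit together, here is a minimal skeleton. It is only a sketch: the region sizes, the entry point name Reset_Handler, and the symbol name _estack are assumptions chosen for illustration and should be replaced with the values for your device and startup code.

```ld
/* Options: tell the linker where execution starts (name is an assumption) */
ENTRY(Reset_Handler)

/* Memory layout: what memory exists and where (sizes are placeholders) */
MEMORY
{
  FLASH (rx)  : ORIGIN = 0x08000000, LENGTH = 64K
  RAM   (rwx) : ORIGIN = 0x20000000, LENGTH = 20K
}

/* Symbols: a value the program can reference, here the top of RAM */
_estack = ORIGIN(RAM) + LENGTH(RAM);

/* Section definitions: which parts of the program go where */
SECTIONS
{
  .text :
  {
    *(.text)
    *(.text.*)
  } > FLASH
}
```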

Memory Layout

To allocate memory for a program, the linker must know what memory is available and where. The memory layout section specifies the origin and length of each memory region, as well as its read/write permissions.

Each region's name is arbitrary, e.g. BOOTLOADER, APPLICATION, RAM.

The name is followed by the region's attributes. Memory can be writable (w), readable (r), or executable (x). Flash memory is often (rx) and RAM is often (rwx).

The origin is the region's starting address. On STM32 devices, flash is typically mapped at 0x08000000 and SRAM at 0x20000000.

The length is the size of the memory region, usually given in your device's datasheet.
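As an illustration, a MEMORY block using the region names mentioned above could look like the following. The lengths and the split between bootloader and application are made-up values, not taken from any particular chip; check the datasheet for the real ones.

```ld
MEMORY
{
  /* name      (attributes) : ORIGIN = start address, LENGTH = size */
  BOOTLOADER  (rx)  : ORIGIN = 0x08000000, LENGTH = 16K
  APPLICATION (rx)  : ORIGIN = 0x08004000, LENGTH = 48K  /* starts right after the 16K bootloader */
  RAM         (rwx) : ORIGIN = 0x20000000, LENGTH = 20K
}
```

Splitting flash into separate regions like this lets the section definitions place bootloader and application code independently with > BOOTLOADER and > APPLICATION.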

Section Definitions

Although there are no strict rules about how to divide a program into sections in the linker script, doing so is recommended. It simplifies debugging because related values are grouped together. It also helps use memory efficiently, which is essential in embedded systems.

The most common sections are:

  1. .text = executable code

  2. .bss = uninitialized data that should be zeroed at start-up

  3. .data = initialized data that is stored in RAM during run-time

  4. .isr_vector = the interrupt vector table (used by the NVIC)

  5. .rodata = read-only data, e.g. constants

 

In embedded systems, it is crucial to mark the start and end addresses of each section, because the startup code needs them. In almost all examples you will see start and end boundaries defined with assignments of the form _START_VARIABLE_ = .;
The . is the location counter, which holds the current memory address.

The most important start/end variables in a linker script are:

  1. _etext = Specifies the end of executable code

  2. _sdata = The starting address of initialized variables in the RAM

  3. _edata = The ending address of initialized variables in the RAM

  4. _sbss = The starting address of uninitialized variables in the RAM

  5. _ebss = The ending address of uninitialized variables in the RAM

 

You will also see ALIGN(4) used frequently. It advances the current address to the next 4-byte (32-bit) boundary, so that section boundaries are never misaligned. The boundary is 32 bits because STM32s are 32-bit devices.
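The boundary symbols and ALIGN(4) are usually combined as in the sketch below. The section name .example and the symbols _sexample and _eexample are hypothetical, and the fragment assumes a RAM region has been declared in the MEMORY block.

```ld
SECTIONS
{
  .example :
  {
    . = ALIGN(4);      /* round the location counter up to a 4-byte boundary */
    _sexample = .;     /* record the start address of the section */
    *(.example)        /* place the matching input sections here */
    . = ALIGN(4);      /* pad the end to a 4-byte boundary as well */
    _eexample = .;     /* record the end address of the section */
  } > RAM
}
```

For example, if the location counter is at 0x20000005, ALIGN(4) moves it to 0x20000008. An address that is already on a boundary, such as 0x20000008, is left unchanged.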

.text

This section contains all executable instructions and is typically stored in flash memory.

We use *(.text) and *(.text.*) to capture all executable code.

We also store all constants (.rodata) in the .text section.

*(.init) and *(.fini) are special sections that hold code run before and after the main() function. Anything that runs prior to main() is placed in .init, and anything that runs after main() returns is placed in .fini.

*(.hardfault) contains the hard fault handler code, which runs on unrecoverable errors such as an invalid memory access (comparable to a segmentation fault on a desktop system).

*(.glue_7) and *(.glue_7t) hold the interworking veneers that let ARM code call Thumb code (.glue_7) and Thumb code call ARM code (.glue_7t). These veneers are generated by the linker, not the compiler. Cortex-M cores execute only Thumb code, so on STM32 these sections are normally empty.
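Putting the pieces of this section together, a .text definition might look like the sketch below. It is a fragment that belongs inside SECTIONS { ... } and assumes a region named FLASH exists in the MEMORY block. The .hardfault input section is a project-specific name, not a standard one, and the use of KEEP on .init and .fini is a common convention rather than a requirement.

```ld
.text :
{
  . = ALIGN(4);
  *(.text)            /* executable code */
  *(.text.*)
  *(.rodata)          /* constants, stored alongside the code */
  *(.rodata.*)
  *(.glue_7)          /* ARM-to-Thumb veneers */
  *(.glue_7t)         /* Thumb-to-ARM veneers */
  KEEP(*(.init))      /* code that runs before main() */
  KEEP(*(.fini))      /* code that runs after main() returns */
  *(.hardfault)       /* hard fault handler (project-specific section) */
  . = ALIGN(4);
  _etext = .;         /* end of executable code */
} > FLASH
```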

.bss

This section holds uninitialized static and global variables, which are zeroed at start-up.

*(.bss) and *(.bss.*) are used to capture all uninitialized variables. *(COMMON) captures common symbols: uninitialized globals that are defined in multiple files and merged by the linker into a single definition.
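A minimal .bss definition following the description above could look like this. It is a fragment for use inside SECTIONS { ... } and assumes a region named RAM in the MEMORY block.

```ld
.bss :
{
  . = ALIGN(4);
  _sbss = .;          /* start of the area the startup code must zero */
  *(.bss)
  *(.bss.*)
  *(COMMON)           /* common symbols merged from multiple files */
  . = ALIGN(4);
  _ebss = .;          /* end of the area the startup code must zero */
} > RAM
```

The startup code can then clear everything from _sbss up to (but not including) _ebss before calling main().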

.data

This section contains static variables that have an initial value at start-up. Since RAM contents are lost when the device is power-cycled, the initial values of the .data section are stored in flash memory. During boot, the reset handler copies the .data section into RAM before the main() function is called.

For this to be possible, every section in the linker script has two addresses, the load address (LMA) and the virtual address (VMA). For simplicity, think of the LMA as the address where the section is stored at rest, and the VMA as the address it occupies during execution.

When the .data section is declared with AT (_sidata), _sidata is the LMA, which lies in flash memory. The VMA is in RAM, where the variables are copied to.
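The LMA/VMA split can be sketched as follows. This is a fragment for use inside SECTIONS { ... }. It assumes a region named RAM in the MEMORY block, and it assumes the fragment is placed directly after the last section that goes to flash, so that the location counter still points into flash when _sidata is assigned.

```ld
_sidata = .;             /* LMA: where the initial values are stored in flash */

.data : AT (_sidata)     /* load the section contents at _sidata */
{
  . = ALIGN(4);
  _sdata = .;            /* VMA: start of .data in RAM */
  *(.data)
  *(.data.*)
  . = ALIGN(4);
  _edata = .;            /* VMA: end of .data in RAM */
} > RAM                  /* run-time addresses are assigned from RAM */
```

With these three symbols, the reset handler can copy _edata - _sdata bytes from _sidata (flash) to _sdata (RAM).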

.isr_vector

The .isr_vector section holds the interrupt vector table, which is essential in embedded devices. It maps every interrupt to its handler and is used by the NVIC. The KEEP keyword ensures that .isr_vector is not discarded by the linker, even if nothing appears to reference it. This is similar to the volatile keyword in the sense that it prevents something from being optimized out.
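A typical definition is sketched below. It is a fragment for use inside SECTIONS { ... } and assumes a region named FLASH in the MEMORY block. Listing it as the first section is an assumption about the desired layout: it puts the table at the very start of flash, which is where Cortex-M cores look for it after reset by default.

```ld
.isr_vector :
{
  . = ALIGN(4);
  KEEP(*(.isr_vector))   /* never discard the vector table, even if it looks unused */
  . = ALIGN(4);
} > FLASH
```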


https://interrupt.memfault.com/blog/how-to-write-linker-scripts-for-firmware