Ubuntu Manpage: mem_sections

NAME

       mem_sections - Memory Sections

       Sections are used to organize the code and data of a program at the binary level.

       The (compiler-generated) assembly code assigns code, data and other entities like debug information to
       so-called input sections. These sections serve as input to the linker, which bundles similar sections
       together to output sections like .text and .data according to rules defined in the linker description
       file.

       The final ELF binary is then used by programming tools like avrdude, simulators, debuggers and other
       programs, for example programs from the GNU Binutils family like avr-size, avr-objdump and avr-readelf.

       Sections may have extra properties like section alignment, section flags, section type and rules to
       locate them or to assign them to memory regions.

       • Concepts

         • Named Sections

           • Section Flags

           • Section Type

           • Section Alignment

           • Subsections

         • Orphan Sections

         • LMA: Load Memory Address

         • VMA: Virtual Memory Address

       • The Linker Script: Building Blocks

         • Input Sections and Output Sections

         • Memory Regions

       • Output Sections of the Default Linker Script

         • .text

         • .data

         • .bss

         • .noinit

         • .rodata

         • .eeprom

         • .fuse, .lock and .signature

         • .note.gnu.avr.deviceinfo

       • Symbols in the Default Linker Script

       • Output Sections and Code Size

       • Using Sections

         • In C/C++ Code

         • In Assembly Code

Concepts

Named Sections
Named sections are sections that can be referred to by their name. The name and other properties can be
provided with the .section directive like in

.section name, "flags", @type

or with the .pushsection directive, which directs the assembler to assemble the following code into the
named section.

An example of a section that is not referred to by its name is the COMMON section. In order to put an
object in that section, special directives like .comm name,size or .lcomm name,size have to be used.

Directives like .text are basically the same as .section .text, where the assembler assumes appropriate
section flags and type; the same holds for the directives .data and .bss.

Section Flags
The section flags can be specified with the .section and .pushsection directives, see section type for an
example. Section flags of output sections can be specified in the linker description file, and the linker
implements heuristics to determine the section flags of an output section from the various input sections
that go into it.

Flag  Meaning
a     The section will be allocated, i.e. it occupies space on the target hardware
w     The section contains data that can be written at run-time. Sections that only
      contain read-only entities don't have the w flag set
x     The section contains executable code, though the section may also contain
      non-executable objects
M     A mergeable section
S     A string section
G     A section group, like used with comdat objects

The last three flags are listed for completeness. They are used by the compiler, for example for
header-only C++ modules and to ensure that multiple instantiations of the same template in different
compilation units occur at most once in the executable file.

Section Type
The section type can be specified with the .section and .pushsection directives, like in

.section .text.myfunc,"ax",@progbits
.pushsection ".data.myvar", "aw", "progbits"

At the ELF level, the section type is stored in the section header, for example Elf32_Shdr.sh_type = SHT_PROGBITS.

Type       Meaning
@progbits  The section contains data that will be loaded to the target, like objects in the
           .text and .data sections.
@nobits    The section does not contain data that needs to be transferred to the target
           device, like data in the .bss and .noinit sections. The section still occupies
           space on the target.
@note      The section is a note, like for example the .note.gnu.avr.deviceinfo section.

Section Alignment
The alignment of a section is the maximum over the alignments of the objects in the section.

Subsections
Subsections are compartments of named sections and are introduced with the .subsection directive.
Subsections are located in order of increasing index in their input section. The default subsection after
switching to a new section is subsection 0.

Note
A common misconception is that a section like .text.module.func is a subsection of .text.module.
This is not the case. These two sections are independent, and there is no subset relation. The
sections may have different flags and type, and they may be assigned to different output sections.

Orphan Sections
Orphan sections are sections that are not mentioned in the linker description file. When an input section
is an orphan, the GNU linker implicitly generates an output section of the same name. The linker
implements various heuristics to determine section flags, section type and location of orphaned
sections. One use of orphan sections is to locate code at a fixed address.

Like for any other output section, the start address can be specified by means of linking with
-Wl,--section-start,secname=address
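
As a minimal sketch of that use case in C, the function below is placed in a section named .fixed_code. The section name, the function name and the address used in the note that follows are made up for illustration. Nothing in the default linker script mentions .fixed_code, which is what makes it an orphan section.

```c
/* Hypothetical orphan section: ".fixed_code" does not appear in the
   default linker script, so the linker creates an output section of
   the same name for it. */
__attribute__((used, section(".fixed_code")))
int fixed_entry(void)
{
    return 42;
}
```

Linking with -Wl,--section-start,.fixed_code=0x1f00 would then request byte address 0x1f00 for fixed_entry, assuming that address range is not occupied by another output section.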

LMA: Load Memory Address
The LMA of an object is the address where a loader like avrdude puts the object when the binary is being
uploaded to the target device.

VMA: Virtual Memory Address
The VMA is the address of an object as used by the running program.

VMA and LMA may be different: Suppose a small ATmega8 program with executable code that extends from byte
address 0x0 to 0x20f, and one variable my_data in static storage. The default linker script puts the
content of the .data output section after the .text output section and into the text segment. The startup
code then copies my_data from its LMA location beginning at 0x210 to its VMA location beginning at
0x800060, because C/C++ requires that all data in static storage must have been initialized when main is
entered.

The internal SRAM of ATmega8 starts at RAM address 0x60, which is offset by 0x800000 in order to
linearize the address space (VMA 0x60 is a flash address). The AVR program only ever uses the lower 16
bits of VMAs in static storage so that the offset of 0x800000 is masked out. But code like 'LDI
r24,hh8(my_data)' actually sets R24 to 0x80 and reveals that my_data is an object located in RAM.
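
The address arithmetic above can be checked on any host with a few lines of C. This is only an illustration of the numbers in the example; the helper names vma_low16 and vma_hh8 are made up here and are not part of AVR-LibC.

```c
#include <stdint.h>

/* A linearized address as seen by the linker, e.g. 0x800060 for the
   RAM address 0x60. */
typedef uint32_t avr_vma_t;

/* What the running program uses: the lower 16 bits of the VMA. */
uint16_t vma_low16(avr_vma_t vma)
{
    return (uint16_t)(vma & 0xFFFFu);
}

/* Bits 16..23 of the VMA, which is the value the assembler operator
   hh8() extracts. */
uint8_t vma_hh8(avr_vma_t vma)
{
    return (uint8_t)((vma >> 16) & 0xFFu);
}
```

For the linearized address 0x800060 of my_data, vma_low16 yields 0x0060, the address the program actually uses, and vma_hh8 yields 0x80, the value that hh8(my_data) loads.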

The Linker Script: Building Blocks

       The linker description file is the central hub to channel functions and static storage objects of a
       program to the various memory spaces and address ranges of a device.

   Input Sections and Output Sections
       Input sections are sections that are inputs to the linker. Functions and static variables but also
       additional notes and debug information are assigned to different input sections by means of assembler
       directives like .section or .text. The linker takes all these sections and assigns them to output
       sections as specified in the linker script.

       Output sections are defined in the linker description file. Contrary to the unlimited number of input
       sections a program can come up with, there is only a handful of output sections like .text and .data,
       that roughly correspond to the memory spaces of the target device.

       One step in the final link is to locate the sections, that is the linker/locator determines at which
       memory location to put the output sections, and how to arrange the many input sections within their
       assigned output section. Locating means that the linker assigns Load Memory Addresses --- addresses as
       used by a loader like avrdude --- and Virtual Memory Addresses, which are the addresses as used by the
       running program.

       While it is possible to directly assign LMAs and VMAs to output sections in the linker script, the
       default linker scripts provided by Binutils assign memory regions (aka. memory segments) to the output
       sections. This has some advantages like a linker script that is easier to maintain. An output section
       can be assigned to more than one memory region. For example, non-zero data in static storage (.data) goes
       to

       1.  the data region (VMA), because such variables occupy RAM which has to be allocated

       2.  the text region (LMA), because the initializers for such data have to be kept in some non-volatile
           memory (program ROM), so that the startup code can initialize the variables to their expected
           initial values before main() is entered.

       The SECTIONS{} portion of a linker script models the input and output sections, and it assigns the
       output sections to the memory regions defined in the MEMORY{} part.

   Memory Regions
       The memory regions defined in the default linker script model and correspond to the different kinds of
       memories of a device.

       Region            Virtual Address (1)   Flags   Purpose
       text              0 (2)                 rx      Executable code, vector table, data in PROGMEM, __flash
                                                       and __memx, startup code, linker stubs, initializers
                                                       for .data
       data              0x800000 (2)          rw      Data in static storage
       rodata (3)        0xa00000 (2)          r       Read-only data in static storage
       eeprom            0x810000              rw      EEPROM data
       fuse              0x820000              rw      Fuse bytes
       lock              0x830000              rw      Lock bytes
       signature         0x840000              rw      Device signature
       user_signatures   0x850000              rw      User signature

       Notes

       1.  The VMAs for regions other than text are offset in order to linearize the non-linear memory address
           space of the AVR Harvard architecture. The target code only ever uses the lower 16 bits of the VMA to
           access objects in non-text regions.

       2.  The addresses for regions text, data and rodata are actually defined as symbols like
           __TEXT_REGION_ORIGIN__, so that they can be adjusted by means of, say
           -Wl,--defsym,__DATA_REGION_ORIGIN__=0x800060. Same applies for the lengths of all the regions, which
           is __NAME_REGION_LENGTH__ for region name.

       3.  The rodata region is only present in the avrxmega2_flmap and avrxmega4_flmap emulations, which is the
           case for Binutils since v2.42 for the AVR64 and AVR128 devices without -mrodata-in-ram.
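
       To make the region origins more tangible, here is a small host-side sketch that maps a linearized
       address to the name of the region whose default origin precedes it. The table simply restates the
       default origins listed above. The names avr_region and avr_region_of are hypothetical, and the
       sketch assumes the origins have not been changed with --defsym. It also ignores the region
       lengths, so an address beyond the end of a region is still attributed to that region.

```c
#include <stddef.h>
#include <stdint.h>

struct avr_region {
    const char *name;
    uint32_t    origin;   /* default origin as a linearized VMA */
};

/* Default origins, sorted by ascending address. */
static const struct avr_region avr_regions[] = {
    { "text",            0x000000 },
    { "data",            0x800000 },
    { "eeprom",          0x810000 },
    { "fuse",            0x820000 },
    { "lock",            0x830000 },
    { "signature",       0x840000 },
    { "user_signatures", 0x850000 },
    { "rodata",          0xa00000 },
};

/* Return the name of the region with the greatest origin that is not
   above vma. Region lengths are deliberately not checked. */
const char *avr_region_of(uint32_t vma)
{
    const char *name = avr_regions[0].name;
    for (size_t i = 0; i < sizeof avr_regions / sizeof avr_regions[0]; i++) {
        if (vma >= avr_regions[i].origin)
            name = avr_regions[i].name;
    }
    return name;
}
```

       With these defaults, 0x800060 is attributed to the data region and 0x210 to the text region.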

Output Sections of the Default Linker Script

       This section describes the various output sections defined in the default linker description files.

       Output                                                    Memory Region
       Section            Purpose                                LMA                    VMA
       .text              Executable code, data in progmem       text                   text
       .data              Non-zero data in static storage        text                   data
       .bss               Zero data in static storage            ---                    data
       .noinit            Non-initialized data in static         ---                    data
                          storage
       .rodata (1)        Read-only data in static storage       text                   LMA + offset (3)
       .rodata (2)        Read-only data in static storage       0x8000 * __flmap (4)   rodata
       .eeprom            Data in EEPROM                         Note (5)               eeprom
       .fuse              Fuse bytes                             fuse                   fuse
       .lock              Lock bytes                             lock                   lock
       .signature         Signature bytes                        signature              signature
       .user_signatures   User signature bytes                   user_signatures        user_signatures

       Notes

       1.  On avrxmega3 and avrtiny devices.

       2.  On AVR64 and AVR128 devices without -mrodata-in-ram.

       3.  With an offset __RODATA_PM_OFFSET__ of 0x4000 or 0x8000 depending on the device.

       4.  The value of symbol __flmap defaults to the last 32 KiB block of program memory, see the GCC v14
           release notes.

       5.  The LMA actually equals the VMA, but is unused. A flash loader like avrdude knows where to put the
           data.

   The .text Output Section
       The .text output section contains the actual machine instructions which make up the program, but also
       additional code like jump tables and lookup tables placed in program memory with the PROGMEM attribute.

       The .text output section contains the input sections described below. Input sections that are not used by
       the tools are omitted. A * wildcard stands for any sequence of characters, including empty ones, that are
       valid in a section name.

       .vectors
           The .vectors section contains the interrupt vector table, which consists of jumps to weakly defined
           labels: to __init for the first entry at index 0, and to __vector_N for the entry at index N ≥ 1. The
           default value for __vector_N is __bad_interrupt, which jumps to weakly defined __vector_default,
           which jumps to __vectors, which is the start of the .vectors section.

       In C/C++ code, an interrupt service routine (ISR) is implemented with the help of the ISR macro.

       .progmem.data

       .progmem.data.*

       .progmem.gcc.*
           These sections are used for read-only data declared with attribute PROGMEM, and for data in
           address-space __flash.

       The compiler assumes that the .progmem sections are located in the lower 64 KiB of program memory. When
       they do not fit in the lower 64 KiB block, the program reads garbage unless pgm_read_*_far is used. In
       that case, however, the data can be located in the .progmemx section, which is not required to be
       located in the lower program memory.

       .trampolines
           Linker stubs for indirect jumps and calls on devices with more than 128 KiB of program memory. This
           section must be located in the same 128 KiB block as the interrupt vector table. For some
           background on linker stubs, see the GCC documentation on EIND.

       .text

       .text.*
           Executable code. This is where almost all of the executable code of an application will go.

       .ctors

       .dtors
           Tables with addresses of static constructors and destructors, like C++ static constructors and
           functions declared with attribute constructor.

       The .initN Sections
           These sections are used to hold the startup code from reset up through the start of main().

       The .initN sections are executed in order from 0 to 9: The code from one init section falls through to
       the next higher init section. This is the reason why code in these sections must be naked (more
       precisely, it must not contain return instructions), and why code in these sections must never be called
       explicitly.

       When several modules put code in the same init section, the order of execution is not specified.

       Section   Performs                                               Hosted By      Symbol (1)
       .init0    Weakly defines the __init label which is the jump      AVR-LibC (2)
                 target of the first vector in the interrupt vector
                 table. When the user defines the __init() function,
                 it will be jumped to instead.
       .init1    Unused                                                 ---
       .init2    • Clears __zero_reg__                                  AVR-LibC
                 • Initializes the stack pointer to the value of weak
                   symbol __stack, which has a default value of
                   RAMEND as defined in avr/io.h
                 • Initializes EIND to hh8(pm(__vectors)) on devices
                   that have it
                 • Initializes RAMPX, RAMPY, RAMPZ and RAMPD on
                   devices that have all of them
       .init3    Initializes the NVMCTRLB.FLMAP bit-field on devices    AVR-LibC       __do_flmap_init
                 that have it, except when -mrodata-in-ram is
                 specified
       .init4    Initializes data in static storage: Initializes        libgcc         __do_copy_data
                 .data and clears .bss                                                 __do_clear_bss
       .init5    Unused                                                 ---
       .init6    Run static C++ constructors and functions defined      libgcc         __do_global_ctors
                 with __attribute__((constructor)).
       .init7    Unused                                                 ---
       .init8    Unused                                                 ---
       .init9    Calls main and then jumps to exit                      AVR-LibC

       Notes

       1.  Code in the .init3, .init4 and .init6 sections is optional; it will only be present when there is
           something to do. This will be tracked by the compiler --- or has to be tracked by the assembly
           programmer --- which pulls in the code from the respective library by means of the mentioned symbols,
           e.g. by linking with -Wl,-u,__do_flmap_init or by means of

       .global __do_copy_data

           Conversely, when the respective code is not desired for some reason, the symbol can be satisfied by
           defining it with, say, -Wl,--defsym,__do_copy_data=0 so that the code is not pulled in any more.

       2.  The code is provided by gcrt1.S.

       The .finiN Sections
           Shutdown code. These sections are used to hold the exit code executed after return from main() or a
           call to exit().

       The .finiN sections are executed in descending order from 9 to 0 in a fallthrough manner.

       Section      Performs                                            Hosted By   Symbol
       .fini9       Defines _exit and weakly defines the exit label     libgcc
       .fini8       Run functions registered with atexit()              AVR-LibC
       .fini7       Unused                                              ---
       .fini6       Run static C++ destructors and functions defined    libgcc      __do_global_dtors
                    with __attribute__((destructor))
       .fini5...1   Unused                                              ---
       .fini0       Globally disables interrupts and enters an          libgcc
                    infinite loop at label __stop_program

       It is unlikely that ordinary code uses the fini sections. When there are no static destructors and
       atexit() is not used, then the respective code is not pulled in from the libraries, and the fini code
       just consumes four bytes: a CLI and an RJMP to itself. Common use cases of fini code are running the
       GCC test suite, where it reduces fallout, and simulators, where it helps to determine (un)orderly
       termination of a simulated program.

       .progmemx.*
           Read-only data in program memory without the requirement that it must reside in the lower 64 KiB. The
           compiler uses this section for data in the named address-space __memx. Data can be accessed with
           pgm_read_*_far when it is not in a named address-space:

       #include <avr/pgmspace.h>

       const __memx int array1[] = { 1, 4, 9, 16, 25, 36 };

       PROGMEM_FAR
       const int array2[] = { 2, 3, 5, 7, 11, 13, 17 };

       int add (uint8_t id1, uint8_t id2)
       {
           uint_farptr_t p_array2 = pgm_get_far_address (array2);
           int val2 = pgm_read_int_far (p_array2 + sizeof(int) * id2);

           return val2 + array1[id1];
       }

       .jumptables*
           Used to place jump tables in some cases.

   The .data Output Section
       This section contains data in static storage which has an initializer that is not all zeroes. This
       includes the following input sections:

       .data*
           Read-write data

       .rodata*
           Read-only data. These input sections are only included on devices that host read-only data in RAM.

       It is possible to tell the linker the SRAM address of the beginning of the .data section. This is
       accomplished by linking with

       avr-gcc ... -Tdata addr -Wl,--defsym,__DATA_REGION_ORIGIN__=addr

       Note that addr must be offset by adding 0x800000 to the real SRAM address so that the linker knows that
       the address is in the SRAM memory segment. Thus, if you want the .data section to start at 0x1100, pass
       0x801100 as the address to the linker.

       Note
           When using malloc() in the application (which could even happen inside library calls), additional
           adjustments are required.

   The .bss Output Section
       Data in static storage that will be zeroed by the startup code. These are data objects without explicit
       initializer, and data objects with initializers that are all zeroes.

       Input sections are .bss* and COMMON. Common symbols are defined with directives .comm or .lcomm.

   The .noinit Output Section
       Data objects in static storage that should not be initialized by the startup code. As the C/C++ standard
       requires that all data in static storage is initialized --- which includes data without explicit
       initializer, which will be initialized to all zeroes --- such objects have to be put into section .noinit
       by hand:

       __attribute__ ((section (".noinit")))
       int foo;

       The only input section in this output section is .noinit. Only data without initializer can be put in
       this section.

   The .rodata Output Section
       This section contains read-only data in static storage from .rodata* input sections. This output section
       is only present for devices where read-only data remains in program memory, which are the devices where
       (parts of) the program memory are visible in the RAM address space. This is currently the case for the
       emulations avrtiny, avrxmega3, avrxmega2_flmap and avrxmega4_flmap.

   The .eeprom Output Section
       This is where EEPROM variables are stored, for example variables declared with the EEMEM attribute. The
       only input section (pattern) is .eeprom*.

   The .fuse, .lock and .signature Output Sections
       These sections contain fuse bytes, lock bytes and device signature bytes, respectively. The respective
       input section patterns are .fuse*, .lock* and .signature*.

   The .note.gnu.avr.deviceinfo Section
       This section is actually not mentioned in the default linker script, which means it is an orphan section
       and hence the respective output section is implicit.

       The startup code from AVR-LibC puts device information in that section to be picked up by simulators or
       tools like avr-size, avr-objdump, avr-readelf, etc.

       The section is contained in the ELF file but not loaded onto the target. The sources of the
       device-specific information are the device header file and compiler builtin macros. The section
       conforms to the standard ELF note layout and is laid out as follows.

       #include <elf.h>

       typedef struct
       {
           Elf32_Word n_namesz;     /* AVR_NOTE_NAME_LEN */
           Elf32_Word n_descsz;     /* size of avr_desc */
           Elf32_Word n_type;       /* 1 - the only AVR note type */
       } Elf32_Nhdr;

       #define AVR_NOTE_NAME_LEN 4

       struct note_gnu_avr_deviceinfo
       {
           Elf32_Nhdr nhdr;
           char note_name[AVR_NOTE_NAME_LEN]; /* = "AVR\0" */

           struct
           {
               Elf32_Word flash_start;
               Elf32_Word flash_size;
               Elf32_Word sram_start;
               Elf32_Word sram_size;
               Elf32_Word eeprom_start;
               Elf32_Word eeprom_size;
               Elf32_Word offset_table_size;
               /* Offset table containing byte offsets into
                  string table that immediately follows it.
                  index 0: Device name byte offset */
               Elf32_Off offset_table[1];
               /* Standard ELF string table.
                  index 0 : NULL
                  index 1 : Device name
                  index 2 : NULL */
               char strtab[2 + strlen(__AVR_DEVICE_NAME__)];
           } avr_desc;
       };

       The contents of this section can be displayed with

       • avr-objdump -P avr-deviceinfo file, which is supported since Binutils v2.43.

       • avr-readelf -n file, which displays all notes.
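
       As a rough illustration of the layout shown above, the following host-side sketch decodes the
       avr_desc part of the note from a byte buffer. It assumes exactly one entry in offset_table, as in
       the struct above, and little-endian words as used by AVR ELF files. The names avr_deviceinfo,
       read_le32 and avr_deviceinfo_decode are made up for this sketch and do not belong to any existing
       tool or library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded view of avr_desc (hypothetical helper type). */
struct avr_deviceinfo {
    uint32_t flash_start, flash_size;
    uint32_t sram_start, sram_size;
    uint32_t eeprom_start, eeprom_size;
    const char *device_name;    /* points into the descriptor bytes */
};

/* Read one little-endian 32-bit word byte by byte, so that the code
   also works on big-endian hosts. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* desc points to the first byte of avr_desc, len is n_descsz.
   Returns 0 on success and -1 if the descriptor looks malformed. */
int avr_deviceinfo_decode(const uint8_t *desc, size_t len,
                          struct avr_deviceinfo *out)
{
    /* Six memory words, offset_table_size and offset_table[0]
       precede the string table. */
    const size_t strtab = 8 * sizeof(uint32_t);
    if (len <= strtab)
        return -1;

    out->flash_start  = read_le32(desc + 0);
    out->flash_size   = read_le32(desc + 4);
    out->sram_start   = read_le32(desc + 8);
    out->sram_size    = read_le32(desc + 12);
    out->eeprom_start = read_le32(desc + 16);
    out->eeprom_size  = read_le32(desc + 20);

    /* offset_table[0] is the byte offset of the device name
       within the string table. */
    const uint32_t name_off = read_le32(desc + 28);
    if (name_off >= len - strtab)
        return -1;

    /* Only accept a name that is NUL-terminated inside the descriptor. */
    const uint8_t *name = desc + strtab + name_off;
    if (memchr(name, 0, len - strtab - name_off) == NULL)
        return -1;

    out->device_name = (const char *)name;
    return 0;
}
```

       The caller is expected to locate the note in the ELF file first, for example by walking the section
       headers, and to pass only the descriptor bytes that follow note_name.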

Symbols in the Default Linker Script

       Most of the symbols like main are defined in the code of the application, but some symbols are defined in
       the default linker script:

       __name_REGION_ORIGIN__
           Describes the physical properties of memory region name, where name is one of TEXT or DATA. The
           address is a VMA and is offset as explained above.
           The linker script only supplies a default for the symbol values when they have not been defined by
           other means, like for example in the startup code or by --defsym. For example, to let the code start
           at address 0x100, one can link with

       avr-gcc ... -Ttext=0x100 -Wl,--defsym,__TEXT_REGION_ORIGIN__=0x100

       __name_REGION_LENGTH__
           Describes the physical properties of memory region name, where name is one of: TEXT, DATA, EEPROM,
           LOCK, FUSE, SIGNATURE or USER_SIGNATURE.
           Only a default is supplied when the symbol is not yet defined by other means. Most of these symbols
           are weakly defined in the startup code.

       __data_start

       __data_end
           Start and (one past the) end VMA address of the .data section in RAM.

       __data_load_start

       __data_load_end
           Start and (one past the) end LMA address of the .data section initializers located in program memory.
           Used together with the VMA addresses above by the startup code to copy data initializers from program
           memory to RAM.

       __bss_start

       __bss_end
           Start and (one past the) end VMA address of the .bss section. The startup code clears this part of
           the RAM.

       __rodata_start

       __rodata_end

       __rodata_load_start

       __rodata_load_end
           Start and (one past the) end VMA resp. LMA address of the .rodata output section. These symbols are
           only defined when .rodata is not output to the text region, which is the case for emulations
           avrxmega2_flmap and avrxmega4_flmap.

       __heap_start
           One past the last object located in static storage. Immediately follows the .noinit section (which
           immediately follows .bss, which immediately follows .data). Used by malloc() and friends.
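
       How the startup code uses the __data_* and __bss_* symbols described above can be modelled on a
       host with plain memcpy and memset. This is only a model of the effect, not the actual libgcc
       code, which is written in assembly. The struct static_layout and the function
       init_static_storage are hypothetical, and the RAM offsets are meant without the 0x800000 offset.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plain offsets that stand in for the linker symbols of the same name. */
struct static_layout {
    size_t data_start, data_end;   /* VMA range of .data in RAM      */
    size_t data_load_start;        /* LMA of the .data initializers  */
    size_t bss_start, bss_end;     /* VMA range of .bss in RAM       */
};

/* Model of what the startup code achieves: copy the .data
   initializers from program memory to RAM, then zero .bss. */
void init_static_storage(uint8_t *ram, const uint8_t *flash,
                         const struct static_layout *l)
{
    memcpy(ram + l->data_start, flash + l->data_load_start,
           l->data_end - l->data_start);
    memset(ram + l->bss_start, 0, l->bss_end - l->bss_start);
}
```

       Note that __data_load_end is not needed for the copy itself, because the length of the initializers
       equals the length of the .data section in RAM.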

       Code that computes a checksum over all relevant code and data in program memory has to consider:

       • The range from the beginning of the .text section (address 0x0 in the default layout) up to
         __data_load_end, so that the initializers for .data are included.

       • For emulations that have the rodata memory region, the range from __rodata_load_start to
         __rodata_load_end also has to be taken into account.
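
       The two ranges can be combined as in the following host-side sketch, which operates on a flash
       image held in a buffer. The 16-bit additive sum is just a placeholder for whatever checksum the
       application really uses. The function names are hypothetical, and the offsets are expected to
       come from the linker symbols named in the list above.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the bytes image[begin] ... image[end - 1] to a running
   16-bit sum (placeholder checksum). */
uint16_t sum_range(const uint8_t *image, size_t begin, size_t end,
                   uint16_t sum)
{
    for (size_t i = begin; i < end; i++)
        sum = (uint16_t)(sum + image[i]);
    return sum;
}

/* first_end is the end of the range that starts at address 0.
   For emulations without a rodata region, pass equal values for
   rodata_load_start and rodata_load_end. */
uint16_t flash_checksum(const uint8_t *image, size_t first_end,
                        size_t rodata_load_start, size_t rodata_load_end)
{
    uint16_t sum = sum_range(image, 0, first_end, 0);
    return sum_range(image, rodata_load_start, rodata_load_end, sum);
}
```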

Output Sections and Code Size

       The avr-size program (part of Binutils), coming from a Unix background, doesn't account for the .data
       initialization space added to the .text section. To know how much flash the final program will
       consume, add the values for both .text and .data (but not .bss). The amount of pre-allocated SRAM is
       the sum of .data and .bss.

       Memory usage and free memory can also be displayed with

       avr-objdump -P mem-usage code.elf
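
       The arithmetic for the avr-size figures is small enough to spell out. The struct and function
       names below are made up for illustration, and the sizes would be taken from the avr-size output.

```c
#include <stdint.h>

/* Section sizes in bytes as reported by avr-size. */
struct section_sizes {
    uint32_t text;
    uint32_t data;
    uint32_t bss;
};

/* Flash holds the code plus the initializers for .data. */
uint32_t flash_usage(struct section_sizes s)
{
    return s.text + s.data;
}

/* Pre-allocated SRAM holds .data and .bss. */
uint32_t sram_usage(struct section_sizes s)
{
    return s.data + s.bss;
}
```

       For example, with hypothetical sizes of 528 bytes of .text, 2 bytes of .data and 10 bytes of .bss,
       the program occupies 530 bytes of flash and 12 bytes of SRAM.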

Using Sections

   In C/C++ Code
       The following example shows how to read and reset the MCUCR special function register on ATmega328. This
       SFR holds the reset source like 'watchdog reset' or 'external reset', and should be read early, prior to
       the initialization of RAM and execution of static constructors which may take some time. This means the
       code has to be placed prior to .init4 which initializes static storage, but after .init2 which
       initializes __zero_reg__. As the code runs prior to the initialization of static storage, variable mcucr
       must be placed in section .noinit so that it won't be overwritten by that part of the startup code:

       #include <avr/io.h>

       __attribute__((section(".noinit")))
       uint8_t mcucr;

       __attribute__((used, unused, naked, section(".init3")))
       static void read_MCUCR (void)
       {
           mcucr = MCUCR;
           MCUCR = 0;
       }

       • The used attribute tells the compiler that the function is used although it is never called.

       • The unused attribute tells the compiler that it is fine that the function is unused, and silences
         respective diagnostics about the seemingly unused functions.

       • The naked attribute is required because the code is located in an init section. The function must not
         have a RET instruction because the function is never called. According to the GCC documentation, the
         only code supported in naked functions is inline assembly, but the code above is simple enough so that
         GCC can deal with it.

   In Assembly Code
       Example:

       #include <avr/io.h>

       .section .init3,"ax",@progbits
           lds     r0, MCUCR

       .pushsection .noinit,"a",@nobits
       mcucr:
           .type   mcucr, @object
           .size   mcucr, 1
           .space  1
       .popsection                     ; Proceed with .init3

           sts     mcucr, r0
           sts     MCUCR, __zero_reg__ ; Initialized in .init2

       .text
           .global main
           .type   main, @function
       main:
           lds     r24,    mcucr
           clr     r25
           rjmp    putchar
           .size main, .-main

       • The "ax" flags tell that the section is allocatable (consumes space on the target hardware) and is
         executable.

       • The @progbits type tells that the section contains bits that have to be uploaded to the target
         hardware.

       For more details, see the gas user manual on the .section directive.