Current layout (RTEMS 4.9 and the rce405 BSP)

The RCE memory is divided into several contiguous regions. In order by starting address:

  • Exception vectors. This region contains small pieces of code.
  • Core. This is our name for the code and data used by RTEMS plus our support code from the BSP and the repository directories rce/ and rceapp/core/.
  • RTEMS workspace. This is where RTEMS allocates the storage for run-time objects such as semaphores, API extensions, barriers, task control blocks, and so on. It may also be used for task stack space.
  • ISR stack. The stack used by interrupt handlers.
  • Heap. RTEMS sees this as just a big heap for dynamic storage allocation. Dynamically linked software modules are allocated space here. So are objects allocated with malloc() or the default C++ implementation of the new operator.
  • Init stack. The stack used by the first task created by RTEMS, the Init task.
  • System log. A region of memory used for three circular buffers: one for console messages, one for tracebacks, and one for exception information.

Allocation

The location and size of the regions are determined by several methods. Here's a brief summary.

Region             Start                           Size                             Allocation method
-----------------  ------------------------------  -------------------------------  --------------------------------
Exception vectors  0x00000000                      0x2000                           ld script and BSP
Core               0x00002000                      Variable                         Laid out by ld
RTEMS workspace    Just after the core             Variable                         BSP, RTEMS and rtems_config_*.cc
Interrupt stack    Just after the workspace        Variable                         BSP
Heap               Just after the interrupt stack  Everything up to the init stack  RTEMS + ld script
Init stack         Just after the heap             0x10000                          BSP
System log         0x07f00000                      0x00100000                       ld script and BSP

Exception vectors

The size of this region is fixed by the processor although its location can vary (by setting a special register). Our ld script places it at the default location of zero and our BSP doesn't change the default power-on setting of the register.

Core

This region contains RTEMS itself, startup and libstdc++ code from the compiler, newlib, and code from the libraries made under the RCE project in the repository. Our ld script places its start at 0x2000 (just after the exception vectors). Its size depends on the total size of the code and data emitted by the compiler.

RTEMS workspace

The size and location of this region are determined by the BSP together with choices made in the RTEMS configuration; see chapter 23 of the RTEMS C User's Guide. In our builds the configuration is in one of two files: rtems_config_prod.cc and rtems_config_devel.cc; the former doesn't configure the RTEMS shell and the latter does.

A lot of the storage for RTEMS-provided service objects such as semaphores is allocated in the workspace, which in RTEMS 4.9 is always separate from the general-purpose heap used by the application code. Our RTEMS configurations specify fixed allocations for these objects. It's possible to make them adjustable but, for tasks at least, the code that maintains our system log requires a fixed limit. There are also always-fixed limits for the number of device drivers and the number of open files. I believe that directory nodes for the in-memory filesystem are allocated in the workspace as well.

By default RTEMS will allocate task stacks in the workspace. It sets aside the minimum stack space per task times the maximum number of tasks, but you can add an adjustment to this if you know your application will need more stack space.
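The reservation described above is simple arithmetic, and a small host-side sketch makes it concrete. The function name task_stack_reservation() is invented here for illustration and is not part of RTEMS; the real calculation in <confdefs.h> has more terms than this one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an RTEMS API): approximate number of bytes
 * set aside in the workspace for task stacks.
 *   max_tasks      - configured maximum number of tasks
 *   min_stack_size - minimum stack size per task, in bytes
 *   extra_stacks   - extra stack bytes requested by the application
 */
uint32_t task_stack_reservation(uint32_t max_tasks,
                                uint32_t min_stack_size,
                                uint32_t extra_stacks)
{
    /* Refuse inputs whose product or sum would overflow 32 bits. */
    assert(max_tasks == 0 ||
           min_stack_size <= (UINT32_MAX - extra_stacks) / max_tasks);

    return max_tasks * min_stack_size + extra_stacks;
}
```

For example, 20 tasks at 8K each plus 256K of extra space comes to 425984 bytes (416K).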

Normally RTEMS will calculate the size of the workspace based on the various allocations made in the configuration, but in the configuration you can override that by giving the total workspace size. Normally RTEMS will locate the workspace in high memory but your configuration can specify the starting address provided you know it at compile time. All the information in the configuration file gets compiled into a global structure.

Our BSP puts the workspace right after the core. In bspstart.c:

#define INIT_STACK_SIZE 0x10000

void bsp_start( void )
{
  extern uint8_t __bsp_ram_start[];
  extern uint8_t __bsp_ram_end[];

  uint32_t addr;

  BSP_output_char = rce_outchar_to_memory;

  addr = CPU_UP_ALIGN((uint32_t)__bsp_ram_start);

  /* Assign different chunks of memory : */

  /* work-space area */
  Configuration.work_space_start = (char*)addr;
  addr += Configuration.work_space_size;

  /* Interrupt/exception stack; at the same time
   * initialize exceptions.
   */
  ppc_exc_initialize(
    PPC_INTERRUPT_DISABLE_MASK_DEFAULT,
    addr,
    rtems_configuration_get_interrupt_stack_size());

  addr += rtems_configuration_get_interrupt_stack_size();

  /* reserve init stack */
  bsp_heap_end   = (uint32_t)__bsp_ram_end - INIT_STACK_SIZE;

  /* rest for the heap */
  bsp_heap_start = addr;

}

where I've cut out the parts not relevant to memory layout. __bsp_ram_start is a symbol set by our ld script to be the first unused address after the core; __bsp_ram_end is set by the script to 0x07f00000 where the system log area begins.
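To check the address arithmetic in bsp_start() without a target, here is a minimal host-side sketch that carves up a RAM range in the same order. The names struct rce_layout, rce_carve() and up_align() are invented for illustration, and the 8-byte alignment is an assumption standing in for whatever CPU_UP_ALIGN does on the real hardware.

```c
#include <assert.h>
#include <stdint.h>

#define INIT_STACK_SIZE 0x10000u

/* Hypothetical record of the boundaries that bsp_start() computes. */
struct rce_layout {
    uint32_t workspace_start;
    uint32_t interrupt_stack_start;
    uint32_t heap_start;
    uint32_t heap_end;       /* also the start of the Init stack */
    uint32_t init_stack_end; /* == ram_end, where the system log begins */
};

/* Round addr up to a multiple of align, which must be a power of two. */
static uint32_t up_align(uint32_t addr, uint32_t align)
{
    return (addr + (align - 1u)) & ~(align - 1u);
}

struct rce_layout rce_carve(uint32_t ram_start, uint32_t ram_end,
                            uint32_t work_space_size,
                            uint32_t interrupt_stack_size)
{
    struct rce_layout l;
    uint32_t addr = up_align(ram_start, 8u); /* assumed alignment */

    assert(ram_end >= INIT_STACK_SIZE);

    l.workspace_start = addr;               /* workspace comes first */
    addr += work_space_size;
    l.interrupt_stack_start = addr;         /* then the interrupt stack */
    addr += interrupt_stack_size;
    l.heap_start = addr;                    /* the heap gets the rest... */
    l.heap_end = ram_end - INIT_STACK_SIZE; /* ...minus the Init stack */
    l.init_stack_end = ram_end;

    assert(l.heap_start <= l.heap_end);
    return l;
}
```

With a made-up core end of 0x00123455, a 1 MB workspace and an 8K interrupt stack, the heap runs from 0x00225458 to 0x07ef0000.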

Interrupt stack

As you can see from the code for function bsp_start() in the section on the RTEMS workspace, the interrupt stack is allocated right after the workspace. The size is set in the RTEMS configuration file; if you don't set the macro BSP_INTERRUPT_STACK_SIZE you'll get the value of CONFIGURE_MINIMUM_TASK_STACK_SIZE. If you didn't set a value for that you'll get the value of CPU_STACK_MINIMUM_SIZE which is 8K for PowerPC.
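The chain of defaults described above can be written out with the preprocessor. This is only a sketch of the fallback order as the text gives it: in a real build these macros come from the RTEMS headers, and EFFECTIVE_INTERRUPT_STACK_SIZE and effective_interrupt_stack_size() are names invented here.

```c
#include <assert.h>

/* Stand-ins for values normally supplied by the RTEMS headers. */
#ifndef CPU_STACK_MINIMUM_SIZE
#define CPU_STACK_MINIMUM_SIZE (8 * 1024) /* PowerPC value quoted above */
#endif

#ifndef CONFIGURE_MINIMUM_TASK_STACK_SIZE
#define CONFIGURE_MINIMUM_TASK_STACK_SIZE CPU_STACK_MINIMUM_SIZE
#endif

/* Hypothetical name: the size the interrupt stack ends up with. */
#ifdef BSP_INTERRUPT_STACK_SIZE
#define EFFECTIVE_INTERRUPT_STACK_SIZE BSP_INTERRUPT_STACK_SIZE
#else
#define EFFECTIVE_INTERRUPT_STACK_SIZE CONFIGURE_MINIMUM_TASK_STACK_SIZE
#endif

unsigned long effective_interrupt_stack_size(void)
{
    unsigned long size = EFFECTIVE_INTERRUPT_STACK_SIZE;

    /* A zero-sized interrupt stack would be a configuration error. */
    assert(size > 0);
    return size;
}
```

Compiled with none of the three macros predefined, the function returns 8192; defining BSP_INTERRUPT_STACK_SIZE on the compiler command line overrides that.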

Heap

As you can see from the code for function bsp_start() in the section on the RTEMS workspace, the heap is allocated right after the interrupt stack and runs right up to the Init stack, which occupies the last 0x10000 bytes below the system log area.

Init stack

This is the stack allocated for the first task, the Init task, created and run by RTEMS. It is allocated by our bsp_start() function just underneath the system log area and given a fixed size of 0x10000 (64K). Other tasks will have their stacks allocated in the workspace since we don't specify a stack allocation function in the RTEMS configuration.

System log area

Our ld script sets __bsp_ram_end to 1 MB below the true end of RAM (__phy_ram_end == 128 MB). The BSP leaves that upper 1 MB free; it gets used by the RCE debugging package in /rce/debug/src/Manager.cc. The first 256 KB are used as a circular buffer for printout that would have come out on the operator's console if there were one. The second 256 KB are unused. The last 512 KB are used for a circular buffer of exception information updated whenever a task causes a hardware exception or throws a C++ exception that isn't caught. In both those cases a message is also added to the printout buffer.
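The three pieces of the system log area can be tabulated with a few lines of C. This is a sketch built only from the sizes given above; the identifiers (SYSLOG_BASE, enum syslog_part, syslog_span_of()) are made up for the example and are not taken from Manager.cc.

```c
#include <assert.h>
#include <stdint.h>

#define SYSLOG_BASE 0x07f00000u /* __bsp_ram_end in the current layout */

/* Hypothetical labels for the three pieces, in address order. */
enum syslog_part {
    SYSLOG_PRINTOUT,   /* circular buffer for console printout */
    SYSLOG_UNUSED,     /* currently unused */
    SYSLOG_EXCEPTIONS, /* circular buffer of exception information */
    SYSLOG_PART_COUNT
};

struct syslog_span {
    uint32_t start; /* first address of the piece */
    uint32_t size;  /* length in bytes */
};

struct syslog_span syslog_span_of(unsigned int part)
{
    static const uint32_t sizes[SYSLOG_PART_COUNT] = {
        256u * 1024u, 256u * 1024u, 512u * 1024u
    };
    struct syslog_span span = { SYSLOG_BASE, 0u };
    unsigned int i;

    assert(part < SYSLOG_PART_COUNT);

    /* Skip over the pieces that precede the requested one. */
    for (i = 0; i < part; i++) {
        span.start += sizes[i];
    }
    span.size = sizes[part];
    return span;
}
```

The exception buffer therefore starts at 0x07f80000 and ends exactly at 0x08000000, the 128 MB mark.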

RTEMS 4.9 configuration options that affect memory layout

What's available

CONFIGURE_MALLOC_BSP_SUPPORTS_SBRK. Define this if the C heap is expandable and the BSP supplies an sbrk() function that malloc(), etc., can use when heap space runs out.

CONFIGURE_EXECUTIVE_RAM_WORK_AREA. The starting address of the RTEMS workspace in RAM. By default this is NULL, which causes the BSP to decide where best to place the work area.

CONFIGURE_EXECUTIVE_RAM_SIZE. If you know how big you want the RTEMS work area to be then you can bypass the usual calculation by supplying the size here.

CONFIGURE_MINIMUM_TASK_STACK_SIZE. The stack size (bytes) RTEMS will allocate for a task/thread if you ask for the minimum. By default this is set to the minimum size recommended for the kind of CPU you're using (8K for PowerPC).

CONFIGURE_TASK_STACK_ALLOCATOR and CONFIGURE_TASK_STACK_DEALLOCATOR. A pointer-to-function value for the user routine that (de)allocates task stacks. The default is NULL which means that stacks will be (de)allocated from the RTEMS workspace. The allocator's prototype is void *(*)( uint32_t ), the deallocator's is void (*)(void*).

CONFIGURE_MEMORY_OVERHEAD. The number of kilobytes that should be added to the workspace size calculated by <confdefs.h>. The default is zero.

CONFIGURE_EXTRA_TASK_STACKS. The number of bytes to add to the task stack space allocation calculated by <confdefs.h> (assuming you haven't set CONFIGURE_TASK_STACK_ALLOCATOR). The default is zero. <confdefs.h> calculates something like CONFIGURE_MAXIMUM_TASKS*(CONFIGURE_MINIMUM_TASK_STACK_SIZE).

CONFIGURE_IDLE_TASK_STACK_SIZE. The size in bytes of the stack for the idle task. The default is the configured minimum stack size for ordinary tasks.

CONFIGURE_INIT_TASK_STACK_SIZE. Overrides the default Init task stack size.

BSP_INTERRUPT_STACK_SIZE. Overrides the default interrupt stack size (which is the configured minimum task stack size).

What we use

Our rce405 BSP sets CONFIGURE_EXTRA_TASK_STACKS to (256*1024). For some reason it doesn't use CONFIGURE_INIT_TASK_STACK_SIZE but hard-codes an Init stack size of 64K. CONFIGURE_EXECUTIVE_RAM_WORK_AREA is not set in our rtems_config_*.cc files but the configuration variable it sets is altered by our bsp_start() routine to be right after the end of the loaded core.

RTEMS 4.10

Additional configuration options.

CONFIGURE_UNIFIED_WORK_AREAS. Setting this will cause a single contiguous region to be used for both the heap and the RTEMS workspace.

Layout for Gen I hardware

Region type        Starting address  Size  TLB entries  PID  Writable  Executable  Cached  Contents
-----------------  ----------------  ----  -----------  ---  --------  ----------  ------  --------------------------------------------------
Executable code    0x00000000        8M    2            0    N         Y           Y       Exception vectors, .text for core and modules
Read-write data    0x00800000        8M    2            0    Y         N           Y       .data+.bss for core and modules
Read-only data     0x01000000        8M    2            0    N         N           Y       .rodata for core and modules
I/O buffers        0x01800000        48M   12           0    Y         N           N       DMA targets for protocol plugins
System log         0x04800000        4M    1            1    Y         N           N       Circular message buffer
General workspace  0x04c00000        52M   13           0    Y         N           Y       RTEMS workspace, heap, Init stack, interrupt stack

In this layout real address == virtual address.

Each type of region has properties set using the PPC405 Memory Management Unit (MMU). All the region descriptors must be held on board the MMU in its Translation Lookaside Buffer (TLB), which has space for 64 entries. Given that limit, the restrictions on the TLB format, and our 128 MB of memory, we need to reserve one TLB entry for each 4 MB of memory, i.e., use an MMU page size of 4 MB (32 entries in all).
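The page-size choice comes down to counting entries, which a short sketch can show. The helper names below are invented for illustration; the only inputs are the 64-entry TLB and 128 MB of memory mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define PPC405_TLB_ENTRIES 64u

/* Number of TLB entries needed to map memory_bytes with one page size. */
uint32_t tlb_entries_needed(uint32_t memory_bytes, uint32_t page_bytes)
{
    assert(page_bytes != 0);
    /* Round up so a partial page still gets an entry. */
    return memory_bytes / page_bytes +
           (memory_bytes % page_bytes != 0 ? 1u : 0u);
}

/* Non-zero if all of memory can be mapped at once with that page size. */
int fits_in_tlb(uint32_t memory_bytes, uint32_t page_bytes)
{
    return tlb_entries_needed(memory_bytes, page_bytes) <= PPC405_TLB_ENTRIES;
}
```

With 4 MB pages, 128 MB needs 32 entries and fits comfortably. With 1 MB pages it would need 128 entries, twice what the TLB holds.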

Each TLB entry contains a process ID (PID) field that is matched against the contents of the PID register in the processor. A zero in this TLB field means that the PID register doesn't matter; the region is always accessible. A non-zero value must match the PID register contents exactly in order for the region to be accessible. We will arrange for the PID register to contain 1 only when running the routines that update the system log.
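The matching rule can be stated as a one-line predicate. This is a model of the rule as described above, not code that touches the real MMU; struct tlb_entry and tlb_entry_accessible() are hypothetical names and the struct carries only the field relevant here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down view of a TLB entry: just the PID field. */
struct tlb_entry {
    uint8_t pid; /* 0 means "accessible whatever the PID register holds" */
};

/* True if the region mapped by entry is accessible for this PID register. */
bool tlb_entry_accessible(const struct tlb_entry *entry, uint8_t pid_register)
{
    assert(entry != NULL);
    return entry->pid == 0 || entry->pid == pid_register;
}
```

Under this rule the system log entry (PID 1) is invisible while the PID register holds 0, and every other region (PID 0) stays visible while the log routines run with the register set to 1.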

Note that we have the I/O buffers and the general workspace bounded by regions that are either read-only or normally inaccessible.

Subdividing the general workspace

It seems best to use the unified RTEMS workspace and heap, letting RTEMS allocate task stacks in this region. When checking the size of this unified region RTEMS assumes that it occupies the top of memory, so the simplest layout seems to be, in order of increasing addresses:

Use                             Starting address  Size
------------------------------  ----------------  ----------
Interrupt stack                 0x04c00000        8K
Init stack                      0x04c02000        64K
Unified RTEMS workspace + heap  0x04c12000        51M + 952K

If we do this and move the system log area to beneath the general workspace region then we don't need to use the trick of setting __bsp_ram_end different from __phy_ram_end.
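The starting address and size of the unified area follow from the two stack sizes, as this sketch shows. RCE_GENERAL_REGION_SIZE and the two function names are assumptions introduced for the example; only the base address and the 52M region size come from the tables above.

```c
#include <assert.h>
#include <stdint.h>

#define RCE_GENERAL_REGION_BASE 0x04c00000u
#define RCE_GENERAL_REGION_SIZE (52u * 1024u * 1024u) /* hypothetical name */

/* First address after the interrupt stack and the Init stack. */
uint32_t unified_area_start(uint32_t interrupt_stack_size,
                            uint32_t init_stack_size)
{
    assert(interrupt_stack_size + init_stack_size <= RCE_GENERAL_REGION_SIZE);
    return RCE_GENERAL_REGION_BASE + interrupt_stack_size + init_stack_size;
}

/* Bytes left in the general workspace region for workspace + heap. */
uint32_t unified_area_size(uint32_t interrupt_stack_size,
                           uint32_t init_stack_size)
{
    assert(interrupt_stack_size + init_stack_size <= RCE_GENERAL_REGION_SIZE);
    return RCE_GENERAL_REGION_SIZE - interrupt_stack_size - init_stack_size;
}
```

For an 8K interrupt stack and a 64K Init stack this gives 0x04c12000 and 51M + 952K, and the area ends at 0x08000000, the top of the 128 MB of memory.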

RTEMS configuration settings

#define RCE_CODE_REGION_BASE    (0x00000000)
#define RCE_RODATA_REGION_BASE  (0x01000000)
#define RCE_RWDATA_REGION_BASE  (0x00800000)
#define RCE_BUFFER_REGION_BASE  (0x01800000)
#define RCE_SYSLOG_REGION_BASE  (0x04800000)
#define RCE_GENERAL_REGION_BASE (0x04c00000)


#define CONFIGURE_UNIFIED_WORK_AREAS
#define CONFIGURE_INIT_TASK_STACK_SIZE    (64*1024)
#define BSP_INTERRUPT_STACK_SIZE          (8*1024)
#define CONFIGURE_EXECUTIVE_RAM_WORK_AREA (RCE_GENERAL_REGION_BASE + BSP_INTERRUPT_STACK_SIZE + CONFIGURE_INIT_TASK_STACK_SIZE)
#define CONFIGURE_TASK_STACK_ALLOCATOR   RCE::service::allocateTaskStack
#define CONFIGURE_TASK_STACK_DEALLOCATOR RCE::service::deallocateTaskStack

If we want truly dynamic allocation of task stacks then we have to configure an allocator and a deallocator; otherwise RTEMS will allocate a fixed chunk of memory to contain all task stacks and we would have to play with CONFIGURE_EXTRA_TASK_STACKS.
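The simplest possible allocator pair just forwards to the C heap. The sketch below uses the prototypes quoted earlier for RTEMS 4.9, void *(*)( uint32_t ) and void (*)(void*). The function names are hypothetical C stand-ins, not the actual RCE::service routines, and a real implementation might well do more (alignment, guard areas, accounting).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator: take the task stack from the C heap. */
void *rce_allocate_task_stack(uint32_t stack_size)
{
    assert(stack_size > 0);
    return malloc(stack_size); /* NULL tells the caller there is no memory */
}

/* Hypothetical deallocator: give the stack back to the C heap. */
void rce_deallocate_task_stack(void *stack)
{
    free(stack); /* free(NULL) is a harmless no-op */
}
```

With a unified work area the C heap and the RTEMS workspace share one region, so stacks allocated this way would compete with everything else for the same memory.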

Alternate layout for Gen I

The current version of the dynamic linker treats each module as an indivisible image which is never moved once read into memory. Until we change that we'll have to organize memory in a different fashion. Again, virtual address == real address:

Region type        Starting address  Size  TLB entries  PID  Writable  Executable  Cached  Contents
-----------------  ----------------  ----  -----------  ---  --------  ----------  ------  --------------------------------------------------
Core               0x00000000        8M    2            0    Y         Y           Y       Exception vectors, text and data for the core
Modules            0x00800000        16M   4            0    Y         Y           Y       Text and data for modules
I/O buffers        0x01800000        48M   12           0    Y         N           N       DMA targets for protocol plugins
System log         0x04800000        4M    1            1    Y         N           N       Circular message buffer
General workspace  0x04c00000        52M   13           0    Y         N           Y       RTEMS workspace, heap, Init stack, interrupt stack

Layout for Gen II

The MMU for the PPC440 processor also can store 64 TLB entries internally, but a page size of 4 MB is not available. At the high end we can have page sizes of 1 MB, 16 MB and 256 MB. If we want to be able to cover all 4 GB of memory we'll have to use a mixture of page sizes rather than just one as on the PPC405. The regions for core and module code should be small compared to those for I/O buffers and for general working storage so we can use 16 MB pages for the former and 256 MB pages for the latter.

If we have enough TLB entries left over we could allocate stacks using a mixture of smaller page sizes with unmapped regions between stacks. Since we don't know ahead of time how large any stack will be we'll have to defer the creation of its TLB entries until the time the stack is allocated.
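Deferring TLB creation means keeping a running count of the entries already spent on stacks. Here is a minimal sketch of that bookkeeping. It simplifies matters by using only 1 MB pages and a budget of 50 entries, the figures given for the stacks region in the table below. All the names are invented for illustration, and the code does not create any real TLB entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_PAGE_BYTES (1024u * 1024u) /* 1 MB page */
#define STACK_TLB_BUDGET 50u             /* entries reserved for stacks */

/* Hypothetical running count of TLB entries used for stacks so far. */
struct stack_tlb_budget {
    uint32_t used;
};

/*
 * Reserve enough 1 MB pages for a stack of stack_bytes.
 * Returns the number of pages reserved, or -1 if the request is empty
 * or would exceed the budget (in which case nothing is reserved).
 */
int reserve_stack_pages(struct stack_tlb_budget *budget, uint32_t stack_bytes)
{
    uint32_t pages;

    assert(budget != NULL && budget->used <= STACK_TLB_BUDGET);

    /* Round up: a partly used page still costs a whole TLB entry. */
    pages = stack_bytes / STACK_PAGE_BYTES +
            (stack_bytes % STACK_PAGE_BYTES != 0 ? 1u : 0u);

    if (pages == 0 || pages > STACK_TLB_BUDGET - budget->used) {
        return -1;
    }
    budget->used += pages;
    return (int)pages;
}
```

Note that with only 1 MB pages even a 64K stack costs a full entry, which is one reason the text suggests mixing in smaller page sizes.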

Region type        Starting address  Size     Page size  TLB entries  PID  Writable  Executable  Cached  Contents
-----------------  ----------------  -------  ---------  -----------  ---  --------  ----------  ------  ---------------------------------------------
Executable code    0x00000000        256M     256M       1            0    N         Y           Y       Exception vectors, .text for core and modules
Read-write data    0x10000000        256M     256M       1            0    Y         N           Y       .data+.bss for core and modules
Read-only data     0x20000000        256M     256M       1            0    N         N           Y       .rodata for core and modules
Stacks             0x30000000        256M     1M         Up to 50     0    Y         Y           N       Interrupt, Init and task stacks
I/O buffers        0x40000000        1G+512M  256M       6            0    Y         N           N       DMA targets for protocol plugins
Unmapped region    0xa0000000        256M     -          0            -    -         -           -
General workspace  0xb0000000        1G+256M  256M       5            0    Y         N           Y       RTEMS workspace, heap