Current layout (RTEMS 4.9 and the rce405 BSP)
The RCE memory is divided into several contiguous regions. In order by starting address:
- Exception vectors. This region contains small pieces of code.
- Core. This is our name for the code and data used by RTEMS plus our support code from the BSP and the repository directories rce/ and rceapp/core/.
- RTEMS workspace. This is where RTEMS allocates the storage for run-time objects such as semaphores, API extensions, barriers, task control blocks, and so on. It may also be used for task stack space.
- ISR stack. The stack used by interrupt handlers.
- Heap. RTEMS sees this as just a big heap for dynamic storage allocation. Dynamically linked software modules are allocated space here. So are objects allocated with malloc() or the default C++ implementation of the new operator.
- Init stack. The stack used by the first task created by RTEMS, the Init task.
- System log. A region of memory used for three circular buffers; one for console messages, one for tracebacks and one for exception information.
Allocation
The location and size of the regions are determined by several methods. Here's a brief summary.
Region | Start | Size | Allocation method |
---|---|---|---|
Exception vectors | 0x00000000 | 0x2000 | ld script and BSP |
Core | 0x00002000 | Variable | Laid out by ld |
RTEMS workspace | Just after the core | Variable | BSP and RTEMS configuration |
Interrupt stack | Just after the workspace | Variable | BSP |
Heap | Just after the interrupt stack | Everything up to the init stack | RTEMS + ld script |
Init stack | Just after the heap | 0x10000 | BSP |
System log | 0x07f00000 | 0x00100000 | ld script and BSP |
Exception vectors
The size of this region is fixed by the processor although its location can vary (by setting a special register). Our ld script places it at the default location of zero and our BSP doesn't change the default power-on setting of the register.
Core
This region contains RTEMS itself, startup and libstdc++ code from the compiler, newlib, and code from the libraries made under the RCE project in the repository. Our ld script sets its starting location to 0x2000 (just after the exception vectors). Its size depends on the total size of the code and data emitted by the compiler.
RTEMS workspace
The size and location of this region are determined by the BSP together with choices made in the RTEMS configuration; see chapter 23 of the RTEMS C User's Guide. In our builds the configuration is in one of two files, rtems_config_prod.cc and rtems_config_devel.cc; the former doesn't configure the RTEMS shell and the latter does.
A lot of the storage for RTEMS-provided service objects such as semaphores is allocated in the workspace, which in RTEMS 4.9 is always separate from the general-purpose heap used by the application code. Our RTEMS configurations specify fixed allocations for these objects. It's possible to make them adjustable but, for tasks at least, the code that maintains our system log requires a fixed limit. There are also always-fixed limits for the number of device drivers and the number of open files. I believe that directory nodes for the in-memory filesystem are allocated in the workspace as well.
By default RTEMS will allocate task stacks in the workspace. It sets aside the minimum stack space per task times the maximum number of tasks, but you can add an adjustment to this if you know your application will need more stack space.
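The stack reservation described above amounts to a simple product plus an adjustment. The following is only a sketch of that arithmetic; the function name and its parameters are hypothetical and are not part of RTEMS or of <confdefs.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an RTEMS API): estimates the stack space that
 * the default configuration reserves in the workspace, i.e. the minimum
 * stack size times the maximum number of tasks, plus any extra amount
 * requested in the configuration. */
size_t workspace_stack_bytes(size_t max_tasks,
                             size_t min_stack_bytes,
                             size_t extra_stack_bytes)
{
    return max_tasks * min_stack_bytes + extra_stack_bytes;
}
```

For example, 20 tasks with an 8K minimum stack and a 256K adjustment would reserve 20*8K + 256K bytes.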
Normally RTEMS will calculate the size of the workspace based on the various allocations made in the configuration, but in the configuration you can override that by giving the total workspace size. Normally RTEMS will locate the workspace in high memory but your configuration can specify the starting address provided you know it at compile time. All the information in the configuration file gets compiled into a global structure.
Our BSP puts the workspace right after the core. In bspstart.c:

    #define INIT_STACK_SIZE 0x10000

    void bsp_start( void )
    {
      extern uint8_t __bsp_ram_start[];
      extern uint8_t __bsp_ram_end[];
      uint32_t addr;

      BSP_output_char = rce_outchar_to_memory;

      addr = CPU_UP_ALIGN((uint32_t)__bsp_ram_start);

      /* Assign different chunks of memory : */

      /* work-space area */
      Configuration.work_space_start = (char*)addr;
      addr += Configuration.work_space_size;

      /* Interrupt/exception stack; at the same time
       * initialize exceptions.
       */
      ppc_exc_initialize(
        PPC_INTERRUPT_DISABLE_MASK_DEFAULT,
        addr,
        rtems_configuration_get_interrupt_stack_size());
      addr += rtems_configuration_get_interrupt_stack_size();

      /* reserve init stack */
      bsp_heap_end = (uint32_t)__bsp_ram_end - INIT_STACK_SIZE;

      /* rest for the heap */
      bsp_heap_start = addr;
    }

where I've cut out the parts not relevant to memory layout. __bsp_ram_start is a symbol set by our ld script to be the first unused address after the core; __bsp_ram_end is set by the script to 0x07f00000, where the system log area begins.
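To make the resulting boundaries easier to check by hand, the address arithmetic of bsp_start() can be restated as a plain function. This is a sketch only: the struct and function names are hypothetical, the linker symbols and configuration values are passed in as ordinary numbers, and the start address is assumed to be aligned already.

```c
#include <assert.h>
#include <stdint.h>

#define INIT_STACK_SIZE 0x10000u   /* same value as in bspstart.c */

/* Hypothetical record of the computed region boundaries. */
struct rce_layout {
    uint32_t workspace_start;
    uint32_t interrupt_stack_start;
    uint32_t heap_start;
    uint32_t heap_end;             /* also where the Init stack begins */
};

/* Mirrors the arithmetic in bsp_start(): workspace first, then the
 * interrupt stack, then the heap up to the reserved Init stack. */
struct rce_layout rce_compute_layout(uint32_t ram_start, uint32_t ram_end,
                                     uint32_t workspace_size,
                                     uint32_t interrupt_stack_size)
{
    struct rce_layout l;
    l.workspace_start = ram_start;
    l.interrupt_stack_start = l.workspace_start + workspace_size;
    l.heap_start = l.interrupt_stack_start + interrupt_stack_size;
    l.heap_end = ram_end - INIT_STACK_SIZE;
    assert(l.heap_start < l.heap_end);   /* something must be left for the heap */
    return l;
}
```

With an assumed core ending at 0x00100000, a 512K workspace and an 8K interrupt stack, the heap would start at 0x00182000 and end at 0x07ef0000.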
Interrupt stack
As you can see from the code for function bsp_start() in the section on the RTEMS workspace, the interrupt stack is allocated right after the workspace. The size is set in the RTEMS configuration file; if you don't set the macro BSP_INTERRUPT_STACK_SIZE you'll get the value of CONFIGURE_MINIMUM_TASK_STACK_SIZE. If you didn't set a value for that you'll get the value of CPU_STACK_MINIMUM_SIZE, which is 8K for PowerPC.
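The fallback chain for the interrupt stack size can be pictured as a series of preprocessor defaults. This is a simplified sketch, not the actual text of <confdefs.h>; the value of CPU_STACK_MINIMUM_SIZE is defined here only so the fragment compiles outside RTEMS, and the accessor function is hypothetical.

```c
#include <assert.h>

/* Value quoted in the text for PowerPC; normally supplied by RTEMS. */
#ifndef CPU_STACK_MINIMUM_SIZE
#define CPU_STACK_MINIMUM_SIZE (8 * 1024)
#endif

/* If the configuration gives no minimum task stack size, use the CPU's. */
#ifndef CONFIGURE_MINIMUM_TASK_STACK_SIZE
#define CONFIGURE_MINIMUM_TASK_STACK_SIZE CPU_STACK_MINIMUM_SIZE
#endif

/* If no interrupt stack size is given, use the minimum task stack size. */
#ifndef BSP_INTERRUPT_STACK_SIZE
#define BSP_INTERRUPT_STACK_SIZE CONFIGURE_MINIMUM_TASK_STACK_SIZE
#endif

/* Hypothetical accessor for the value finally chosen. */
unsigned long effective_interrupt_stack_size(void)
{
    return BSP_INTERRUPT_STACK_SIZE;
}
```

With none of the macros set beforehand, the chain falls all the way through to the 8K CPU minimum.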
Heap
As you can see from the code for function bsp_start() in the section on the RTEMS workspace, the heap is allocated right after the interrupt stack and runs right up to the Init stack, which sits just below the system log area.
Init stack
This is the stack allocated for the first task, the Init task, created and run by RTEMS. It is allocated by our bsp_start() function just underneath the system log area and given a fixed size of 0x10000 (64K). Other tasks will have their stacks allocated in the workspace since we don't specify a stack allocation function in the RTEMS configuration.
System log area
Our ld script sets __bsp_ram_end to 1 MB below the true end of RAM (__phy_ram_end == 128 MB). The BSP leaves that upper 1 MB free; it gets used by the RCE debugging package in /rce/debug/src/Manager.cc. The first 256 KB are used as a circular buffer for printout that would have come out on the operator's console if there were one. The second 256 KB are unused. The last 512 KB are used for a circular buffer of exception information updated whenever a task causes a hardware exception or throws a C++ exception that isn't caught. In both those cases a message is also added to the printout buffer.
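The partition of the 1 MB log area described above can be written down as offsets from __bsp_ram_end. The macro and function names below are hypothetical and are not taken from Manager.cc; only the sizes and the base address come from the text.

```c
#include <assert.h>
#include <stdint.h>

#define KB 1024u

#define SYSLOG_BASE             0x07f00000u   /* value of __bsp_ram_end */
#define SYSLOG_SIZE             (1024u * KB)  /* upper 1 MB of the 128 MB RAM */

/* Hypothetical names for the three sub-areas described in the text. */
#define SYSLOG_CONSOLE_OFFSET   0u
#define SYSLOG_CONSOLE_SIZE     (256u * KB)   /* console printout buffer */
#define SYSLOG_UNUSED_OFFSET    (SYSLOG_CONSOLE_OFFSET + SYSLOG_CONSOLE_SIZE)
#define SYSLOG_UNUSED_SIZE      (256u * KB)   /* currently unused */
#define SYSLOG_EXCEPTION_OFFSET (SYSLOG_UNUSED_OFFSET + SYSLOG_UNUSED_SIZE)
#define SYSLOG_EXCEPTION_SIZE   (512u * KB)   /* exception information buffer */

/* Start address of the exception-information buffer. */
uint32_t syslog_exception_base(void)
{
    return SYSLOG_BASE + SYSLOG_EXCEPTION_OFFSET;
}

/* First address past the log area; should equal the end of physical RAM. */
uint32_t syslog_end(void)
{
    return SYSLOG_BASE + SYSLOG_EXCEPTION_OFFSET + SYSLOG_EXCEPTION_SIZE;
}
```

The three pieces add up to exactly 1 MB, so the area ends at 0x08000000 (128 MB).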
RTEMS 4.9 configuration options that affect memory layout
What's available
- CONFIGURE_MALLOC_BSP_SUPPORTS_SBRK. Define this if the C heap is expandable and the BSP supplies an sbrk() function that malloc(), etc., can use when heap space runs out.
- CONFIGURE_EXECUTIVE_RAM_WORK_AREA. The starting address of the RTEMS workspace in RAM. By default this is NULL, which causes the BSP to decide where best to place the work area.
- CONFIGURE_EXECUTIVE_RAM_SIZE. If you know how big you want the RTEMS work area to be then you can bypass the usual calculation by supplying the size here.
- CONFIGURE_MINIMUM_STACK_SIZE. The stack size (bytes) RTEMS will allocate for a task/thread if you ask for the minimum. By default this is set to the minimum size recommended for the kind of CPU you're using (8K for PowerPC).
- CONFIGURE_TASK_STACK_ALLOCATOR and CONFIGURE_TASK_STACK_DEALLOCATOR. Pointer-to-function values for the user routines that allocate and deallocate task stacks. The default is NULL, which means that stacks will be (de)allocated from the RTEMS workspace. The allocator's prototype is void *(*)( uint32_t ), the deallocator's is void (*)(void*).
- CONFIGURE_MEMORY_OVERHEAD. The number of kilobytes that should be added to the workspace size calculated by <confdefs.h>. The default is zero.
- CONFIGURE_EXTRA_TASK_STACKS. The number of bytes to add to the task stack space allocation calculated by <confdefs.h> (assuming you haven't set CONFIGURE_TASK_STACK_ALLOCATOR). The default is zero. <confdefs.h> calculates something like CONFIGURE_MAXIMUM_TASKS*(CONFIGURE_MINIMUM_STACK_SIZE).
- CONFIGURE_IDLE_TASK_STACK_SIZE. The size in bytes of the stack for the idle task. The default is the configured minimum stack size for ordinary tasks.
- CONFIGURE_INIT_TASK_STACK_SIZE. Overrides the default Init task stack size.
- BSP_INTERRUPT_STACK_SIZE. Overrides the default interrupt stack size (which is the configured minimum task stack size).
What we use
Our rce405 BSP sets CONFIGURE_EXTRA_TASK_STACKS to (256*1024). For some reason it doesn't use CONFIGURE_INIT_TASK_STACK_SIZE but hard-codes an Init stack size of 64K. CONFIGURE_EXECUTIVE_RAM_WORK_AREA is not set in our rtems_config_*.cc files, but the configuration variable it sets is altered by our bsp_start() routine to be right after the end of the loaded core.
RTEMS 4.10
Additional configuration options.
- CONFIGURE_UNIFIED_WORK_AREAS. Setting this will cause a single contiguous region to be used for both the heap and the RTEMS workspace.
Layout for Gen I hardware
Region type | Starting address | Size | TLB entries | PID | Writable | Executable | Cached | Contents |
---|---|---|---|---|---|---|---|---|
Executable code | 0x00000000 | 8M | 2 | 0 | N | Y | Y | Exception vectors, .text for core and modules |
Read-write data | 0x00800000 | 8M | 2 | 0 | Y | N | Y | .data+.bss for core and modules |
Read-only data | 0x01000000 | 8M | 2 | 0 | N | N | Y | .rodata for core and modules |
I/O buffers | 0x01800000 | 48M | 12 | 0 | Y | N | N | DMA targets for protocol plugins |
System log | 0x04800000 | 4M | 1 | 1 | Y | N | N | Circular message buffer |
General workspace | 0x04c00000 | 52M | 13 | 0 | Y | N | Y | RTEMS workspace, heap, Init stack, interrupt stack |
In this layout real address == virtual address.
Each type of region has properties set using the PPC405 Memory Management Unit (MMU). All the region descriptors must be held on board the MMU in its Translation Lookaside Buffer (TLB), which has space for 64 entries. Given that limit, the restrictions on the TLB format and the fact that we have 128 MB of memory, we need to reserve one TLB entry for each 4 MB of memory, i.e., use an MMU page size of 4 MB. Covering all 128 MB this way takes 32 of the 64 entries.
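The TLB entry counts in the table follow directly from the 4 MB page size. The sketch below just redoes that division; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MB            (1024u * 1024u)
#define RCE_PAGE_SIZE (4u * MB)   /* MMU page size chosen for Gen I */
#define TLB_CAPACITY  64u         /* entries available in the PPC405 TLB */

/* Number of 4 MB TLB entries needed to map a region, rounding up so that
 * a partly used page still gets its own entry. */
unsigned tlb_entries_for(uint32_t region_bytes)
{
    return (region_bytes + RCE_PAGE_SIZE - 1u) / RCE_PAGE_SIZE;
}
```

For instance, the 48 MB I/O buffer region needs 12 entries and the 52 MB general workspace needs 13, matching the table.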
Each TLB entry contains a process ID (PID) field that is matched against the contents of the PID register in the processor. A zero in this TLB field means that the PID register doesn't matter; the region is always accessible. A non-zero value must match the PID register contents exactly in order for the region to be accessible. We will arrange for the PID register to contain 1 only when running the routines that update the system log.
Note that we have the I/O buffers and the general workspace bounded by regions that are either read-only or normally inaccessible.
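The PID rule described above reduces to a two-way test. The function below is a hypothetical software model of that rule for illustration, not code that runs on the MMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models the PID check for one TLB entry: a zero PID field in the entry
 * matches any PID register value; a non-zero field must match exactly. */
bool tlb_pid_matches(uint8_t entry_pid, uint8_t pid_register)
{
    return entry_pid == 0 || entry_pid == pid_register;
}
```

Under this rule the system log region (PID field 1) is reachable only while the PID register holds 1, whereas every other region (PID field 0) stays reachable regardless.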
Subdividing the general workspace
It seems best to use the unified RTEMS workspace and heap, letting RTEMS allocate task stacks in this region. When checking the size of this unified region RTEMS assumes that it occupies the top of memory, so the simplest layout seems to be, in order of increasing addresses:
Use | Starting address | Size |
---|---|---|
Interrupt stack | 0x04c00000 | 8K |
Init stack | 0x04c02000 | 64K |
Unified RTEMS workspace + heap | 0x04c12000 | 51M + 952K |
If we do this and move the system log area to beneath the general workspace region then we don't need to use the trick of setting __bsp_ram_end different from __phy_ram_end.
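The starting addresses and the odd-looking "51M + 952K" size in the subdivision table can be rederived from the region base and the two stack sizes. The names below are hypothetical; the numbers come from the tables.

```c
#include <assert.h>
#include <stdint.h>

#define KB (1024u)
#define MB (1024u * 1024u)

#define GENERAL_REGION_BASE  0x04c00000u
#define GENERAL_REGION_SIZE  (52u * MB)
#define INTERRUPT_STACK_SIZE (8u * KB)
#define INIT_STACK_SIZE      (64u * KB)

/* The Init stack follows the interrupt stack. */
uint32_t init_stack_base(void)
{
    return GENERAL_REGION_BASE + INTERRUPT_STACK_SIZE;
}

/* The unified workspace + heap follows the Init stack ... */
uint32_t unified_area_base(void)
{
    return init_stack_base() + INIT_STACK_SIZE;
}

/* ... and takes whatever is left of the general workspace region. */
uint32_t unified_area_size(void)
{
    return GENERAL_REGION_SIZE - INTERRUPT_STACK_SIZE - INIT_STACK_SIZE;
}
```

The unified area then ends at 0x08000000, the top of the 128 MB of memory, which is what RTEMS expects.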
RTEMS configuration settings
    #define RCE_CODE_REGION_BASE    (0x00000000)
    #define RCE_RODATA_REGION_BASE  (0x01000000)
    #define RCE_RWDATA_REGION_BASE  (0x00800000)
    #define RCE_BUFFER_REGION_BASE  (0x01800000)
    #define RCE_SYSLOG_REGION_BASE  (0x04800000)
    #define RCE_GENERAL_REGION_BASE (0x04c00000)

    #define CONFIGURE_UNIFIED_WORK_AREAS
    #define CONFIGURE_INIT_TASK_STACK_SIZE (64*1024)
    #define BSP_INTERRUPT_STACK_SIZE (8*1024)
    #define CONFIGURE_EXECUTIVE_RAM_WORK_AREA (RCE_GENERAL_REGION_BASE + BSP_INTERRUPT_STACK_SIZE + CONFIGURE_INIT_TASK_STACK_SIZE)
    #define CONFIGURE_TASK_STACK_ALLOCATOR RCE::service::allocateTaskStack
    #define CONFIGURE_TASK_STACK_DEALLOCATOR RCE::service::deallocateTaskStack
If we want truly dynamic allocation of task stacks then we have to configure an allocator and a deallocator, otherwise RTEMS will allocate a fixed chunk of memory to contain all task stacks and we would have to play with CONFIGURE_EXTRA_TASK_STACKS.
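A pair of routines with the prototypes RTEMS 4.9 expects, void *(*)( uint32_t ) and void (*)(void*), might look like the sketch below. The function names are hypothetical C stand-ins for the RCE::service routines named in the settings, whose real implementation isn't shown here, and drawing stacks from malloc() is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator: takes the requested stack size in bytes and
 * returns the stack memory, or NULL if none is available. Here the
 * memory simply comes from the C heap. */
void *rce_allocate_task_stack(uint32_t stack_size)
{
    return malloc(stack_size);
}

/* Hypothetical deallocator: returns a stack obtained from the allocator
 * above to the C heap. */
void rce_deallocate_task_stack(void *stack)
{
    free(stack);
}
```

With unified work areas the C heap and the workspace are the same pool, so the practical gain of such a pair is that stack space is claimed per task at creation time instead of being reserved up front.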
Alternate layout for Gen I
The current version of the dynamic linker treats each module as an indivisible image which is never moved once read into memory. Until we change that we'll have to organize memory in a different fashion. Again, virtual address == real address:
Region type | Starting address | Size | TLB entries | PID | Writable | Executable | Cached | Contents |
---|---|---|---|---|---|---|---|---|
Core | 0x00000000 | 8M | 2 | 0 | Y | Y | Y | Exception vectors, text and data for the core |
Modules | 0x00800000 | 16M | 4 | 0 | Y | Y | Y | Text and data for modules |
I/O buffers | 0x01800000 | 48M | 12 | 0 | Y | N | N | DMA targets for protocol plugins |
System log | 0x04800000 | 4M | 1 | 1 | Y | N | N | Circular message buffer |
General workspace | 0x04c00000 | 52M | 13 | 0 | Y | N | Y | RTEMS workspace, heap, Init stack, interrupt stack |
Layout for Gen II
The MMU for the PPC440 processor also can store 64 TLB entries internally, but a page size of 4 MB is not available. At the high end we can have page sizes of 1 MB, 16 MB and 256 MB. If we want to be able to cover all 4 GB of memory we'll have to use a mixture of page sizes rather than just one as on the PPC405. The regions for core and module code should be small compared to those for I/O buffers and for general working storage so we can use 16 MB pages for the former and 256 MB pages for the latter.
If we have enough TLB entries left over we could allocate stacks using a mixture of smaller page sizes with unmapped regions between stacks. Since we don't know ahead of time how large any stack will be we'll have to defer the creation of its TLB entries until the time the stack is allocated.
Region type | Starting address | Size | Page size | TLB entries | PID | Writable | Executable | Cached | Contents |
---|---|---|---|---|---|---|---|---|---|
Executable code | 0x00000000 | 256M | 256M | 1 | 0 | N | Y | Y | Exception vectors, .text for core and modules |
Read-write data | 0x10000000 | 256M | 256M | 1 | 0 | Y | N | Y | .data+.bss for core and modules |
Read-only data | 0x20000000 | 256M | 256M | 1 | 0 | N | N | Y | .rodata for core and modules |
Stacks | 0x30000000 | 256M | 1M | Up to 50 | 0 | Y | Y | N | Interrupt, Init and task stacks |
I/O buffers | 0x40000000 | 1G+512M | 256M | 6 | 0 | Y | N | N | DMA targets for protocol plugins |
Unmapped region | 0xa0000000 | 256M | - | 0 | - | - | - | - | |
General workspace | 0xb0000000 | 1G+256M | 256M | 5 | 0 | Y | N | Y | RTEMS workspace, heap |
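As a quick check on the Gen II table, the TLB entries can be summed against the 64-entry capacity. The struct, array and function names below are hypothetical; the counts are copied from the table, taking the stack region at its stated maximum of 50.

```c
#include <assert.h>
#include <stddef.h>

#define TLB_CAPACITY 64u   /* entries available in the PPC440 TLB */

/* Hypothetical record: one row of the Gen II table. */
struct gen2_region {
    const char *name;
    unsigned    tlb_entries;
};

const struct gen2_region gen2_regions[] = {
    { "Executable code",    1u },
    { "Read-write data",    1u },
    { "Read-only data",     1u },
    { "Stacks",            50u },  /* upper bound ("up to 50") */
    { "I/O buffers",        6u },
    { "Unmapped region",    0u },
    { "General workspace",  5u },
};

/* Total number of TLB entries the layout can consume. */
unsigned gen2_tlb_total(void)
{
    unsigned total = 0u;
    for (size_t i = 0; i < sizeof gen2_regions / sizeof gen2_regions[0]; ++i)
        total += gen2_regions[i].tlb_entries;
    return total;
}
```

The fixed regions use 14 entries, so the limit of 50 on stack entries is exactly what fills the TLB.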