Debugging the DAT on an RCE

Debugging code on an embedded system is a difficult problem. The basic features of embedded environments (isolation, limited code space) usually preclude having a fully implemented shell with the entire development tool chain present on the dedicated processor. Many commercial vendors (VxWorks, Xilinx) include remote debuggers usable over various connection protocols, but for RTEMS there was no good way to run debuggers on a large cluster of embedded processors. In this document we discuss the two debugging options available for RCEs: XMD and the GDB stub.

Xilinx Microprocessor Debugger (XMD)

XMD (Xilinx Microprocessor Debugger) is a low-level (assembly language) debugger provided by Xilinx as part of a product called the Embedded Development Kit (EDK). Documentation is in the Embedded System Tools Reference Manual (UG111) (http://xgoogle.xilinx.com/search?q=Embedded+System+Tools+Reference+Manual&btnG=New+Search&getfields=*&numgm=5&filter=0&proxystylesheet=support&client=support&output=xml_no_dtd&oe=UTF-8&ie=UTF-8&getfields=*&show_dynamic_navigation=1&num=1000&submit=Search&lang2search=&ud=1&exclude_apps=1&site=Documentation). XMD communicates with the PowerPC System-On-Chip over a JTAG interface via a host computer's USB system. Much of its capability comes from its ability to inject instructions into the processor over JTAG (interestingly, these instructions can be recorded to text files and disassembled with some effort). XMD and the EDK are installed on SLAC AFS (e.g., /afs/slac/g/reseng/xilinx).

The main features of XMD that the DAT group has used are:

Download/control PowerPC ELF executables
Control XMD via GDB
Read/Write memory (both off-chip and integrated on-chip FPGA Block RAM)
Read/Write PowerPC cache/tags
Read/Write PowerPC registers
Read/Write registers on the DCR bus

However, the XMD application has some deficiencies:

It is not a full source code debugger
It requires a USB cable and JTAG dongle to talk to the target
Thread-related commands, e.g., "info thread" or "thread 1", don't work

Here's an example XMD session:

user@rdcds104>> xmd 
...
XMD% stop
XMD% rst -processor
Target reset successfully

XMD% dow /reg/lab1/home/panetta/dat/foo_install/bin/ppc-rtems-rce405-dbg/core.2.0.devel
System Reset .... DONE
Downloading Program -- /reg/lab1/home/panetta/dat/foo_install/bin/ppc-rtems-rce405-dbg/core.2.0.devel
        section, .init: 0x0007c600-0x0007c63f
        section, .text: 0x0007c640-0x001f6be3
        section, .fini: 0x001f6be4-0x001f6c03
        section, .got: 0x00249710-0x0024971f
        section, .interp: 0x00002000-0x00002010
        section, .dynamic: 0x00002014-0x0000208b
        section, .hash: 0x0000208c-0x0000df9f
        section, .dynsym: 0x0000dfa0-0x0002db9f
        section, .dynstr: 0x0002dba0-0x0007c5fe
        section, .rodata: 0x001f6c08-0x00214263
        section, .sbss2: 0x00214264-0x00214264
        section, .eh_frame: 0x00214268-0x0023dabf
        section, .data: 0x0023dac0-0x002408a3
        section, .gcc_except_table: 0x002408a4-0x0024963b
        section, .got2: 0x0024963c-0x00249657
        section, .ctors: 0x00249658-0x002496c3
        section, .dtors: 0x002496c4-0x0024970f
        section, .jcr: 0x00249720-0x00249723
        section, .sdata: 0x00249728-0x00249aaf
        section, .sbss: 0x00249ab0-0x00249ec4
        section, .bss: 0x00249ee0-0x00255d23
Setting PC with Program Start Address 0x0007c640

XMD% con
Info:Processor started. Type "stop" to stop processor

RUNNING> XMD%

The four commands we used above are:

stop — Stop the processor
rst -processor — Reset the processor
dow <file> — Download an ELF executable to the board and prepare it for running
con — Continue (start or resume) execution
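
The same sequence can be kept as text and pasted at the XMD% prompt. The following is a minimal sketch in Python that only assembles the four commands shown above; the function name and the placeholder ELF path are illustrative assumptions, not part of the DAT tooling.

```python
def xmd_download_script(elf_path):
    """Return the XMD command sequence from the example session as text."""
    commands = [
        "stop",             # stop the processor
        "rst -processor",   # reset the processor
        f"dow {elf_path}",  # download the ELF executable
        "con",              # start execution
    ]
    return "\n".join(commands) + "\n"


# The path below is a placeholder, not a real installation path.
print(xmd_download_script("path/to/core.devel"), end="")
```

Keeping the sequence in one place makes it harder to forget the reset before a download.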

Other useful XMD commands are:

rrd — Dump the r* registers
srrd — Dump the system registers (pc, msr, ctr, etc.)
rwr <register> <word> — Write a register by name
mrd <address> [num] [w|h|b] — Memory Read (default: 'w'ord)
mwr <address> <values> [<num> <w|h|b>] — Memory Write (default: 'w'ord)

When connecting to the PowerPC, XMD must be told which memory ranges to map onto memory/DCR/cache. We have been doing this with an initialization file that XMD executes on startup. This file may either be called "xmd.ini" and placed in the current directory, or be the user-level file "~/.xmdrc".

.xmdrc

connect ppc hw -debugdevice isocmstartadr 0xFFFFF000 isocmsize 4096 isocmdcrstartadr 0x0000000 icachestartadr 0x10000000 itagstartadr 0x20000000 dcachestartadr 0x30000000 dtagstartadr 0x40000000 dcrstartadr 0x50000000

This will connect through the primary USB cable. To use a different USB cable, add the option -cable type xilinx_platformusb port usb2[#], where [#] is the USB2 port number you wish to use (see Xilinx & JTAG tools for more details).
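
The connect line is long and easy to mistype, so it can help to keep the address map as data and generate the line from it. The sketch below, in Python, reproduces the .xmdrc line above from a dictionary; the names DEBUG_DEVICE_OPTIONS and connect_line are hypothetical, and the values are copied verbatim from the line above rather than checked against the hardware.

```python
# Address map copied from the .xmdrc line; insertion order is preserved.
DEBUG_DEVICE_OPTIONS = {
    "isocmstartadr": "0xFFFFF000",
    "isocmsize": "4096",
    "isocmdcrstartadr": "0x0000000",
    "icachestartadr": "0x10000000",
    "itagstartadr": "0x20000000",
    "dcachestartadr": "0x30000000",
    "dtagstartadr": "0x40000000",
    "dcrstartadr": "0x50000000",
}


def connect_line(options):
    """Build the XMD connect command from name/value pairs."""
    pairs = " ".join(f"{name} {value}" for name, value in options.items())
    return f"connect ppc hw -debugdevice {pairs}"


print(connect_line(DEBUG_DEVICE_OPTIONS))
```

The cable option is deliberately left out, since its position in the command is not shown here.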

GDB over XMD

We may use GDB to connect to XMD through TCP port 1234, giving us a full source-code debugger. The limitation is that GDB connected in this way cannot see the RTEMS task (thread) list.

$ powerpc-rtems4.10-gdb /reg/lab1/home/panetta/dat/foo_install/bin/ppc-rtems-rce405-dbg/core.2.0.devel
...
(gdb) target remote localhost:1234
40      ../../../../../../../../RTEMS/c/src/lib/libbsp/powerpc/virtex4/dlentry/dlentry.S: No such file or directory.
        in ../../../../../../../../RTEMS/c/src/lib/libbsp/powerpc/virtex4/dlentry/dlentry.S
Current language:  auto; currently asm
(gdb) break init_executive
Breakpoint 1 at 0x7d488: file /reg/lab1/home/panetta/dat/core/rceapp/core/devel.cc, line 82.
(gdb) c
Continuing.

Breakpoint 1, init_executive () at /reg/lab1/home/tether/source/checkout/trunk/release/rceapp/core/devel.cc:82
82            printv("RCE core %d.%d.%s", majorVersion, minorVersion, branch.c_str());
(gdb) info thread
warning: RMT ERROR : failed to get remote thread list.
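
If "target remote" fails, a quick first check is whether anything is listening on the port at all. The following Python sketch tests for a TCP listener; the function name is hypothetical, and opening and closing a connection may itself be noticed by the debug server, so treat it as a diagnostic to run before starting GDB rather than during a session.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable host, or name lookup failure.
        return False
```

For the session above the call would be port_is_open("localhost", 1234).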

For a more complete debugger, with full RTEMS thread and task support, we can connect GDB to the RCE over TCP/IP, as described below.

RTEMS GDB stub

Till Straumann of SSRL wrote a nice GDB stub to enable remote debugging of an RTEMS system over a local network. This code has been ported to the RCE system and methods have been developed to use it in conjunction with the RCE dynamic linker developed by Steve Tether.

Requirements

The implementation developed at SSRL requires some patches to GDB in order to run. This patched version of GDB is available via the DAT group's AFS space in /afs/slac/g/cci/package/gnu or in /reg/common/package/gnu/rtems-4.10/bin if AFS is not available. Also, the core code on the RCE must have the GDB stub compiled in and started automatically. This is true as of core 1.3 (commit 1135). Note: the GDB daemons only run on the development core, not in the production core.

Starting out

The GDB process needs to be run from a machine with unrestricted access to the RCE rack. This means that one must run from a machine on the 172.21.6.XXX network, such as rdcds104, or an atca equivalent. This requirement is expected to change with the rationalization of the networking in Lab 1.
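
Whether a given host address qualifies can be checked mechanically. The Python sketch below assumes that "172.21.6.XXX" means the /24 network 172.21.6.0/24; that prefix length, the function name and the sample addresses are assumptions made for illustration.

```python
import ipaddress

# Assumption: 172.21.6.XXX is read as the /24 network 172.21.6.0/24.
RCE_RACK_NETWORK = ipaddress.ip_network("172.21.6.0/24")


def on_rce_network(address):
    """Return True if the dotted-quad address lies inside the RCE rack network."""
    return ipaddress.ip_address(address) in RCE_RACK_NETWORK


# Sample addresses are made up for the demonstration.
for sample in ("172.21.6.10", "172.21.7.10"):
    print(sample, on_rce_network(sample))
```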

Create two windows, one on the GDB host, and one for telnetting to the RCE.

Start GDB against the version of the development core that is on the RCE:

powerpc-rtems4.10-gdb build/rceapp/bin/ppc-rtems-rce405-dbg/core.1.2.devel

Connect GDB to the RCE over port 2159. (Ignore the warning about the '.text' section not matching the target's; Steve says this is expected.)

(gdb) target rtems-remote rce48:2159
Remote debugging using rce48:2159
[New Thread  a01000a]
[Switching to Thread  a01000a]
BREAKPOINT () at /reg/lab1/home/panetta/petacache/rce/gdbstub/rtems-gdb-stub-ppc-shared.h:20
20      }
warning: /reg/lab1/home/panetta/petacache/build/rceapp/bin/ppc-rtems-rce405-dbg/core.1.2.devel: 
'.text' section of executable file doesn't match the target's -- do GDB and the target use the same file?
Current language:  auto; currently c++
(gdb)

At this point, we're sitting in the GDBh thread/task on the RCE, which is stopped while we are connected to it. The thread list (as printed by info threads) looks like this:

  8 Thread  a010009 ('GDBd'  PRI:  20 STATE: ready)  
  7 Thread  a010008 ('TNTD'  PRI:  50 STATE: BLOCKED -  evt)  
  6 Thread  a010006 ('ntwk'  PRI:  80 STATE: ready)  
  5 Thread  a010005 ('ehf0'  PRI:  70 STATE: BLOCKED -  evt)  
  4 Thread  a010004 ('ehr0'  PRI:  80 STATE: BLOCKED -  evt)  
  3 Thread  a010003 ('ehc0'  PRI:  80 STATE: ready)  
  2 Thread  9010001 ('IDLE'  PRI: 255 STATE: ready)  
* 1 Thread  a01000a ('GDBh'  PRI: 200 STATE: stopped - SIGINT)
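
When a session is logged, the thread list can be turned into structured records for searching or comparison. The Python sketch below parses one line of the listing above; the regular expression is written against exactly this output format, which is an assumption about the patched GDB, and the names ThreadInfo and parse_thread_line are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ThreadInfo(NamedTuple):
    number: int     # GDB thread number
    rtems_id: str   # RTEMS object id, hex without 0x
    name: str       # four-character RTEMS task name
    priority: int
    state: str
    current: bool   # True for the line marked with '*'


_THREAD_RE = re.compile(
    r"^(\*?)\s*(\d+)\s+Thread\s+([0-9a-fA-F]+)\s+"
    r"\('(\w+)'\s+PRI:\s*(\d+)\s+STATE:\s*(.*?)\)\s*$"
)


def parse_thread_line(line) -> Optional[ThreadInfo]:
    """Parse one thread-list line; return None if the line does not match."""
    m = _THREAD_RE.match(line)
    if m is None:
        return None
    star, number, rtems_id, name, priority, state = m.groups()
    # Collapse runs of spaces inside the state, e.g. "BLOCKED -  evt".
    state = " ".join(state.split())
    return ThreadInfo(int(number), rtems_id, name, int(priority), state, star == "*")
```

Applied to the last line of the listing, this yields thread 1, name GDBh, priority 200, state "stopped - SIGINT", marked as current.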

We can now use breakpoints that are defined in the core development code (such as inside the telnet protocol, or in one of the shell commands.) If we have a task available, we can run it from the shell using runTask. However, as the symbols in this task aren't in memory yet, we cannot yet set a breakpoint inside that task.

Breakpoints in dynamically linked libraries

To set breakpoints in these libraries, we need a bootstrap procedure. Here is one example of a GDB macro which will bootstrap the breakpoint:

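define lmbreak
  dont-repeat
  # First argument: module file to read symbols from.
  # Second argument: function to break on.
  set $lmbreak_file  = "$arg0"
  set $lmbreak_point = "$arg1"
  # Stop when the dynamic linker is about to run a loaded module.
  break RCE::ELF::RunnableModule::run
  commands
    silent
    # Write the add-symbol-file and break commands to a temporary script.
    set logging file /tmp/lmbreak
    set logging on
    printf "add-symbol-file %s 0x%x\n",$lmbreak_file, this + this->textOffset()
    printf "break %s\n",$lmbreak_point
    set logging off
    # Run the generated script, remove it, and resume the target.
    source /tmp/lmbreak
    shell rm -f /tmp/lmbreak
    cont
  end
end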
document lmbreak
Break in an RCE loadable module at a specified function.
The first argument is the module file used for symbols; the second is the function to break on.
Example: 
  lmbreak build/quarks/mod/ppc-rtems-rce405-dbg/testQuarks.1.0.main.so quarks::service::Logger::initLogging
end

This definition should be placed in the user's .gdbinit so that it is loaded when GDB starts.

So, to set a breakpoint inside a test program we'll be loading from testQuarks:

(gdb) lmbreak build/quarks/mod/ppc-rtems-rce405-dbg/testQuarks.1.0.main.so quarks::service::Logger::initLogging
Breakpoint 1 at 0x6e618: file elf/RunnableModule.cc, line 17.
(gdb) continue

This says: set the breakpoint quarks::service::Logger::initLogging, using build/quarks/mod/ppc-rtems-rce405-dbg/testQuarks.1.0.main.so for the symbol definitions. Note that the initLogging breakpoint is not set yet, because the module's symbols are not loaded on the RCE at this point. What has been set is a breakpoint in our dynamic linker's run() method, and that is where the bootstrap happens: when run() is hit, the attached commands load the symbols at the module's actual address and then set the requested breakpoint.
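
To make the bootstrap concrete, the Python sketch below produces the same two-line GDB script that the macro's printf statements write to /tmp/lmbreak. The function name is hypothetical and the text address passed in is only an example value; in the macro it is computed on the target as this + this->textOffset() when the run() breakpoint is hit.

```python
def bootstrap_script(symbol_file, text_addr, breakpoint):
    """Return the GDB commands that lmbreak generates for one module load."""
    return (
        f"add-symbol-file {symbol_file} 0x{text_addr:x}\n"
        f"break {breakpoint}\n"
    )


print(
    bootstrap_script(
        "build/quarks/mod/ppc-rtems-rce405-dbg/testQuarks.1.0.main.so",
        0x78CCE00,  # example address only; it changes on every load
        "quarks::service::Logger::initLogging",
    ),
    end="",
)
```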

From a telnet session on the RCE, assuming the proper directories are mounted, we can now run the task:

SHLL [/] # runTask -N QQQQ /build/quarks/mod/ppc-rtems-rce405-dbg/testQuarks.1.0.main.so
1: runTask -N QQQQ /build/quarks/mod/ppc-rtems-rce405-dbg/testQuarks.1.0.main.so
runTask loaded the task to 0x78c6e00.

The task is now loaded and the shell prompt has returned, but the QQQQ task is suspended:

SHLL [/] # task 
3: task
  ID       NAME           PRI  STATE MODES   EVENTS    WAITID  WAITARG  NOTES
------------------------------------------------------------------------------
0a010003   ehc0            80 READY  P:T:nA    NONE   08424840 0x1fa088 
0a010004   ehr0            80 READY  P:T:nA    NONE   1a010013 0x1fa088 
0a010005   ehf0            70 Wevnt  P:T:nA    NONE                     
0a010006   ntwk            80 READY  P:T:nA    NONE   1a010013 0x1fa088 
0a010008   TNTD            50 Wevnt  P:T:nA    NONE   1a010013 0x1fa088 
0a010009   GDBd            20 Wevnt  P:T:nA    NONE   1a010013 0x1fa088 
0a01000a   GDBh           200 SUSP   P:T:nA    NONE   f6bfd49f 0x1fa088 
0a01000c   RPCd            80 Wevnt  P:T:nA    NONE   28434856 0x1fa088 
0a01002c   pty0            50 READY  P:T:nA    NONE   1a010013 0x1fa088 
0a01002d   QQQQ           100 SUSP   P:T:nA    NONE

If we examine the GDB process window, we see that the new thread a01002d has appeared, and we're now stopped in it at the first line of the quarks::service::Logger::initLogging method.

add-symbol-file build/quarks/mod/ppc-rtems-rce405-dbg/testQuarks.1.0.main.so 0x78cce00
break quarks::service::Logger::initLogging
add symbol table from file "build/quarks/mod/ppc-rtems-rce405-dbg/testQuarks.1.0.main.so" at
        .text_addr = 0x78cce00
Breakpoint 2 at 0x78d0bd4: file ../service/src/Logger.cc, line 64.
[New Thread  a01002d]
[Switching to Thread  a01002d]

Breakpoint 2, quarks::service::Logger::initLogging (thresh=quarks::service::Logger::Info, imp=0x11f89b0)
    at ../service/src/Logger.cc:64
64            _threshold = thresh;
(gdb)

At this point, we can continue debugging the task.

One important caveat: Every time you run the task, the new breakpoint must be reloaded using lmbreak. This is because every time the task is run, the code gets loaded at a different point in memory, and the previous set of symbols is no longer valid.
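
The addresses in the transcripts above illustrate why stale symbols are useless. The Python sketch below extracts the load address from the runTask message and compares it with the .text_addr that GDB reported. Reading the difference as the module's text offset is an inference from these two numbers, not something the source states, and the function name is hypothetical.

```python
import re


def load_address(run_task_output):
    """Extract the load address from a runTask message, or None if absent."""
    m = re.search(r"runTask loaded the task to (0x[0-9a-fA-F]+)", run_task_output)
    return int(m.group(1), 16) if m else None


load = load_address("runTask loaded the task to 0x78c6e00.")
text_addr = 0x78CCE00  # .text_addr reported by add-symbol-file above

# Difference between where the module was loaded and where its text begins.
print(hex(text_addr - load))  # prints 0x6000
```

A second run that loads the module elsewhere shifts both addresses, so symbols added for the first run would point at the wrong code.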
