Draft 4.4 (2011 October 19)

The Channels table (Virtex-4)

ports (which limit derives from the limit on the number of plugins). Each type of port has both an official type number and an official short name such as "eth" or "config". Ethernet ports that differ only in speed, e.g., 10 Gb/s Ethernet and 100 Mb/s Ethernet, have the same port type and short name. Each port has a certain number of virtual channels which may be allocated and deallocated upon request from the user; some types of ports may have as few as one virtual channel while others may have thousands.

A virtual channel is one end of a two-way communications link similar in concept to a BSD socket. The VCs are globally visible but not MT-safe; at a given time at most one thread may be making use of a given virtual channel. Each virtual channel represents a different source and/or sink of data such as a UDP port or a section of Petacache memory. Each has a number that uniquely identifies it amongst all the virtual channels belonging to the same port. When asking for a virtual channel to be allocated the user may specify a specific number that may have meaning at a higher level, e.g., it may map to a UDP port number. Even when asking for a specific virtual channel number no channel may be allocated more than once. The other method of allocation just picks some virtual channel that has not yet been allocated. When finished with a virtual channel the user passes it back to the owning port for deallocation.
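The two allocation methods described above can be sketched as a small bookkeeping class. This is only an illustration under stated assumptions: the name VcNumberPool, the use of -1 to signal failure and the "lowest free number" policy are all hypothetical and are not part of the RCE API.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: tracks which virtual channel numbers of one port
// are currently allocated. Failure is reported as -1 (an assumption).
class VcNumberPool {
public:
    explicit VcNumberPool(int maxVcs) : maxVcs_(maxVcs) {}

    // Request a specific number; fails if out of range or already allocated.
    int allocate(int vcNum) {
        if (vcNum < 0 || vcNum >= maxVcs_ || inUse_.count(vcNum) != 0) return -1;
        inUse_.insert(vcNum);
        return vcNum;
    }

    // Let the pool pick some number that has not yet been allocated.
    int allocate() {
        for (int n = 0; n < maxVcs_; ++n) {
            if (inUse_.count(n) == 0) { inUse_.insert(n); return n; }
        }
        return -1;  // every number is in use
    }

    // Hand a number back so that it may be allocated again.
    void deallocate(int vcNum) { inUse_.erase(vcNum); }

private:
    int maxVcs_;
    std::set<int> inUse_;
};
```

A second request for a number that is still allocated fails, which mirrors the rule that no channel may be allocated more than once.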

To transmit an outgoing message the user requests the address of an empty message buffer from the virtual channel. After filling the buffer the user passes its address back to the virtual channel for transmission. To receive an incoming message the user makes a request that blocks until a message is available, uses the data at the buffer address eventually supplied, then returns the buffer to the virtual channel that supplied it.

At this level each buffer contains the payload of a single inbound or outbound message where the boundaries between messages are respected; in other words the virtual channel implements datagram rather than byte-stream protocols. It's up to higher-level software such as an IP stack to provide any operations that cross message boundaries. Any message headers or other system overhead are managed by the low-level software.

Low-level

Each message coming in on a port must contain information from which a destination virtual channel can be inferred. If it doesn't, or if it specifies a virtual channel that is invalid or not allocated, then the port's lost-message count is incremented and the message is discarded (its buffer being reclaimed).
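The discard rule can be sketched as follows. The names Inbound and DispatchSketch are hypothetical, and a negative vcNum stands in for "no destination could be inferred"; the real port software may represent these cases differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical inbound message; vcNum < 0 means no channel could be inferred.
struct Inbound { int vcNum; std::string payload; };

class DispatchSketch {
public:
    DispatchSketch() : lost_(0) {}

    void allocate(int vcNum)   { queues_[vcNum]; }        // creates an empty queue
    void deallocate(int vcNum) { queues_.erase(vcNum); }

    // Queues the message on its virtual channel, or counts it as lost.
    bool deliver(const Inbound& msg) {
        std::map<int, std::vector<std::string> >::iterator q = queues_.find(msg.vcNum);
        if (msg.vcNum < 0 || q == queues_.end()) {
            ++lost_;           // invalid or unallocated channel: discard
            return false;
        }
        q->second.push_back(msg.payload);
        return true;
    }

    unsigned lost() const { return lost_; }

    std::size_t pending(int vcNum) const {
        std::map<int, std::vector<std::string> >::const_iterator q = queues_.find(vcNum);
        return q == queues_.end() ? 0 : q->second.size();
    }

private:
    std::map<int, std::vector<std::string> > queues_;
    unsigned lost_;
};
```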

A port usually connects to pins leading out of the FPGA on which the RCE resides, though some ports may simply offer access to local resources such as DSPs. No pin may be used by more than one port.

The FPGA is connected to the larger system by data paths called conduits. Each conduit connects to a group of pins on the FPGA; no pin may belong to more than one conduit. At system startup each port that uses pins will be matched to the conduit that connects to exactly the same set of pins. If the resulting mapping is not one-to-one then the startup fails. Each conduit has an associated type number and version number which the software for the matching port checks for validity at boot time. If the software rejects the conduit then again startup will fail.
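The pin-based matching step might look like the sketch below. It assumes that pin sets are held as 64-bit masks (as the pins members later in this document suggest) and that a port with an empty mask needs no conduit. The function names and the use of std::runtime_error are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::uint64_t PinMask;  // one bit per FPGA pin

// Returns, for each port, the index of its conduit, or -1 for a port
// that uses no pins. Throws if the mapping between pin-using ports and
// conduits is not one-to-one.
std::vector<int> matchConduits(const std::vector<PinMask>& portPins,
                               const std::vector<PinMask>& conduitPins) {
    std::vector<int> conduitOf(portPins.size(), -1);
    std::vector<bool> taken(conduitPins.size(), false);
    for (std::size_t p = 0; p < portPins.size(); ++p) {
        if (portPins[p] == 0) continue;  // local resource, no conduit needed
        int found = -1;
        for (std::size_t c = 0; c < conduitPins.size(); ++c) {
            if (!taken[c] && conduitPins[c] == portPins[p]) { found = int(c); break; }
        }
        if (found < 0) throw std::runtime_error("port has no matching conduit");
        taken[found] = true;
        conduitOf[p] = found;
    }
    for (std::size_t c = 0; c < conduitPins.size(); ++c) {
        if (!taken[c]) throw std::runtime_error("conduit has no matching port");
    }
    return conduitOf;
}

// Convenience wrapper: true if startup matching would succeed.
bool startupWouldSucceed(const std::vector<PinMask>& portPins,
                         const std::vector<PinMask>& conduitPins) {
    try { matchConduits(portPins, conduitPins); }
    catch (const std::runtime_error&) { return false; }
    return true;
}
```

Type and version checks on the matched conduit would follow this step; they are left out here.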

The lowest level of plugin software is held in relocatable modules recorded on the RCE. A module for plugin type FOO holds:

  • An implementation of a class FooPort derived from the abstract class Port.
  • An implementation of class FooFactory derived from the abstract class Factory. A FooFactory creates FooPort instances which are returned as Port* values.
  • An entry point that creates an instance of FooFactory which is returned as a Factory* value.

Each module is loaded, relocated and bound to the system core before its first use. The core code knows only the abstract base classes Factory and Port, not the derived classes specific to plugin type.
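A minimal sketch of what such a module could contain is given below. The base classes here are cut-down stand-ins with a single member each, and the entry point name foo_entry is hypothetical; the real Port and Factory interfaces are richer.

```cpp
#include <cassert>
#include <cstring>

// Stand-ins for the abstract base classes known to the core.
class Port {
public:
    virtual ~Port() {}
    virtual const char* name() const = 0;
};

class Factory {
public:
    virtual ~Factory() {}
    virtual Port* makePort() = 0;
};

// The parts specific to plugin type FOO, known only inside the module.
class FooPort : public Port {
public:
    const char* name() const override { return "foo"; }
};

class FooFactory : public Factory {
public:
    Port* makePort() override { return new FooPort; }  // returned as Port*
};

// Module entry point: hands the core a Factory* and nothing more specific.
extern "C" Factory* foo_entry() {
    static FooFactory factory;
    return &factory;
}
```

The core calls the entry point once, keeps the Factory* and never needs the definitions of FooPort or FooFactory.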

Both plugin hardware and the port-factory modules have version numbers which will allow some measure of compatibility checking at boot time. Any incompatibility detected causes startup to fail.

RCE startup code discovers the set of protocol plugins and conduits available using a set of configuration registers. These registers have their own address space, the "configuration space". For each plugin the configuration space registers yield the plugin type, the plugin version number and the set of FPGA pins connected to it. For each conduit they yield the conduit type, the conduit version number and its set of FPGA pins.

Access to configuration registers is via an object which given an abstract register address reads or writes the contents of the corresponding register. Another object uses the abstract register layer to provide all the configuration info for a plugin or a conduit given its index number (or supplies an indication that the given entity doesn't exist).

Differences between Gen I and Gen II

Gen II

The configuration space registers are hardware memory locations filled with information by the IPMI Controller (IPMC).

Message buffers, once created, are managed entirely by firmware. Payload and headers are managed independently of each other. The firmware manages buffers directly using the buffer addresses. The size of the buffer required for (the payload part of) a message is requested separately for each message.

Gen I

There is no IPMC and there are no hardware configuration registers. The low-level configuration information is burned into a configuration flash container. At boot time the container is read and its contents are used to construct an object which simulates the Gen II configuration register space. Above that layer the handling of the information is just like that in Gen II.

Protocol plugins are assembled from Protocol Interface Core (PIC) blocks which have no counterpart in Gen II. We want to make the configuration information handling like that in Gen II, so rather than extending it with PIC block assignments we sweep those assignments under the rug by embedding them in the plugin software modules.

Control of I/O buffers passes between firmware and application software; a Transfer Descriptor Entry (TDE) is used to pass buffer references in either direction. Frame headers are fully exposed to software, which must know for a given plugin the header size, the maximum payload size and the maximum number of buffers. This information can also be embedded in the plugin-handling module at the cost of having to tailor the module to the RCE application. All the buffers for a given plugin (or even a group of related plugins) are of the same size and are preallocated before any messages are sent.

Mid-level API

The class declarations given in this section contain only those members intended for use after system startup is complete and all plugins are on-line. Whether a method is virtual is not specified, nor are friend declarations shown; these are considered implementation details.

Classes and their responsibilities

Concepts

The low-level interface to an RCE's protocol plug-ins uses abstractions called ports, frames, channels, lanes, pipes and factories.

A port is the RCE end of a communications link similar in concept to a BSD socket. Ports are globally visible but not MT-safe; at a given time at most one task may be waiting for data or receiving data from any given port.

Ports deliver data as frames. The exact content of a frame depends on the transmission protocol being used but the port recognizes a broad division into header and payload. One frame corresponds to one message on the I/O medium and is delivered in a single buffer. In other words all ports implement datagram rather than byte-stream protocols. It's up to higher-level software such as a TCP stack to provide any operations that cross frame boundaries. Each port contains a queue of frames which have arrived and not yet been consumed by the application.

RCE code is divided into components:

  1. The core, which is the same for all RCEs of a given generation. It contains low-level processor management code, RTEMS, generic C/C++ support libraries, etc.
  2. The protocol plug-in (PPI) software modules.
  3. Application code which is entered only after the first two components are fully initialized.

Each component's code is normally stored independently somewhere on the RCE, e.g., in configuration flash.

The core and the PPI software modules together offer the use of protocol plugins at different levels of abstraction:

  1. Low-level for RCE startup code and for plugin software modules.
  2. Mid-level for raw device users.
  3. High-level for most applications.

This document describes the first two levels of PPI service interface which uses abstractions called ports, virtual channels, payloads (data), conduits and factories.

Mid-level

A channel is a hardware I/O engine capable of DMA to and from system RAM, i.e., one firmware instance of a protocol plug-in of a particular type. An RCE has at most eight channels. Most channels will make use of one or more of the Multi-Gigabit Transceiver modules (lanes) available in firmware, though some may simply offer access to on-board resources such as DSPs. Each channel has its own port space where each port is identified by a 16-bit unsigned integer. Each port represents a different source of incoming data such as a UDP port or a Petacache flash lane. The actual number of ports available on a channel depends on the channel type and may be as low as one.

With one exception there is a one-to-one correspondence between (channel, port no.) pairs and port objects. The exception is a "catch-all" port, which receives data that doesn't "belong" to any other port in the system. There is a catch-all pseudo-channel available with a limit of one port; creating a port on this channel will create a catch-all port. If no catch-all port exists then an orphan frame is dropped and an error message is placed in the system log.

All the lanes (if any) of a given channel go to one of the outputs (pipes) of the rear transition module. This mapping is fixed by hardware.

Each type of channel has both an official (unsigned) number and an official short name such as "eth" or "config". Either may be used to look up the corresponding Channel object after system startup. Channels that differ only in the number of lanes they use, e.g., 10 Gb/s Ethernet (four lanes) and a slower Ethernet (one lane), will have the same channel type, in this case "eth".

The factory code that creates the right kind of Channel and Frame objects for a given type of channel may be already part of the system core or it may be in a container in configuration flash. In the latter case the code must be loaded, relocated and bound to the system core before its first use. An entry point is called in each such loaded factory code module which will register it in a central table using a function exported by the core for this purpose. Pre-loaded code must also be registered using this function. Factory code returns values of Channel* or Frame* though the actual objects pointed to are tailored for the specific channel type.

The information needed to initialize the channels is found in configuration flash and/or by probing the hardware.

Channel information in configuration flash

Data container zero in the configuration flash contains tables providing all the information needed to make all channels ready. This includes references to type-specific factory code but not the code itself; other containers will hold that. In addition the tables provide some extra information not actually needed for setup but required to print a summary of what the protocol plug-ins provide.

The tables are called Channels, Channel Types, Factories, Data Paths, Buffers and Strings. Except for Strings each table is an array of plain-old-data structs with the first field being a key value normally equal to the array index. The keys are there to allow tables to refer to one another's entries. A key equal to 0xffffffff signifies the end of the table in which case none of the other fields in the struct have any guaranteed values and should never be read. Thus you'll generate end-of-table sentinels automatically when you erase a flash block before writing tables into it, provided you leave at least one 32-bit word of unwritten space at the end of each table. If you write all the tables in one go then you must provide the end-of-table markers explicitly.
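The end-of-table convention can be illustrated with a short sketch. The struct TableEntry and its value field are hypothetical placeholders for any of the table structs; only the leading key field and the 0xffffffff sentinel come from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

const std::uint32_t END_OF_TABLE = 0xffffffffu;

// Hypothetical table entry: a key followed by other 32-bit fields.
struct TableEntry {
    std::uint32_t key;
    std::uint32_t value;
};

// Counts the entries before the sentinel. Only the key of the sentinel
// entry is read; its other fields have no guaranteed values.
template <typename T>
std::size_t tableLength(const T* table) {
    std::size_t n = 0;
    while (table[n].key != END_OF_TABLE) ++n;
    return n;
}

// Simulates an erased flash block (all bits set) with one entry written.
inline std::size_t lengthAfterPartialWrite() {
    TableEntry block[4];
    std::memset(block, 0xff, sizeof block);  // erased flash reads as all ones
    block[0].key = 0;
    block[0].value = 42;
    return tableLength(block);               // unwritten block[1] acts as the sentinel
}
```

The second function shows why erasing a block before writing yields the sentinel for free, as long as some unwritten space follows the last entry.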

The Strings table is like an ELF string table; NUL-terminated ASCII strings laid end to end. Certain standard strings such as the short-names of channels are at the front of the table with only one instance of each string present. The other tables refer to a string by giving the offset of its first character in the Strings table.
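As a small illustration of offset-based lookup, the sketch below builds a three-string table in memory. The table contents and the helper name stringAt are made up for the example; StringOff matches the typedef given later in this section.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

typedef std::uint32_t StringOff;  // offset within the Strings table

// NUL-terminated strings laid end to end. "eth" starts at offset 0,
// "config" at offset 4 and the description at offset 11.
static const char STRINGS[] = "eth\0config\0first Ethernet channel";

// Resolves an offset to the string that starts there.
inline const char* stringAt(const char* table, StringOff off) {
    return table + off;
}
```

Because each string is stored once, two table entries that share a short name simply hold the same offset.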

General layout of the container contents

The first words of the container are the 32-bit offsets in the container to the starts of the tables in the following order:

Offset of Channels

Offset of Channel Types

Offset of Factories

Offset of Data Paths

Offset of Buffers

Offset of Strings

Each of the offsets should be divisible by four and if the corresponding table is present must be greater than zero.

Only the Factories and Buffers tables are needed for Virtex-5,6 RCEs so the other offsets are zero.

After the offsets come the tables themselves. No particular order is required, though since Strings table entries have variable length it's most convenient to place that table last.
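The rules for the offset words can be checked with a few lines of code. This sketch assumes the header is available as an array of six 32-bit words in the order listed above; the enumerator and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Order of the offset words at the start of the container.
enum TableId { CHANNELS, CHANNEL_TYPES, FACTORIES, DATA_PATHS, BUFFERS, STRINGS, NUM_TABLES };

// A table is present if and only if its offset is greater than zero.
inline bool tablePresent(const std::uint32_t* header, TableId id) {
    return header[id] != 0;
}

// Every offset, present or not, should be divisible by four.
inline bool headerIsValid(const std::uint32_t* header) {
    for (int i = 0; i < NUM_TABLES; ++i) {
        if (header[i] % 4 != 0) return false;
    }
    return true;
}
```

For a container that holds only the Factories and Buffers tables, the other four offset words would be zero and tablePresent() would report them as absent.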

To make alignment easier we use 32-bit fields wherever possible, even for 16-bit quantities such as port numbers. All of the declarations for the configuration tables are in namespace RCE::config.

Code Block

typedef uint32_t StringOff; // Offset within the Strings table.
Code Block

struct Channel {
    uint32_t  key;
    uint32_t  typeKey;        // Ref. to Channel Types table.
    uint32_t  lanesUsed;      // Bit-mask of lanes used.
    uint32_t  blocksUsed;     // Bit-mask of PIC blocks used.
    uint32_t  numPorts;       // Size of port-number space.
    StringOff description;
};

Class name

Instance responsibilities

Port

Represent a single protocol plugin. Allocate and deallocate virtual channels. Deliver data to and from virtual channels. Retain the configuration information for the plugin and the index number of the conduit (if any) assigned to it at startup. Print multi-line reports on the plugin state and configuration.

PortList

Keep a linked list of all Port instances. Assign each a global index number not used by any other Port. Assign each a second index number not used by any other Port of the same type. Search the list by global index number, by type and type index number or by conduit number. Print a brief report on the status of all ports, one line per port.

VirtualChannel

Represent a single virtual channel associated with the allocating Port. Accept message payloads for transmission. Return messages that have been received (waiting for them if needed).

Universal constants (constants.hh)

All RCEs, whether Gen I or Gen II, have the same limit on the number of plugin instances (MAX_PLUGINS).

Code Block

static const unsigned MAX_PLUGINS  =  8;

Port-type enumeration (PortTypes.hh)

The numbers are members of an enumeration assigned by the Data Acquisition Tools (DAT) project.

Code Block

enum PortType {
    CONFIG_FLASH,
    ETHERNET,
    PGP,
    // etc. (further types as assigned by the DAT project)
    INVALID_PORT_TYPE
};

The header file also contains a specialization of the template tool::type::EnumInfo which allows one to use the function templates emin<>(), emax<>(), ecount<>(), evalid<>(), enext<>(), eprev<>() and estr<>():

Code Block

emin<PortType>() == CONFIG_FLASH
emax<PortType>() == PortType(INVALID_PORT_TYPE - 1)
ecount<PortType>() == int(INVALID_PORT_TYPE)
evalid(PortType x) is true for all from emin() to emax() inclusive, else false
evalid(int) and evalid(unsigned) make similar tests on ints and unsigneds.
enext(emax()) == eprev(emin()) == INVALID_PORT_TYPE
enext(CONFIG_FLASH) == ETHERNET, etc.
eprev(ETHERNET) == CONFIG_FLASH, etc.
estr(CONFIG_FLASH) == "CONFIG_FLASH", etc.
estr(x) == "**INVALID**" if and only if evalid(x) is false

ecount<>() can't be used as a dimension for static arrays since the compiler considers it to be non-constant; in that case use EnumInfo<PortType>::count.
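The difference between ecount<>() and EnumInfo<PortType>::count can be shown with a stand-in. The sketch below is not the real tool::type::EnumInfo; it is a simplified, hypothetical version that assumes enumerators run from zero with the INVALID_ value last, and it implements only a few of the function templates.

```cpp
#include <cassert>

enum PortType { CONFIG_FLASH, ETHERNET, PGP, INVALID_PORT_TYPE };

// Primary template left undefined; each enum supplies a specialization.
template <typename E> struct EnumInfo;

template <> struct EnumInfo<PortType> {
    static const int count = INVALID_PORT_TYPE;  // compile-time constant
};

template <typename E> int  ecount() { return EnumInfo<E>::count; }
template <typename E> E    emin()   { return E(0); }
template <typename E> E    emax()   { return E(EnumInfo<E>::count - 1); }
template <typename E> bool evalid(E x) {
    return int(x) >= 0 && int(x) < EnumInfo<E>::count;
}
template <typename E> E enext(E x) {
    // Stepping past the last valid value yields the INVALID_ value.
    return (evalid(x) && x != emax<E>()) ? E(int(x) + 1) : E(EnumInfo<E>::count);
}

// A per-type table sized at compile time; ecount<PortType>() would not
// be accepted here because a plain function call is not a constant.
static unsigned portsOfType[EnumInfo<PortType>::count];
```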

Port list (PortList.hh)

This class is a Borg-type singleton; the constructor makes a stateless object whose member functions access the true (shared) state defined elsewhere. The shared state is constructed at system startup. The destructor destroys these stateless objects but does not touch the true state information. You can therefore just use the constructor whenever you need to access the One True List, e.g., PortList().head().
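The Borg idea can be sketched in a few lines. BorgList and PortStub are hypothetical stand-ins; where the shared state lives and how it is built at startup are assumptions made only for the example.

```cpp
#include <cassert>

// Stand-in for a list element.
struct PortStub { int index; PortStub* next; };

class BorgList {
public:
    BorgList() {}   // makes a stateless handle
    ~BorgList() {}  // destroys the handle, not the shared state

    PortStub* head() const     { return state().head; }
    int       numPorts() const { return state().count; }

    // Appends to the one shared list and assigns the next index number.
    static void add(PortStub* p) {
        State& s = state();
        p->next = 0;
        p->index = s.count++;
        if (s.tail) s.tail->next = p; else s.head = p;
        s.tail = p;
    }

private:
    struct State { PortStub* head; PortStub* tail; int count; };

    // The true state, shared by every BorgList instance.
    static State& state() {
        static State s = {0, 0, 0};
        return s;
    }
};
```

Every temporary such as BorgList() sees the same list, so client code never has to pass a list pointer around.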

You can get a count of the number of ports or the first port on the list (the list can't be empty). The report() member function will print informational messages in the system log which show the contents of the port list in brief form, one line per port.

Note

The location and form of the system log depends on how the system logging package was initialized at application startup. Client code making log entries is not aware of this initialization.

A particular port may be looked up in several different ways:

  • By its global index number, assigned in sequence starting from zero as ports are created.
  • By its type number and the index number within the type. The first ethernet port would be (ETHERNET,0), the second (ETHERNET,1), etc.
  • By the number of the conduit the port is connected to.

Lookup methods return the null pointer if the search fails.

Code Block

class PortList {
public:
    PortList() {}
    ~PortList() {}
    int numPorts() const;
    Port* head() const;
    Port* lookup(PortType type, unsigned typeIndex) const;
    Port* lookup(unsigned index) const;
    Port* lookupByConduit(unsigned conduit) const;
    void report() const;
};
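The three kinds of lookup amount to linear searches over the linked list. The following sketch uses a hypothetical PortNode struct in place of Port and free functions in place of the PortList members; using -1 for "no conduit" is an assumption.

```cpp
#include <cassert>

enum PortType { CONFIG_FLASH, ETHERNET, PGP, INVALID_PORT_TYPE };

// Stand-in for Port holding only the fields needed for lookup.
struct PortNode {
    unsigned  index;      // global index number
    PortType  type;
    unsigned  typeIndex;  // index among ports of the same type
    int       conduit;    // -1 if the port has no conduit (assumption)
    PortNode* next;
};

// By global index number.
PortNode* lookup(PortNode* head, unsigned index) {
    for (PortNode* p = head; p; p = p->next)
        if (p->index == index) return p;
    return 0;
}

// By type and index within the type.
PortNode* lookup(PortNode* head, PortType type, unsigned typeIndex) {
    for (PortNode* p = head; p; p = p->next)
        if (p->type == type && p->typeIndex == typeIndex) return p;
    return 0;
}

// By the number of the conduit the port is connected to.
PortNode* lookupByConduit(PortNode* head, int conduit) {
    for (PortNode* p = head; p; p = p->next)
        if (p->conduit == conduit) return p;
    return 0;
}
```

Each function returns the null pointer when the search fails, as the real lookup methods are described as doing.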

Port (Port.hh)

A port object represents a particular instance of a protocol plug-in. Each port object is created at system startup. Port objects live until system shutdown and may not be copied or assigned.

Each port allocates and deallocates VirtualChannel objects on demand. During its lifetime each VirtualChannel object has exclusive use of one of the port's virtual channel numbers; the virtual channel number becomes available again once the VirtualChannel object is deallocated. The application code may request a specific, unused virtual channel number for the type of port, e.g., a well-known TCP port number. The application may also allow the port to assign a number not currently in use by any VirtualChannel.

Every port object is a member of the linked list accessed through class PortList and may not be removed from the list. Use the next() member function to iterate over the list.

A short name for the type and a short description of the port are also provided.

A "lost" counter is provided which counts the number of inbound messages that were discarded, for whatever reason.

Other information provided:

  • The index number of the conduit associated with the port.
  • The hardware version number of the associated plugin.
  • The software version number of the associated plugin module.
  • A bitmask giving the FPGA pins (if any) used by the plugin.
  • The size of the VC-number space.

The report() member function produces a detailed multi-line description of the port in the system log, including all platform-specific information.

High-level and mid-level code isn't allowed to create or destroy instances; only low-level code is allowed to do that.

Code Block

class Port {
public:
    VirtualChannel* allocate(int vcNum);
    VirtualChannel* allocate();
    void deallocate(VirtualChannel *);
    unsigned lost() const;
    unsigned index() const;
    unsigned type() const;
    const char* name() const;
    unsigned typeIndex() const;
    unsigned conduit() const;
    unsigned versionHard() const;
    unsigned versionSoft() const;
    Port* next() const;
    const char* description() const;
    unsigned maxVcs() const;
    void report() const;
};

Virtual channel (VirtualChannel.hh)

Each VirtualChannel object is allocated by a Port and is assigned a unique ID in the Port's virtual channel number space.

Messages inbound on the associated port may be waited for and retrieved using the receive() member function, which returns a void* pointer to the payload portion of a message. Client code will normally keep a payload for a short time then give it back to the virtual channel they got it from using the virtual channel's deallocate() member function. It's an error to request a virtual channel to deallocate a payload it didn't produce; the result of doing so will be unpredictable.

A virtual channel takes message payloads given to its transmit() member function via void* pointers and queues them for output. The payload pointer must have been produced by the allocate() member function of the same virtual channel (or its receive()); breaking this rule results in unpredictable behavior. Once the message is transmitted the message buffer is automatically deallocated, so the user should not try to use it after calling transmit(). When allocating a buffer the client must specify the maximum size of the payload to be transmitted (in Gen I systems this is ignored since all buffers for a port will be the same size).

Code Block

class VirtualChannel {
public:
    unsigned vcNum() const;

    void* receive();          // To receive: first call this ...
    void deallocate(void*);   // ... then this.

    void* allocate(size_t payloadSize); // To transmit: first call this (or receive()) ...
    void transmit(void*);               // ... then this.
};
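The calling sequence for both directions can be tried out against a loopback stand-in. LoopbackChannel is hypothetical: it offers the same four operations but simply queues transmitted payloads for its own receive(), uses malloc for buffers and returns a null pointer instead of blocking when nothing is queued.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <deque>

class LoopbackChannel {
public:
    ~LoopbackChannel() {
        for (std::size_t i = 0; i < queue_.size(); ++i) std::free(queue_[i]);
    }
    void* allocate(std::size_t payloadSize) { return std::malloc(payloadSize); }
    void  transmit(void* payload) { queue_.push_back(payload); }  // channel now owns it
    void* receive() {
        if (queue_.empty()) return 0;  // a real channel would block here
        void* p = queue_.front();
        queue_.pop_front();
        return p;
    }
    void deallocate(void* payload) { std::free(payload); }
private:
    std::deque<void*> queue_;
};

// Sends a text message and reads it back; returns true if it arrives intact.
inline bool roundTrip(LoopbackChannel& vc, const char* text) {
    std::size_t size = std::strlen(text) + 1;
    void* out = vc.allocate(size);      // to transmit: first allocate ...
    if (!out) return false;
    std::memcpy(out, text, size);
    vc.transmit(out);                   // ... then transmit; 'out' is off limits now
    void* in = vc.receive();            // to receive: first receive ...
    if (!in) return false;
    bool same = std::strcmp(static_cast<const char*>(in), text) == 0;
    vc.deallocate(in);                  // ... then give the payload back
    return same;
}
```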

Low-level API

In this section we describe the code used to manage configuration information and construct the global PortList. Some of the classes already introduced above will have new members described here; other classes will be completely new.

New members of old classes

The PortList class has a static member function build() whose main purpose is to produce the list of Port instances. To do so it will have to read configuration information about plugins and conduits, load and activate plugin software and match conduits to ports. There is also a member function add() which places a new Port on the end of the list.

Code Block

class PortList {
  public:
  static void build();
  private:
  void add(Port*);
};

The Port class's constructor is used by PortList::build(). Its constructor and destructor are used by derived classes. Also provided are the means to increment the count of lost messages, to set the next-port member and to find out the set of FPGA pins used by the port.

Code Block

class Port {
protected:
  Port(unsigned index,
       PortType type,
       const char *name,
       unsigned typeIndex,
       unsigned conduit,
       unsigned versionHard,
       unsigned versionSoft,
       const char* description,
       unsigned long long pins,
       unsigned maxVcs);
   virtual ~Port() = 0;
   unsigned long long pins() const;
   void incLost();
   void next(Port*);
};

Instances of VirtualChannel are created and destroyed only inside Ports. There is an access member added which gives the owning Port instance.

Code Block

class VirtualChannel {
private:
  VirtualChannel(Port*, unsigned vcNum);
  ~VirtualChannel();
  Port* port() const;
};

New classes and their responsibilities

Class name

Instance responsibilities

ConduitConfig

Hold the configuration information for one conduit.

ConfigReader

Collect all the available information about a given plugin (conduit) from ConfigSpace and put it into an instance of PluginConfig (ConduitConfig). Indicate when the given plugin or conduit doesn't exist.

ConfigSpace

Provide an address space of abstract 32-bit registers containing configuration info for plugins and conduits, whether or not such registers exist in hardware.

PluginConfig

Hold the configuration information for one plugin instance.

PortFactoryList

Hold all PortFactory objects created during system startup. Look up factory instances by type.

Class/enum name

Class/enum responsibilities

ConduitType

Enumerate the different types of conduit.

PortFactory

Abstract base class for objects that given an instance of PluginConfig and an instance of ConduitConfig produce a Port instance. The ConduitConfig is optional for plugins that don't connect to a conduit.

ConduitConfig (ConduitConfig.hh)

This class describes a single conduit.

Member

Description

index

The order of appearance, starting from zero, of the information in ConfigSpace.

type

The type of conduit.

version

The version number of the conduit definition.

pins

Has a 1 bit for each FPGA pin connected to the conduit.

The default constructor creates an invalid instance, one that represents a conduit that doesn't exist. A member function tests whether the instance is a valid one.

Code Block

class ConduitConfig {
public:
  unsigned index;
  ConduitType type;
  unsigned version;
  unsigned long long pins;
  ConduitConfig();
  ConduitConfig(unsigned ind, ConduitType, unsigned ver, unsigned long long);
  bool isValid() const;
};

ConduitType (ConduitType.hh)

These types are not well defined yet so for now we just define a generic type code. The header also provides a specialization of tool::type::EnumInfo<> similar to that provided for PortType.

Code Block

enum ConduitType {
  CONDUIT,
  INVALID_CONDUIT_TYPE
};

ConfigReader (ConfigReader.hh)

One member function returns instances of PluginConfig, the other returns instances of ConduitConfig. Both take an argument that is the index of the object whose configuration you want to look up; plugins and conduits are numbered separately starting from zero. The lookups return invalid config instances if the requested entities don't exist.

Instances have no data of their own but get what they need from ConfigSpace; you can generate and throw away instances as often as you want.

Code Block

class ConfigReader {
  public:
  void lookupConduit(unsigned index, ConduitConfig&) const;
  void lookupPlugin(unsigned index, PluginConfig&) const;
};

ConfigSpace (ConfigSpace.hh)

Each instance implements an abstract space of configuration registers. How the abstract registers are used to collect configuration information is an implementation decision which will however be the same for both Gen I and II. Register addresses start at zero; an attempt to read or write a register at an invalid address, or to write to a read-only register, will throw std::logic_error. Use the implements() member function to determine if a virtual register is implemented at a given address.

Instances have no data of their own but get what they need from some central source on the RCE; exactly where differs between Gen I and Gen II. You can create and destroy instances at will.

Code Block

class ConfigSpace {
  public:
  ConfigSpace();
  bool implements(unsigned address) const;
  unsigned read(unsigned address) const;
  void write(unsigned address, unsigned value);
};
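A memory-backed register space, roughly what a Gen I system would need in order to simulate the Gen II registers, could look like this sketch. SimulatedConfigSpace, its define() member and the two helper functions are hypothetical; only implements(), read(), write() and the use of std::logic_error follow the description above.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

class SimulatedConfigSpace {
public:
    // Creates (or redefines) a register at the given address.
    void define(unsigned address, unsigned value, bool readOnly) {
        Reg empty = {false, false, 0};
        if (address >= regs_.size()) regs_.resize(address + 1, empty);
        Reg r = {true, readOnly, value};
        regs_[address] = r;
    }
    bool implements(unsigned address) const {
        return address < regs_.size() && regs_[address].present;
    }
    unsigned read(unsigned address) const {
        if (!implements(address)) throw std::logic_error("invalid register address");
        return regs_[address].value;
    }
    void write(unsigned address, unsigned value) {
        if (!implements(address)) throw std::logic_error("invalid register address");
        if (regs_[address].readOnly) throw std::logic_error("register is read-only");
        regs_[address].value = value;
    }
private:
    struct Reg { bool present; bool readOnly; unsigned value; };
    std::vector<Reg> regs_;
};

// Helpers that turn the exceptions into booleans for easy checking.
inline bool readFails(const SimulatedConfigSpace& s, unsigned address) {
    try { s.read(address); } catch (const std::logic_error&) { return true; }
    return false;
}
inline bool writeFails(SimulatedConfigSpace& s, unsigned address, unsigned value) {
    try { s.write(address, value); } catch (const std::logic_error&) { return true; }
    return false;
}
```

Calling implements() first lets a reader such as ConfigReader probe for optional registers without relying on exceptions.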

PluginConfig (PluginConfig.hh)

This class describes a single plugin instance.

Member

Description

index

The global index number of the plugin.

type

The type of Port to make for the plugin.

version

The version number of the plugin definition.

pins

Has a 1 bit for each FPGA pin connected to the plugin.

The default constructor creates an invalid instance, one that represents a plugin that doesn't exist. A member function tests whether the instance is a valid one.

Code Block

struct PluginConfig {
  unsigned index;
  PortType type;
  unsigned version;
  unsigned long long pins;
  PluginConfig();
  PluginConfig(unsigned ind, PortType, unsigned ver, unsigned long long);
  bool isValid() const;
};

PortFactory (PortFactory.hh)

This is an abstract base class. Once the system startup code knows the types of the available plugins it will load the plugin software module for each type. It will call the entry point of each plugin software module once to obtain an instance of a class derived from PortFactory.

Once it has matched a PluginConfig instance with a ConduitConfig instance, or determines that the plugin needs no conduit, the startup code uses the factory object to create Port instances for the given type of plugin. If the plugin and conduit versions are incompatible the factory member function will throw std::logic_error. It will do the same if no ConduitConfig is supplied when one is required.

The report function will log full details of the factory, including any platform-dependent information.

Code Block

class PortFactory {
public:
  PortType type() const;
  const char* name() const;
  unsigned version() const;
  const char* description() const;
  Port* makePort(const PluginConfig&);
  Port* makePort(const PluginConfig&, const ConduitConfig&);
  PortFactory* next() const;
  void next(PortFactory*);
  virtual void report() const = 0;
protected:
  PortFactory(PortType, const char* name, unsigned version, const char* description);
  ~PortFactory();
};

PortFactoryList (PortFactoryList.hh)

Another Borg singleton, very similar in concept to PortList. This list of factories is built at about the same time as the list of ports. There is at most one factory per port type. The lookup function returns a null pointer if no matching factory is on the list. The report function logs a one-line summary per factory. New PortFactory instances are added to the end of the list.

Code Block

class PortFactoryList {
public:
  PortFactoryList();
  PortFactory* head() const;
  unsigned numFactories() const;
  PortFactory* lookup(PortType) const;
  void report() const;
  void add(PortFactory*);
};

Plugin software module interface (PluginModule.hh)

Each module's entry point is named rce_appmain; this symbol is recognized by the module building system which places its value in the transfer address slot of the module's ELF header. The prototype of the entry point function is

Code Block

extern "C" PortFactory* rce_appmain();

The system startup code uses class PluginModule to find existing plugin modules and run them, or to save plugin software modules in the usual place given their images in memory. The first constructor fetches the module from some internal RCE storage while the second one reads it from a file. In either case one may then write the module to internal storage or run it to obtain the factory object. Once the module has been run any attempt to write it will throw std::logic_error because the module code will no longer be relocatable.

Code Block

class PluginModule {
public:
  typedef PortFactory* (*EntryPoint)();
  explicit PluginModule(PortType);
  PluginModule(PortType, const char* filename);
  PortFactory* run();
  void write() const;
};

The module's entry point is run directly without creating a new thread (otherwise we'd need synchronization in order to wait for the factory to be produced).
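The run-then-write restriction is a simple piece of state tracking, sketched below. ModuleSketch and FactoryStub are hypothetical stand-ins; unlike the real write(), the one here is non-const because it counts calls so that the example can be checked.

```cpp
#include <cassert>
#include <stdexcept>

struct FactoryStub {};  // stand-in for PortFactory

class ModuleSketch {
public:
    typedef FactoryStub* (*EntryPoint)();

    explicit ModuleSketch(EntryPoint entry)
        : entry_(entry), hasRun_(false), writes_(0) {}

    // Calls the entry point directly in the caller's thread.
    FactoryStub* run() {
        hasRun_ = true;  // from now on the image counts as relocated
        return entry_();
    }

    // Saving the image is refused once the module has been run.
    void write() {
        if (hasRun_) throw std::logic_error("module is no longer relocatable");
        ++writes_;
    }

    int writes() const { return writes_; }

private:
    EntryPoint entry_;
    bool hasRun_;
    int writes_;
};

inline FactoryStub* stubEntry() { static FactoryStub f; return &f; }

inline bool writeFails(ModuleSketch& m) {
    try { m.write(); } catch (const std::logic_error&) { return true; }
    return false;
}
```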

Gen I-specific initialization

On Gen I RCEs the application software is left with the job of allocating I/O buffers and it can't do that without knowing for each plugin the header sizes, max payload sizes and max number of message buffers for import and export. That information is available from the port factories but it means downcasting the PortFactory* values gotten from the PortFactoryList. TDEs for the inbound buffers must be pushed into one or more FLBs before any data may be received. The interface described here lets the application do the needed initialization without exposing the innards of the plugin-handling system.

Ethernet (TBD)

PGP

PgpSetup (PgpSetup.hh)

Creating an instance of this class performs the following functions:

  • Allocates I/O buffers.
  • Writes TDEs to the FLB FIFO.
  • Brings up the MGT links for the requested ports.
  • Resets the appropriate PIC blocks and enables their events.

If you request a PGP port that does not exist then the constructor will throw std::logic_error. If any other initialization fails the constructor will throw std::runtime_error.

Early versions will use cached memory for I/O buffers as has been done in the past. Later versions will use uncached memory. Buffers for inbound messages are shared amongst all PGP ports while each port has its own pool of buffers for outbound messages. The default is to use all PGP ports and to allocate the maximum number of each kind of buffer with the maximum payload size allowed for the PGP ports. If you request more buffers than the maximum then the maximum number is allocated.

The destructor deallocates all the buffers allocated by the constructor and resets the selected PIC blocks again, this time disabling their events.

Though header sizes and max payload sizes for inbound and outbound messages differ slightly, this class will allocate the maximum for each buffer. Inbound and outbound buffers will differ only in how the length parameter is set in the transaction descriptor since its interpretation varies depending on the direction of transfer.

Code Block
none
none

class PgpSetup {
public:
  enum Port {PORT0=1, PORT1=2, PORT2=4, PORT3=8};
  enum {ALL=-1, DEFAULT=-1};

  explicit PgpSetup
  (int portMask                 =ALL,  // Either ALL or a logical OR of values from enum Port.
   int numInboundBuffers        =ALL,
   int numOutboundBuffersPerPort=ALL,
   int maxPayloadSize           =ALL,
   int resumeThreshold          =DEFAULT  // Either DEFAULT or the FLB resume threshold.
  );

  ~PgpSetup();

  int portMask()                  const;
  int numInboundBuffers()         const;
  int numOutboundBuffersPerPort() const;
  int headerSize()                const;
  int maxPayloadSize()            const;
  int resumeThreshold()           const;
};
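The argument handling described above (ALL selects everything, a nonexistent port is a logic error, and an oversized buffer request is reduced to the maximum) might be sketched as follows. The helper names resolvePortMask and resolveBufferCount, and the limits EXISTING_PORTS and MAX_INBOUND_BUFFERS, are hypothetical values chosen for illustration; the real limits come from the platform.

```cpp
#include <cassert>
#include <stdexcept>

enum Port {PORT0=1, PORT1=2, PORT2=4, PORT3=8};
enum {ALL=-1};

// Hypothetical limits, for illustration only.
const int EXISTING_PORTS      = PORT0 | PORT1 | PORT2 | PORT3;
const int MAX_INBOUND_BUFFERS = 64;

// Resolve a requested port mask: ALL selects every existing port and a
// request naming a port that does not exist is a logic error.
int resolvePortMask(int requested) {
  if (requested == ALL) return EXISTING_PORTS;
  if (requested & ~EXISTING_PORTS) throw std::logic_error("no such PGP port");
  return requested;
}

// Resolve a buffer count: ALL, or anything above the maximum, yields
// the maximum.
int resolveBufferCount(int requested, int maximum) {
  assert(maximum > 0);
  if (requested == ALL || requested > maximum) return maximum;
  return requested;
}
```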

Gen I-specific low-level API

Buffer (Buffer.hh)

On Gen I hardware each message is represented by a data structure in main memory called a Transaction Descriptor. The descriptor contains pointers to all the other message-related data:

  • Message header
  • Message payload
  • Transaction completion descriptor.

The allocation and preparation of the descriptor and the other message data is the responsibility of software; the firmware doesn't manage them. For simplicity each message is represented in software by an instance of class Buffer. Each Buffer contains a Transaction Descriptor, Transaction Completion Descriptor, header buffer, payload buffer and next/previous Buffer pointers all welded together into a single object. Each part will have a fixed offset which is hard-coded into the object so that we don't have to refer to non-cached memory just to find out their addresses. For that reason we allocate a fixed number of bytes for the header no matter the type of plugin; the largest header is 32 bytes for the Petacache PGP non-register messages so we'll allocate twice that. The Transaction Descriptor has the strictest alignment requirement (see below) so it comes first. The payload is of variable size so it comes last.

The first word of a Transaction Descriptor is a length parameter whose interpretation depends on the direction of transfer. For outgoing messages it's the number of payload bytes to send. For incoming messages it's the maximum number of header+payload bytes that will be accepted. The plugin never alters the Transaction Descriptor so to see how many bytes were transferred one has to examine the transfer count in the Transaction Completion Descriptor. For reception the completion descriptor is always updated by the plugin while for transmission this happens only when the transaction fails.

A PIC block gives or takes a reference to a Transaction Descriptor in the form of a 32-bit value called a Transaction Descriptor Entry (TDE). Six of those bits are reserved for various flags, so a TDE contains only the upper 26 bits of the address of the descriptor. Perforce the descriptor must be allocated on a 64-byte boundary. The completion descriptor must be allocated on a PowerPC cache line boundary (32-byte boundary). Due to a design quirk PIC blocks write two complete cache lines when updating a completion descriptor, so the space allocated to each must be artificially enlarged.
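The relationship between a descriptor address and its TDE can be made concrete with a small sketch. It assumes that the six flag bits occupy the low end of the word, which is what the 64-byte alignment requirement implies; the helper names makeTde, tdeAddress and tdeFlags are hypothetical, and the flags are treated as opaque because their individual meanings are not given here.

```cpp
#include <cassert>
#include <stdint.h>

// Low six bits carry flags; the upper 26 bits carry the descriptor address.
const uint32_t TDE_FLAG_MASK    = 0x3fu;
const uint32_t TDE_ADDRESS_MASK = ~TDE_FLAG_MASK;

// Combine a 64-byte-aligned descriptor address with flag bits.
inline uint32_t makeTde(uint32_t descriptorAddress, uint32_t flags) {
  assert((descriptorAddress & TDE_FLAG_MASK) == 0);  // Must be 64-byte aligned.
  return descriptorAddress | (flags & TDE_FLAG_MASK);
}

// Recover the descriptor address from a TDE.
inline uint32_t tdeAddress(uint32_t tde) { return tde & TDE_ADDRESS_MASK; }

// Recover the flag bits from a TDE.
inline uint32_t tdeFlags(uint32_t tde) { return tde & TDE_FLAG_MASK; }
```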

For details about the descriptors see chapter 4 of the Cluster Element Module document at http://www.slac.stanford.edu/exp/npa/design/CEM.pdf

Note

Each instance is allocated inside a message buffer using placement new so that the member "m_payload" overlaps the first byte of the payload area. Early versions of the plugin software will allocate Buffers in cached memory as has been done in the past; later versions will use non-cached memory.

Code Block
none
none

#include <rtems.h>

#define TRANSACTION_DESCRIPTOR_ALIGNMENT (64)
#define DMA_ALIGNMENT (PPC_CACHE_ALIGNMENT)

struct Buffer {

  enum {
    MAX_HEADER_SIZE  = 64,
    COMPLETION_DESCRIPTOR_SIZE  = 2 * PPC_CACHE_ALIGNMENT
  };

  struct Completion {
    unsigned parameter: 24;
    unsigned wasBlockError: 1;
    unsigned reason: 6;
    unsigned wasError: 1;
    uint32_t transferCount;
  };

  struct Transaction {
    uint32_t    lengthParameter;
    void*       const headerPtr;
    void*       const payloadPtr;
    volatile Completion* const completionPtr;
    Transaction(void* hPtr, void* pPtr, volatile Completion* cPtr);
  };


private:
  Transaction         m_transaction               __attribute__((aligned(TRANSACTION_DESCRIPTOR_ALIGNMENT)));
  Buffer*             m_next;
  Buffer*             m_prev;
  union {
    volatile Completion m_completion              __attribute__((aligned(DMA_ALIGNMENT)));
    uint8_t           m_pad0[COMPLETION_DESCRIPTOR_SIZE];
  } m_paddedComp;
  mutable uint8_t     m_header[MAX_HEADER_SIZE]   __attribute__((aligned(DMA_ALIGNMENT)));
  mutable uint8_t     m_payload                   __attribute__((aligned(DMA_ALIGNMENT)));

public:
  static Buffer*      transactionToBuffer(Transaction*);
  static Transaction* bufferToTransaction(Buffer*);
  static Buffer*      tdrToBuffer(unsigned tdr);
  static unsigned     bufferToTdr(Buffer*);
  static Buffer*      payloadToBuffer(void*);

public:
  Buffer();
  Buffer*                    next()            const;
  Buffer*                    prev()            const;
  const volatile Completion* completion()      const;
  void*                      header()          const;
  void*                      payload()         const;
  void                       next(Buffer* p);
  void                       prev(Buffer* p);
  unsigned                   lengthParameter() const;
  void                       lengthParameter(unsigned);
};
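Because every part of a Buffer sits at a fixed offset, a function such as payloadToBuffer() can recover the owning object from a payload address with plain pointer arithmetic. The sketch below shows the idea on a cut-down, hypothetical MiniBuffer; the real class has more members and stricter alignment, and makeInBlock and payloadOf are illustrative helpers only.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdint.h>

// Simplified stand-in for class Buffer (hypothetical layout).
struct MiniBuffer {
  uint32_t lengthParameter;
  uint8_t  header[64];
  uint8_t  payload;   // Overlaps the first byte of the payload area.
};

// Construct a MiniBuffer at the start of a raw block that is large
// enough for the fixed part plus the payload.
inline MiniBuffer* makeInBlock(void* block) { return new (block) MiniBuffer(); }

// Address of the payload area belonging to a buffer.
inline void* payloadOf(MiniBuffer* b) { return &b->payload; }

// Step back from the payload address by the fixed, compile-time offset.
inline MiniBuffer* payloadToBuffer(void* payload) {
  assert(payload != 0);
  uint8_t* bytes = static_cast<uint8_t*>(payload);
  return reinterpret_cast<MiniBuffer*>(bytes - offsetof(MiniBuffer, payload));
}
```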

Platform-specific information (PlatformInfo.hh and PlatformPluginInfo.hh)

The single PlatformInfo instance is basically an array of PlatformPluginInfo instances. The array holds information that has no place in the PortList instance and is searched in a similar manner. The factory classes for Gen I hardware will fill in the information inside their makePort() member functions. Later on such classes as PgpSetup will retrieve it.

The constructor returns a reference to the single instance. addPlugin() adds a new plugin to the array, fills in the port type and index within type, returning a pointer to the new entry. The lookup() member functions behave much like their counterparts in class PortList.

Code Block
none
none

class PlatformInfo {
public:
  PlatformInfo() {}
  const char*         platform    ()                                         const {return "ppc405-rtems";}
  PlatformPluginInfo* addPlugin   (ppi::basic::PortType, unsigned typeIndex);
  PlatformPluginInfo* lookupPlugin(ppi::basic::PortType, unsigned typeIndex);
  PlatformPluginInfo* lookupPlugin(unsigned index);
};

Each entry in the array is a simple struct containing the required information such as header sizes, payload sizes and PIC block assignments for the given port. The limit on import buffers is understood to apply to the FLB assigned to the port rather than the port itself. This is important when multiple ports share a common FLB as is done with PGP.

Code Block
none
none

struct PlatformPluginInfo {
  basic::PortType type;
  unsigned index;
  unsigned typeIndex;
  unsigned importHeaderSize;
  unsigned maxImportPayloadSize;
  unsigned maxImportBuffers;
  unsigned exportHeaderSize;
  unsigned maxExportPayloadSize;
  unsigned maxExportBuffers;
  unsigned pibNum;
  unsigned pebNum;
  unsigned flbNum;
  unsigned ecbNum;
};
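A minimal sketch of the add-then-look-up behaviour follows. It uses a cut-down Entry in place of PlatformPluginInfo and a stand-in PortType enum; the capacity of eight entries and the convention of returning 0 when nothing matches are assumptions made for the example.

```cpp
#include <cassert>

enum PortType {ETHERNET, PGP};   // Stand-in for ppi::basic::PortType.

struct Entry {                   // Cut-down PlatformPluginInfo.
  PortType type;
  unsigned index;                // Position in the whole array.
  unsigned typeIndex;            // Position among entries of the same type.
};

class MiniPlatformInfo {
public:
  MiniPlatformInfo() : m_count(0) {}

  // Append an entry, filling in type and indices; return 0 when full.
  Entry* addPlugin(PortType type, unsigned typeIndex) {
    if (m_count == MAX_PLUGINS) return 0;
    Entry& e = m_entries[m_count];
    e.type = type;
    e.index = m_count;
    e.typeIndex = typeIndex;
    ++m_count;
    return &e;
  }

  // Linear search by type and index within type; 0 if not found.
  Entry* lookupPlugin(PortType type, unsigned typeIndex) {
    for (unsigned i = 0; i < m_count; ++i)
      if (m_entries[i].type == type && m_entries[i].typeIndex == typeIndex)
        return &m_entries[i];
    return 0;
  }

  // Direct access by position in the array; 0 if out of range.
  Entry* lookupPlugin(unsigned index) {
    return index < m_count ? &m_entries[index] : 0;
  }

private:
  static const unsigned MAX_PLUGINS = 8;
  Entry    m_entries[MAX_PLUGINS];
  unsigned m_count;
};
```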

Ethernet (TBD)

PGP

Petacache

Port factory class (PetacachePgpFactory.hh)

An instance of this class is what the PGP plugin module will use to create instances of PgpPort. The first makePort() member function is non-functional and will throw std::runtime_error if called. The report() member function will log some messages describing the platform-dependent features of the PGP implementation.

Code Block
none
none

class PetacachePgpFactory: public basic::PortFactory {
public:

  PetacachePgpFactory();

  virtual ~PetacachePgpFactory();

  virtual PgpPort* makePort(unsigned index, unsigned typeIndex, const basic::PluginConfig&);

  virtual PgpPort* makePort(
    unsigned index,
    unsigned typeIndex,
    const basic::PluginConfig& plugin,
    const basic::ConduitConfig& conduit);

  virtual void report() const;

};

Gen I PGP plugins all share the same ECB and FLB but each is assigned its own PEB and PIB. The index you pass to the accessor function is the index within the plugin type PGP, the same number you would obtain from Port::typeIndex().

Gen I initialization of ConfigSpace

Configuration container zero in the configuration flash contains tables of information about the hardware and firmware; this information can't be gotten directly from the hardware and firmware.

Not knowing the actual layout of the RCE's circuit board(s) I've arbitrarily assigned conduit 0 to the 10 Gb ethernet and conduits 1, 2, 3 and (for petacache) 4 to plugin type PGP. The four MGTs assigned to the ethernet I assign to pins 0-3 while the ones for PGP I assign to pins 4, 5, 6 and (peta) 7.

The container contents consist of eight instances of PluginConfig followed immediately by eight instances of ConduitConfig, except that the "index" members are not recorded. For a Petacache RCE board (PGP only) the tables look like this (substitute 0xffffffff for "EMPTY"):

Type     Version   Pins
PGP      1         0x00000000 00000010
PGP      1         0x00000000 00000020
PGP      1         0x00000000 00000040
PGP      1         0x00000000 00000080
EMPTY    0         0x00000000 00000000
EMPTY    0         0x00000000 00000000
EMPTY    0         0x00000000 00000000
EMPTY    0         0x00000000 00000000

Type     Version   Pins
CONDUIT  1         0x00000000 00000010
CONDUIT  1         0x00000000 00000020
CONDUIT  1         0x00000000 00000040
CONDUIT  1         0x00000000 00000080
EMPTY    0         0x00000000 00000000
EMPTY    0         0x00000000 00000000
EMPTY    0         0x00000000 00000000
EMPTY    0         0x00000000 00000000

The EMPTY value must be one that is not valid for conversion to either PortType or ConduitType; 0xffffffff will do. Replace "PGP" and "CONDUIT" with the numerical values of the corresponding enumerators.
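The Pins values in the tables are consistent with a 64-bit mask in which bit n stands for pin n, so pins 4 through 7 give 0x10, 0x20, 0x40 and 0x80. Assuming that reading is right, the masks could be computed as in the sketch below; pinMask and pinRangeMask are hypothetical helper names.

```cpp
#include <cassert>
#include <stdint.h>

// 64-bit mask with the single bit for the given pin number set.
inline uint64_t pinMask(unsigned pin) {
  assert(pin < 64);
  return uint64_t(1) << pin;
}

// Mask for an inclusive range of pins, e.g. pins 0-3 for the ethernet MGTs.
inline uint64_t pinRangeMask(unsigned first, unsigned last) {
  uint64_t mask = 0;
  for (unsigned pin = first; pin <= last; ++pin) mask |= pinMask(pin);
  return mask;
}
```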

Gen I storage of plugin software modules

Only a few different types of protocol plugins are found on Gen I systems so each type is assigned to a fixed Configuration container:

Plugin type        Container name
ETHERNET           1
PGP                2
Future expansion   3-9

Later a container may be used for CONFIG_FLASH, if we ever manage to make a usable wrapper for the FCI package that makes it look like another plugin.

Use case: Gen I booting

  1. Boot code:
    1. Loads and starts the system core.
  2. System core
    1. Initializes the CPU.
    2. Initializes RTEMS.
    3. Initializes any extra C++ support.
    4. Sets up the MMU's TLB and enables the MMU.
    5. Creates the default instance of the dynamic linker.
    6. Reads ConfigSpace.
      1. The first read triggers the reading of Configuration container zero.
      2. Builds the Port and PortFactory lists, reading, linking and running plugin software modules as required.
    7. Performs other initialization, e.g., ethernet, BSD stack.
    8. Loads and links the application code using the default dynamic linker.
    9. Calls the application entry point.

Source code organization

All the classes, enums and other declarations will appear within the top-level namespace ppi and in namespaces nested within it.

The Channel Types table (Virtex-4)

Code Blocknonenone

struct ChannelType {
    uint32_t  key;
    uint32_t  officialNumber;
    StringOff officialName;
    StringOff description;
};

The Factories table (Virtex-4,5,6)

In the Factories table a container name of 0xffffffff marks pre-loaded factory code; otherwise the name is used to find the required container as specified in the RCE document.

Code Blocknonenone

struct Factory {
    uint32_t  key;
    uint32_t  channelTypeKey;
    uint32_t  containerName;
    StringOff description;
};
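Startup code walking the Factories table has to separate pre-loaded entries from those whose code must be fetched from a container. A minimal sketch of that test follows. StringOff is not defined in this section, so it is assumed here to be a 32-bit offset, and the helper names isPreloaded and countLoadable are hypothetical.

```cpp
#include <cassert>
#include <stdint.h>

typedef uint32_t StringOff;   // Assumption: an offset into a string area.

struct Factory {
    uint32_t  key;
    uint32_t  channelTypeKey;
    uint32_t  containerName;
    StringOff description;
};

// Container name that marks factory code already present in the core.
const uint32_t PRELOADED = 0xffffffffu;

inline bool isPreloaded(const Factory& f) { return f.containerName == PRELOADED; }

// Count the entries whose code must be loaded from a container.
inline unsigned countLoadable(const Factory* table, unsigned n) {
    assert(table != 0 || n == 0);
    unsigned count = 0;
    for (unsigned i = 0; i < n; ++i)
        if (!isPreloaded(table[i])) ++count;
    return count;
}
```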

The Data Paths table (Virtex-4)

The lanes (MGTs) allocated to a channel will all feed a particular output (pipe) on the backplane. Those channels that have no lanes will not appear in this table.

Code Blocknonenone

struct DataPath {
    uint32_t  key;
    uint32_t  channelKey;
    uint32_t  pipeNum;    // Where the channel's signals "come out."
};

The Buffers table (Virtex-4,5,6)

Each channel comes with a recommendation for the number and size (in bytes) of buffers to be allocated in non-cached memory. Those recommendations are in this table. Note that these are only recommendations; the system initialization procedure needs to consider all the recommendations together along with the amount of memory actually available.

Code Blocknonenone

struct Buffer {
    uint32_t key;
    uint32_t channelKey;
    uint32_t bufferCount;
    uint32_t bufferSize;
};
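Weighing the recommendations against available memory might look like the sketch below. The scaling policy (shrink every count in proportion, but keep at least one buffer per channel) is only one possible choice and is an assumption, as are the helper names totalRecommended and fitToMemory.

```cpp
#include <cassert>
#include <stdint.h>

struct Buffer {
    uint32_t key;
    uint32_t channelKey;
    uint32_t bufferCount;
    uint32_t bufferSize;
};

// Total bytes needed if every recommendation is honoured as given.
inline uint64_t totalRecommended(const Buffer* table, unsigned n) {
    assert(table != 0 || n == 0);
    uint64_t total = 0;
    for (unsigned i = 0; i < n; ++i)
        total += uint64_t(table[i].bufferCount) * table[i].bufferSize;
    return total;
}

// Assumed policy: when memory is short, scale every count by
// available/total, keeping at least one buffer per channel. The floor
// of one means the result can still exceed a very small budget.
inline void fitToMemory(Buffer* table, unsigned n, uint64_t available) {
    const uint64_t total = totalRecommended(table, n);
    if (total <= available) return;
    for (unsigned i = 0; i < n; ++i) {
        const uint64_t scaled = uint64_t(table[i].bufferCount) * available / total;
        table[i].bufferCount = scaled ? uint32_t(scaled) : 1;
    }
}
```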

Example configuration tables

Here's what the tables would look like for a Virtex-4 RCE that defines nothing but an ethernet LAN and configuration flash, which are standard for all RCEs. We assume that the ethernet uses lanes 0-3 and feeds pipe zero.

The Channels table:

Key   Type key   Lanes   Ports   Firmware version   Description
0     0          0xf     65536   1                  "LAN"
1     1          0       1       1                  "Configuration flash"
2     2          0       1       1                  "Catch-all channel"

Ethernet allows 65536 ports so as not to restrict the port space of UDP or TCP.

The Channel Types table:

Key   Official no.   Official name   Description
0     0              "eth"           "ethernet"
1     1              "config"        "Configuration flash"
2     2              "catch-all"     "Catch-all channel"

The Factories table:

Key   Type key   Container name   Description
0     0          <name1>          "Virtex-4 ethernet"
1     1          0xffffffff       "Virtex-4 configuration flash"
2     2          0xffffffff       "Virtex-4 catch-all channel"

The Data Paths table:

Key   Channel key   Pipe no.
0     0             0

The Buffers table:

Key   Channel key   No. of buffers   Buffer size
0     0             32               1518
1     1             16               2048

Here we assume that the firmware delivers all ethernet framing bytes as well as the payload and that jumbo frames are not allowed. We assume that the firmware delivers configuration flash data in units of pages and that a page is 2K bytes long.

How the tables appear in RAM

We copy the tables without alteration and set the members of the following structure:

Code Blocknonenone

struct Tables {
    unsigned numChannels;
    Channel *channel;      // Pointer to first element of table.
    unsigned numChannelTypes;
    ChannelType *channelType;
    unsigned numFactories;
    Factory *factory;
    unsigned numDataPaths;
    DataPath *dataPath;
    unsigned numBuffers;
    Buffer *buffer;
};

Use case: System startup for Virtex-4

  1. Boot code
    1. Loads and starts the system core.
  2. System core
    1. Initializes the CPU.
    2. Initializes RTEMS.
    3. Initializes any extra C++ support.
    4. Sets up the MMU's TLB and enables the MMU.
    5. Registers the factories for configuration flash and catch-all.
    6. Creates the configuration flash channel.
    7. Allocates uncached RAM for config. flash pages.
    8. Enables the configuration flash channel.
    9. Copies the Channels table and related info from configuration flash into RAM.
    10. Creates the default instance of the dynamic linker.
    11. Loads and links the factory code marked as loadable in the Factories table.
    12. Calls all factory code entry points so that all factories are registered.
    13. Walks the Channels table and creates all Channel objects.
    14. Walks the Buffers table and allocates uncached memory for I/O buffers.
    15. Enables all Channel objects not already enabled.
    16. Loads and links the application code using the default dynamic linker.
    17. Initializes the network.
      1. Initializes each ethernet channel.
      2. Initializes IP, UDP, TCP and BSD sockets.
      3. Gets a DHCP lease if required.
    18. Calls the application entry point.
  3. Factory code entry point.
    1. Registers with the core two functions that create the right kind of Channel and Factory objects.

Use case: Frame import

Prior to this the code wishing to use the port has associated a consumer task with it. That task is blocked or idling waiting for new frames.

  1. Plug-in
    1. Writes the frame via DMA.
    2. Causes a frame-arrival interrupt.
  2. ISR
    1. Reads the cause-of-interrupt register and passes its value to the I/O dispatcher task.
  3. I/O dispatcher task
    1. Finds the Channel objects based on cause-of-interrupt.
    2. Channel object
      1. Gets the descriptor for the frame from the right firmware queue.
      2. Builds a Frame object of the right kind using the descriptor.
    3. Frame object
      1. Examines the frame data to determine the right port.
    4. Port object
      1. Enqueues the frame.
      2. Wakes up the consumer task if necessary.
  4. Consumer task
    1. Consumes at least one of the enqueued frames.
    2. Yields or blocks.

Running the Channel, Frame and Port object code in a dispatcher task rather than the ISR keeps the latter simple; it doesn't have to know about system data structures, just hardware. Simplicity should translate to speed of response. The dispatcher task can be normal C++ code that runs with the MMU on, which an ISR by default doesn't.

Sub-case: Full cooperative multitasking

We assign the same priority to the I/O dispatcher task and all the consumer tasks. All of these tasks remain in the ready queue; each puts itself at the back of the ready queue (yields) when it finds that its input queue is empty or after it has completed a certain amount of work. The consumer tasks and the I/O dispatcher task don't need to synchronize their communication because no one of them can preempt another.

The communication between the ISR and the I/O dispatcher task needs some synchronization since the ISR can preempt the dispatcher task at any time. We need not resort to semaphores or other locks, though; there are relatively simple lock-free algorithms we can use. The dispatcher task, once it comes to the front of the ready queue, loops until it manages to read its input queue without detecting interference from the ISR, then either yields immediately if the queue was empty or after performing one or more dispatches. As an alternative the dispatcher can disable interrupts for the short time it takes to check its queue.
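One lock-free option for the path from the ISR to the dispatcher is a single-producer, single-consumer ring buffer. This is a different technique from the retry loop described above, offered only as an example of a simple lock-free algorithm. The sketch assumes a toolchain that provides C++11 std::atomic; the class name CauseQueue and its capacity are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <stdint.h>

// Single-producer (ISR), single-consumer (dispatcher) ring of
// cause-of-interrupt words.
class CauseQueue {
public:
    CauseQueue() : m_head(0), m_tail(0) {}

    // Called from the ISR only. Returns false if the ring is full.
    bool push(uint32_t cause) {
        const unsigned tail = m_tail.load(std::memory_order_relaxed);
        const unsigned next = (tail + 1) % SIZE;
        if (next == m_head.load(std::memory_order_acquire)) return false;
        m_slots[tail] = cause;
        m_tail.store(next, std::memory_order_release);
        return true;
    }

    // Called from the dispatcher task only. Returns false if the ring
    // is empty, in which case the dispatcher would yield.
    bool pop(uint32_t& cause) {
        const unsigned head = m_head.load(std::memory_order_relaxed);
        if (head == m_tail.load(std::memory_order_acquire)) return false;
        cause = m_slots[head];
        m_head.store((head + 1) % SIZE, std::memory_order_release);
        return true;
    }

private:
    static const unsigned SIZE = 8;   // Holds at most SIZE-1 entries.
    uint32_t              m_slots[SIZE];
    std::atomic<unsigned> m_head;     // Next slot to read; written by the consumer.
    std::atomic<unsigned> m_tail;     // Next slot to write; written by the producer.
};
```

Each index is written by only one side, so neither the ISR nor the dispatcher ever has to retry or disable interrupts; the cost is that one slot is always left unused to distinguish full from empty.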

Sub-case: Preemptive multitasking

If the I/O dispatcher is given a higher priority than consumer threads then it must synchronize its communication with them. It also won't just be able to yield when it has nothing to do since it will go onto what amounts to a different ready queue from the consumer tasks, one that is examined first. The dispatcher task would always run, starving the consumers. The dispatcher task would have to actually block itself and the ISR would have to unblock it.

If the consumer tasks don't all have the same priority then they too will have to block themselves and be unblocked by the dispatcher task.

If we use time-slicing then we don't have to be so careful in deciding when each task should yield but then all inter-task communication will require synchronization. If priority assignment is non-uniform then we need to use explicit blocking and unblocking instead of yielding.

Use case: Frame export

Low-level API

Official RCE channel type numbers and names

These are given in a header file made available to both core and application code. The numbers are members of an enumeration and the names are given in a static array.

Code Blocknonenone

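namespace RCE { namespace channel {

    // Official channel type numbers.
    enum Numbers {
        ETHERNET,
        CONFIG_FLASH,
        CATCH_ALL
        // ...
    };

    // Official short names, indexed by the numbers above.
    static const char* const names[] = {
        "eth",
        "config",
        "catch-all"
        // ...
    };
}}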

Channel and Frame factories

These are the abstract base classes.

Code Blocknonenone

class ChannelFactory {
public:
    ChannelFactory();
    virtual Channel* create(
        unsigned ident,   // Assigned by init. code.
        unsigned channelsTableKey,
        RCE::config::Tables&) = 0;
    virtual ~ChannelFactory();
};

Channel type registry

Code Blocknonenone

class ChannelTypeRegistry {
public:

    ChannelTypeRegistry();

    // Register a Channel factory using its official RCE type number.
    // (Named registerFactory because "register" is a reserved word in C++.)
    void registerFactory(
        unsigned officialNumber,
        ChannelFactory &
    );

    ChannelFactory& channelFactory(unsigned officialNumber);

    unsigned officialNumber(const char *officialName);
};

An instance of this class will be exported by the core code using some design pattern such as Singleton or functional equivalent.
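A registry of this kind could be backed by an ordinary map, as in the sketch below. The class MiniRegistry, the stub ChannelFactory, the three-entry names array and the choice to throw std::out_of_range on a failed lookup are all assumptions made for illustration. The member is called registerFactory because register is a C++ keyword and cannot be used as a function name.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <stdexcept>

class ChannelFactory {          // Stub standing in for the abstract base class.
public:
    virtual ~ChannelFactory() {}
};

// Official short names, indexed by official type number.
static const char* const names[] = {"eth", "config", "catch-all"};
static const unsigned numNames = sizeof names / sizeof names[0];

class MiniRegistry {
public:
    // Associate a factory with an official type number.
    void registerFactory(unsigned officialNumber, ChannelFactory& f) {
        m_factories[officialNumber] = &f;
    }

    // Find the factory for a type number; throw if none was registered.
    ChannelFactory& channelFactory(unsigned officialNumber) const {
        std::map<unsigned, ChannelFactory*>::const_iterator it =
            m_factories.find(officialNumber);
        if (it == m_factories.end()) throw std::out_of_range("no factory registered");
        return *it->second;
    }

    // Translate an official short name to its type number.
    unsigned officialNumber(const char* officialName) const {
        for (unsigned i = 0; i < numNames; ++i)
            if (std::strcmp(names[i], officialName) == 0) return i;
        throw std::out_of_range("unknown channel type name");
    }

private:
    std::map<unsigned, ChannelFactory*> m_factories;
};
```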

Class Frame

  • Instance responsibilities.
  1. Ownership of a frame buffer. Once created a Frame instance must be kept alive as long as the frame buffer is in use by the client code. The destructor of the Frame subclass will return buffers to the I/O pool. For Virtex-4 each Channel instance has its own I/O pool; for Virtex-5,6 the pool is global.
  2. Furnish on request the locations and sizes of the following sections of the frame buffer:
    1. Links.
    2. Descriptor.
    3. Status.
    4. Header.
    5. Payload.
  3. Calculate a channel port number from the header contents.

Frame instances are created by Channel instances.

Channel

Code Blocknonenone

class Channel {
  // Class responsibilities
  // ----------------------
  // Maintain the following mappings:
  // 1) PIC block number to Channel (1-to-1).
  // 2) Global Channel ID to Channel (1-to-1).
  // 3) ChannelClass to Channel (1-to-many).
  //
  // Instance responsibilities
  // -------------------------
  // Create and destroy Ports. Allocate and free port numbers. Maintain
  // the following mappings:
  // 1) port number to Port (1-to-1).
  // Give out the minimum frame buffer size and the recommended number of
  // frames for the channel.
public:
  // Install a new channel of the given class and assign it the next available
  // ID number. Associated with the class are two DCR addresses, one for
  // imported frames and one for exported frames. Presumably these are the
  // DCRs used to dequeue and enqueue frame descriptors, respectively.
  // Using this constructor automatically makes the object findable
  // by any of the static member functions that perform searches.
  Channel(const ChannelClass &ctype,
          unsigned importDcr,
          unsigned exportDcr);

  // Return the ID assigned by the constructor.
  unsigned id() const;

  // Return the DCR address associated with importing frames.
  unsigned importDcr() const;

  // Return the DCR address associated with exporting frames.
  unsigned exportDcr() const;

  // Return the class of the channel.
  const ChannelClass &channelClass() const;

  // Create a new Port object for this Channel, assigning its
  // global and port nos.
  void createPort();

  // Destroy a Port. It is an error to give a predefined Port or
  // a Port not belonging to this Channel (throws std::logic_error).
  void destroyPort(Port *);

  // Return the Port associated with the port number, or return 0.
  Port *findPort(unsigned) const;

  // Examine the Frame header to see which of this Channel's Ports
  // the Frame should belong to. If such a Port is found then
  // enqueue the frame on it and return true. If no matching Port
  // can be found then return false.
  virtual bool enqueueFrame(Frame*) const;

  // The minimum Frame buffer size for this channel.
  virtual unsigned frameSize() const;

  // The recommended number of Frame buffers to allocate for this channel.
  virtual unsigned numberOfFrames() const;

  // Return the Channel which has the given ID, or return 0.
  static Channel* findChannel(unsigned channelId);

  // Return the m'th Channel that belongs to the given class,
  // assuming that Channels are tested in ascending order by ID
  // numbers. The first matching Channel corresponds to m=0, etc. Return 0
  // if no such channel exists.
  static Channel* findChannel(const ChannelClass &ctype,
                              unsigned m);

  // Return the Channel with the given DCR address, or return 0.
  static Channel* findChannelByDcrAddress(unsigned);

private:

  // Does nothing because Channels last until the next system shutdown.
  virtual ~Channel();

};


class Port {
  // Class responsibilities
  // ----------------------
  // Maintain a mapping of Port IDs to Ports (1-to-1).
  //
  // Instance responsibilities
  // -------------------------
  // Maintain a queue of Frames. Manage a consumer task, letting it
  // put itself to sleep if the queue is empty and waking it up if
  // needed when new Frames are queued.
public:

  // Return the port number local to the Channel owning this Port.
  unsigned localPortNum() const;

  // Return the Channel that owns this Port.
  Channel *channel() const;

  // Put Frames on the queue maintained by this Port. Wake up
  // any consumer task if needed. The argument is the first
  // member of a list of Frames to enqueue.
  void enqueueFrames(Frame*);

  // Return a pointer to a Frame which is the first in a list
  // of Frames removed from the queue. If no Frames are available
  // then set the consumer task ID and block until at least one
  // Frame is ready.
  // TBD: Needs a way to indicate that the Port has been
  // destroyed (a special Frame instance? Frame flags?)
  Frame *dequeueFrames();

  // Destroy this Port by calling upon the Channel that owns it.
  void destroy();

  // Return the Port with the given global ID, or return 0. Static
  // because the lookup is a class responsibility, not tied to one Port.
  static Port *findPort(unsigned);

private:
  friend class Channel;

  // Create a Port for the given Channel and local port number. The next available
  // global Port ID number is automatically assigned.
  Port(Channel *owner, unsigned localPortNum);

  ~Port();

};
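The list-based enqueueFrames() and dequeueFrames() operations can be sketched with a singly linked list of frames. The Frame struct below (with a next pointer and an id used only for testing) and the class name FrameQueue are hypothetical, and the blocking and wake-up of the consumer task are left out.

```cpp
#include <cassert>

// Hypothetical minimal Frame: just a link and an identifier.
struct Frame {
    Frame* next;
    int    id;
    explicit Frame(int i) : next(0), id(i) {}
};

class FrameQueue {
public:
    FrameQueue() : m_first(0), m_last(0) {}

    // Append a whole null-terminated list of Frames, given its first member.
    void enqueueFrames(Frame* list) {
        if (!list) return;
        if (m_last) m_last->next = list; else m_first = list;
        while (list->next) list = list->next;
        m_last = list;
    }

    // Detach and return everything queued so far (0 if the queue is empty).
    // A real Port would block here instead of returning 0.
    Frame* dequeueFrames() {
        Frame* list = m_first;
        m_first = m_last = 0;
        return list;
    }

private:
    Frame* m_first;
    Frame* m_last;
};
```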

Frames