
Concepts

The low-level interface to an RCE's protocol plug-ins uses abstractions called ports, frames, channels, lanes, pipes and factories.

A port is the RCE end of a communications link similar in concept to a BSD socket. Ports are globally visible but not MT-safe; at a given time at most one task may be waiting for or receiving data from any given port.

Ports deliver data as frames. The exact content of a frame depends on the transmission protocol being used but the port recognizes a broad division into header and payload. One frame corresponds to one message on the I/O medium and is delivered in a single buffer. In other words all ports implement datagram rather than byte-stream protocols. It's up to higher-level software such as a TCP stack to provide any operations that cross frame boundaries. Each port contains a queue of frames which have arrived and not yet been consumed by the application.

A channel is a hardware I/O engine capable of DMA to and from system RAM, i.e., a protocol plug-in. An RCE has at most eight channels. Most channels will make use of one or more of the Multi-Gigabit Transceiver modules (lanes) available in firmware, though some may simply offer access to on-board resources such as DSPs. Each channel has its own "local port" space where each local port is identified by a 16-bit unsigned integer. Each local port represents a different source of incoming data such as a UDP port or a Petacache flash lane. The actual number of ports available on a channel depends on the channel type and may be as low as one.

With one exception there is a one-to-one correspondence between (channel, local port) and ports. The exception is a "catch-all" port, which receives data that doesn't "belong" to any other port in the system. There is a catch-all pseudo-channel available with a limit of one port; creating a port on this channel will create the catch-all port. If no catch-all port exists then an orphan frame is dropped and an error message is placed in the system log.
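The routing rule above can be sketched as a lookup with a fallback. This is only an illustration under stated assumptions: PortStub, PortTable and their member functions are hypothetical names, not part of the RCE API, and a std::map stands in for whatever structure the core really uses.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical stand-in for the real Port class; only identity matters here.
struct PortStub {
    const char *name;
};

// Hypothetical routing table: (channel ID, local port number) -> port.
class PortTable {
public:
    // Register the port that owns the given (channel, local port) pair.
    void add(unsigned channelId, std::uint16_t localPort, PortStub *port) {
        ports_[std::make_pair(channelId, localPort)] = port;
    }

    // Register the single catch-all port (at most one exists).
    void setCatchAll(PortStub *port) { catchAll_ = port; }

    // Return the port that owns the frame, or the catch-all port if no
    // port owns it, or a null pointer if there is no catch-all port either.
    // In the last case the caller would drop the frame and log an error.
    PortStub *route(unsigned channelId, std::uint16_t localPort) const {
        auto it = ports_.find(std::make_pair(channelId, localPort));
        if (it != ports_.end()) return it->second;
        return catchAll_;
    }

private:
    std::map<std::pair<unsigned, std::uint16_t>, PortStub *> ports_;
    PortStub *catchAll_ = nullptr;
};
```

A null result from route() corresponds to the orphan-frame case with no catch-all port.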

All the lanes (if any) of a given channel go to one of the outputs (pipes) of the rear transition module. This mapping is fixed by hardware.

Each kind of channel has associated factory code that can construct instances of the correct class derived from Channel and of the correct class derived from Frame, returning values of type Channel* and Frame*, respectively. Factory code for a given kind of channel is either part of the system core image or located in one container in configuration flash. A channel uber-factory, given the channel type as a parameter, finds and calls the type-specific channel factory. Instances of a channel type in turn call the type-specific frame factory.
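One way to picture the uber-factory is as a registry keyed by channel type ID. The sketch below is an assumption-laden illustration: ChannelStub, EthChannel, UberFactory, registerFactory() and makeEth() are hypothetical names, and the real factories are located through the configuration tables rather than registered by hand.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the Channel base class.
struct ChannelStub {
    virtual ~ChannelStub() {}
    virtual std::string kind() const = 0;
};

// Hypothetical concrete channel type.
struct EthChannel : ChannelStub {
    std::string kind() const override { return "eth"; }
};

// A type-specific factory returns a base-class pointer, as the page says.
typedef ChannelStub *(*ChannelFactory)();

// Hypothetical uber-factory: maps a channel type ID to its factory.
class UberFactory {
public:
    void registerFactory(unsigned typeId, ChannelFactory f) {
        factories_[typeId] = f;
    }

    // Find and call the factory for typeId; return null if none is known.
    ChannelStub *create(unsigned typeId) const {
        auto it = factories_.find(typeId);
        return it == factories_.end() ? nullptr : it->second();
    }

private:
    std::map<unsigned, ChannelFactory> factories_;
};

// Hypothetical type-specific factory for the ethernet channel.
ChannelStub *makeEth() { return new EthChannel; }
```

The caller owns the returned object in this sketch; the page itself notes that real Channels last until the next system shutdown.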

All the information needed to initialize the channels is found in configuration flash (Virtex-4) or by interrogating the hardware (Virtex-5,6).

Virtex-4 channel information in configuration flash

Data container zero in the configuration flash contains tables providing all the information needed to make all channels ready. This includes references to type-specific factory code but not the code itself; other containers will hold that. In addition the tables provide some extra information not actually needed for setup but required to print a summary of what the protocol plug-ins provide.

The tables are called Channels, Channel Types, Factories, Data Paths and Strings. Except for Strings each table is an array of plain-old-data structs with the first field being a key or ID field normally equal to the array index. An ID equal to 0xffffffff signifies the end of the table in which case none of the other fields in the struct have any guaranteed values and should never be read. Thus you'll generate end-of-table sentinels automatically when you erase a flash block before writing tables into it, provided you leave at least one 32-bit word of unwritten space at the end of each table. If you write all the tables in one go then you must provide the end-of-table markers explicitly.
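A reader of these tables only ever needs to test the ID word to find the end. The following sketch shows that idea; countEntries() and its maxEntries safety bound are hypothetical and not part of the format.

```cpp
#include <cstddef>
#include <cstdint>

// Sentinel ID marking the end of a table.
const std::uint32_t END_OF_TABLE = 0xffffffffu;

// Count the entries of a table whose records are wordsPerEntry 32-bit
// words long and begin with an ID field. Only the ID word of the
// terminating record is read; its other fields are never touched.
// maxEntries is a hypothetical safety bound against a missing sentinel.
std::size_t countEntries(const std::uint32_t *table,
                         std::size_t wordsPerEntry,
                         std::size_t maxEntries) {
    std::size_t n = 0;
    while (n < maxEntries && table[n * wordsPerEntry] != END_OF_TABLE) {
        ++n;
    }
    return n;
}
```

For a table written into a freshly erased block, the single unwritten word after the last record is what this loop stops on.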

The Strings table is like an ELF string table; NUL-terminated ASCII strings laid end to end. Certain standard strings such as the short-names of channels are at the front of the table with only one instance of each string present. The other tables refer to a string by giving the offset of its first character in the Strings table.
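Resolving a string reference is then a bounds-checked pointer addition. In this sketch lookupString() and its tableSize parameter are assumptions; the page does not say how the size of the Strings table is recorded.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

typedef std::uint32_t StringOff;  // Offset within the Strings table.

// Return the NUL-terminated string that starts at off, or a null pointer
// if off lies outside the table or no terminator is found before the end
// of the table.
const char *lookupString(const char *strings, std::size_t tableSize,
                         StringOff off) {
    if (off >= tableSize) return nullptr;
    if (std::memchr(strings + off, '\0', tableSize - off) == nullptr) {
        return nullptr;
    }
    return strings + off;
}
```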

General layout of the container contents

The first words of the container are the 32-bit offsets in the container to the starts of the tables in the following order:

Offset of Channels

Offset of Channel Types

Offset of Factories

Offset of Data Paths

Offset of Strings

Each of the offsets should be divisible by four.

After that come the tables themselves. No particular order is required, though since String table entries have variable length it's most convenient to place it last.
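The header layout lends itself to a simple sanity check before any table is read. The sketch below assumes a hypothetical ContainerHeader struct and headerIsValid() function; the bounds checks go beyond what the page requires and are labeled as such.

```cpp
#include <cstddef>
#include <cstdint>

// The five table offsets that open the container, in the order listed.
struct ContainerHeader {
    std::uint32_t channels;
    std::uint32_t channelTypes;
    std::uint32_t factories;
    std::uint32_t dataPaths;
    std::uint32_t strings;
};

// Check that every offset is divisible by four (required by the format).
// Also check that each offset lies past the header and inside the
// container; these two checks are assumptions of this sketch, and
// containerSize is a hypothetical value supplied by the flash reader.
bool headerIsValid(const ContainerHeader &h, std::size_t containerSize) {
    const std::uint32_t offsets[] = {h.channels, h.channelTypes, h.factories,
                                     h.dataPaths, h.strings};
    for (std::uint32_t off : offsets) {
        if (off % 4 != 0) return false;
        if (off < sizeof(ContainerHeader)) return false;
        if (off >= containerSize) return false;
    }
    return true;
}
```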

To make alignment easier we use 32-bit fields wherever possible, even for 16-bit quantities such as port numbers.

typedef uint32_t StringOff; // Offset within the Strings table.

The Channels table

struct Channel {
    uint32_t  id;             // Unique ID.
    uint32_t  typeId;         // Ref. to Channel Types table.
    uint32_t  lanesUsed;      // Bit-mask of lanes used.
    uint32_t  numPorts;       // Size of port-number space.
    StringOff description;
};

Actual ID numbers are assigned when channel objects are created; the id field here exists only to allow other tables to refer to this one.
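As an illustration of the lanesUsed bit-mask, the sketch below expands a mask into lane indices. The function name lanesFromMask() is hypothetical, and the mapping of bit 0 to lane 0 is an assumption; the page does not state the bit order.

```cpp
#include <cstdint>
#include <vector>

// Return the indices of the lanes whose bits are set in lanesUsed.
// Assumes bit 0 corresponds to lane 0 (not stated in the page).
std::vector<unsigned> lanesFromMask(std::uint32_t lanesUsed) {
    std::vector<unsigned> lanes;
    for (unsigned bit = 0; bit < 32; ++bit) {
        if (lanesUsed & (std::uint32_t(1) << bit)) lanes.push_back(bit);
    }
    return lanes;
}
```

Under that assumption a mask of 0xf would name lanes 0 through 3, and a mask of 0 would describe a channel that uses no lanes.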

The Channel Types table

struct ChannelType {
    uint32_t  id;          // Unique type ID.
    uint32_t  factoryId;   // Reference to Factories table.
    StringOff shortName;
    StringOff description;
};

The Factories table

struct Factory {
    uint32_t  id;
    uint32_t  containerName;
    StringOff description;
};

The Data Paths table

struct DataPath {
    uint32_t  id;
    uint32_t  channelId; // Ref. to Channels table.
    uint32_t  pipeId;    // Where the channel's signals "come out."
};
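Since all lanes of a channel go to a single pipe, finding a channel's pipe is a linear scan of the Data Paths table. The sketch below assumes hypothetical names pipeForChannel() and NO_PIPE, and repeats the DataPath layout so that it stands alone.

```cpp
#include <cstdint>

// Same layout as the Data Paths table entries.
struct DataPath {
    std::uint32_t id;
    std::uint32_t channelId;  // Ref. to Channels table.
    std::uint32_t pipeId;     // Where the channel's signals come out.
};

const std::uint32_t END_OF_TABLE = 0xffffffffu;  // Sentinel ID.
const std::uint32_t NO_PIPE = 0xffffffffu;       // Hypothetical "not found".

// Return the pipe ID for the given channel, or NO_PIPE if the table has
// no entry for it. Scanning stops at the end-of-table sentinel, whose
// other fields are never read.
std::uint32_t pipeForChannel(const DataPath *paths, std::uint32_t channelId) {
    for (const DataPath *p = paths; p->id != END_OF_TABLE; ++p) {
        if (p->channelId == channelId) return p->pipeId;
    }
    return NO_PIPE;
}
```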

Example configuration tables

Here's what the tables would look like for a Virtex-4 RCE that defines nothing but the standard facilities that all RCEs have regardless of application:

The Channels table:

ID   Type ID   Lanes   Ports   Description
0    0         0xf     65536   "LAN"
1    1         0       1       "Configuration flash"

The Channel Types table:

ID   Factory ID   Short name   Description
0    0            "eth"        "ethernet"
1    1            "config"     "Configuration flash"

The Factories table:

ID   Container name   Description
0    <name1>          "Virtex-4 10 Gb/s ethernet"
1    <name2>          "Virtex-4 configuration flash"

Use case: System startup

  1. Boot code
    1. Loads and starts the system core.
  2. System core
    1. Initializes the CPU.
    2. Initializes RTEMS.
    3. Initializes any extra C++ support.
    4. Initializes its I/O tables so that only the catch-all and its predefined port are registered.
    5. Creates and registers the Channel and Port objects for the configuration flash.
    6. Turns on the MMU.
    7. Allocates uncached RAM for config. flash frames, based on the minimum buffer size and buffer count values returned by the driver.
    8. Associates producer and consumer tasks with the config. flash port.
    9. Copies the port and channel configuration tables into RAM.
    10. Creates the default instance of the dynamic linker.
    11. Loads each device driver into the default dynamic linker.
    12. Links and binds all drivers (except for the configuration flash driver, which is part of the core).
    13. Calls rce_appmain() for each driver.
    14. Sets aside a region of uncached memory and allocates import buffers within it. The size of the region and the sizes and numbers of buffers are based on the amount of RAM available and the minimum buffer size and count values returned by each Channel object (except config. flash).
    15. Loads the application code and binds it into the default linker.
    16. Initializes the network.
      1. Initializes each ethernet channel.
      2. Initializes IP, UDP, TCP and BSD sockets.
      3. Gets a DHCP lease if required.
    17. Calls the application rce_appmain().
  3. Driver rce_appmain()
    1. Creates an instance of a class derived from Channel, which automatically registers the instance in the core tables. The instance also creates and registers any predefined Ports.
  4. Application rce_appmain()
    1. Creates port consumer and producer tasks and associates them with their Ports.
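The import-buffer sizing in the system core steps above can be sketched as a sum over channels. This is a simplified illustration only: BufferNeeds, roundUp() and regionSize() are hypothetical names, padding each buffer to an alignment boundary is an assumption, and the limit imposed by the amount of RAM available is not modeled.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary of what each Channel object reports to the core.
struct BufferNeeds {
    std::size_t frameSize;       // Minimum frame buffer size in bytes.
    std::size_t numberOfFrames;  // Recommended number of frame buffers.
};

// Round n up to a multiple of align (align must be a power of two).
std::size_t roundUp(std::size_t n, std::size_t align) {
    return (n + align - 1) & ~(align - 1);
}

// Total bytes of uncached RAM needed if every buffer is padded to align.
std::size_t regionSize(const std::vector<BufferNeeds> &channels,
                       std::size_t align) {
    std::size_t total = 0;
    for (const BufferNeeds &c : channels) {
        total += roundUp(c.frameSize, align) * c.numberOfFrames;
    }
    return total;
}
```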

Use case: Frame import

Prior to this the code wishing to use the port has associated a consumer task with it. That task is blocked or idling waiting for new frames.

  1. Plug-in
    1. Writes the frame via DMA.
    2. Causes a frame-arrival interrupt.
  2. ISR
    1. Determines which plug-in caused the interrupt (DCR address).
    2. Passes the DCR address to the I/O dispatcher task.
  3. I/O dispatcher task
    1. Finds the Channel object for the given DCR address.
    2. Channel object
      1. Gets the descriptor for the frame from the right firmware queue.
      2. Determines the port no. for this frame.
      3. Finds the right Port object.
    3. Port object
      1. Enqueues the frame.
      2. Wakes up the consumer task if necessary.
  4. Consumer task
    1. Consumes at least one of the enqueued frames.
    2. Yields or blocks.

Running the Channel and Port object code in a dispatcher task rather than the ISR keeps the latter simple; it doesn't have to know about system data structures, just hardware. Simplicity should translate to speed of response. The dispatcher task can be normal C++ code that runs with the MMU on, which an ISR by default doesn't.

Sub-case: Full cooperative multitasking

We assign the same priority to the I/O dispatcher task and all the consumer tasks. All of these tasks remain in the ready queue; each puts itself at the back of the ready queue (yields) when it finds that its input queue is empty or after it has completed a certain amount of work. The consumer tasks and the I/O dispatcher task don't need to synchronize their communication because none of them can preempt another.

The communication between the ISR and the I/O dispatcher task needs some synchronization since the ISR can preempt the dispatcher task at any time. We need not resort to semaphores or other locks, though; there are relatively simple lock-free algorithms we can use. The dispatcher task, once it comes to the front of the ready queue, loops until it manages to read its input queue without detecting interference from the ISR. It then yields, either immediately if the queue was empty or after performing one or more dispatches. As an alternative the dispatcher can disable interrupts for the short time it takes to check its queue.
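As one illustration of a lock-free hand-off, the sketch below uses a single-producer, single-consumer ring buffer. Note that this is a different technique from the retry-on-interference loop described above: the ISR only ever writes the head index and the dispatcher only ever writes the tail index, so no retry is needed. The class name DcrRing is hypothetical, and the availability of std::atomic on the target toolchain is an assumption.

```cpp
#include <atomic>
#include <cstddef>

// Ring of DCR addresses with one producer (the ISR) and one consumer
// (the I/O dispatcher task). It holds at most N - 1 entries.
template <std::size_t N>
class DcrRing {
public:
    // Called by the producer only. Returns false if the ring is full.
    bool push(unsigned dcr) {
        std::size_t head = head_.load(std::memory_order_relaxed);
        std::size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire)) return false;
        slots_[head] = dcr;
        // Publish the slot only after its contents have been written.
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Called by the consumer only. Returns false if the ring is empty.
    bool pop(unsigned &dcr) {
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;
        dcr = slots_[tail];
        // Release the slot only after its contents have been read.
        tail_.store((tail + 1) % N, std::memory_order_release);
        return true;
    }

private:
    unsigned slots_[N];
    std::atomic<std::size_t> head_{0};  // Next slot to write.
    std::atomic<std::size_t> tail_{0};  // Next slot to read.
};
```

With this design the dispatcher would call pop() until it returns false and then yield, which covers both the empty-queue case and the case of one or more dispatches.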

Sub-case: Preemptive multitasking

If the I/O dispatcher is given a higher priority than consumer threads then it must synchronize its communication with them. It also won't just be able to yield when it has nothing to do since it will go onto what amounts to a different ready queue from the consumer tasks, one that is examined first. The dispatcher task would always run, starving the consumers. The dispatcher task would have to actually block itself and the ISR would have to unblock it.

If the consumer tasks don't all have the same priority then they too will have to block themselves and be unblocked by the dispatcher task.

If we use time-slicing then we don't have to be so careful in deciding when each task should yield but then all inter-task communication will require synchronization. If priority assignment is non-uniform then we need to use explicit blocking and unblocking in this case as well.

Use case: Frame export

Low-level API

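// Forward declarations of the types used in the Channel interface.
class ChannelClass;
class Frame;
class Port;

class Channel {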
  // Class responsibilities
  // ----------------------
  // Maintain the following mappings:
  // 1) DCR register address to Channel (2-to-1).
  // 2) Global Channel ID to Channel (1-to-1).
  // 3) ChannelClass to Channel (1-to-many).
  //
  // Instance responsibilities
  // -------------------------
  // Create and destroy Ports. Allocate and free port numbers. Maintain
  // the following mappings:
  // 1) port number to Port (1-to-1).
  // 2) Frame header information to port number (many-to-1) (device
  //    dependent).
  // Give out the minimum frame buffer size and the recommended number of
  // frames for the channel.
public:
  // Install a new channel of the given class and assign it the next available
  // ID number. Associated with the class are two DCR addresses, one for
  // imported frames and one for exported frames. Presumably these are the
  // DCRs used to dequeue and enqueue frame descriptors, respectively.
  // Using this constructor automatically makes the object findable
  // by any of the static member functions that perform searches.
  Channel(const ChannelClass &ctype,
          unsigned importDcr,
          unsigned exportDcr);

  // Return the ID assigned by the constructor.
  unsigned id() const;

  // Return the DCR address associated with importing frames.
  unsigned importDcr() const;

  // Return the DCR address associated with exporting frames.
  unsigned exportDcr() const;

  // Return the class of the channel.
  const ChannelClass &channelClass() const;

  // Create a new Port object for this Channel, assigning its
  // global ID and local port number.
  void createPort();

  // Destroy a Port. It is an error to give a predefined Port or
  // a Port not belonging to this Channel (throws std::logic_error).
  void destroyPort(Port *);

  // Return the Port associated with the port number, or return 0.
  Port *findPort(unsigned) const;

  // Examine the Frame header to see which of this Channel's Ports
  // the Frame should belong to. If such a Port is found then
  // enqueue the frame on it and return true. If no matching Port
  // can be found then return false.
  virtual bool enqueueFrame(Frame*) const;

  // The minimum Frame buffer size for this channel.
  virtual unsigned frameSize() const;

  // The recommended number of Frame buffers to allocate for this channel.
  virtual unsigned numberOfFrames() const;

  // Return the Channel which has the given ID, or return 0.
  static Channel* findChannel(unsigned channelId);

  // Return the m'th Channel belonging to the given class,
  // assuming that Channels are tested in ascending order by ID
  // numbers. The first matching Channel corresponds to m=0, etc. Return 0
  // if no such channel exists.
  static Channel* findChannel(const ChannelClass &ctype,
                              unsigned m);

  // Return the Channel with the given DCR address, or return 0.
  static Channel* findChannelByDcrAddress(unsigned);

private:

  // Does nothing because Channels last until the next system shutdown.
  virtual ~Channel();

};


class Port {
  // Class responsibilities
  // ----------------------
  // Maintain a mapping of Port IDs to Ports (1-to-1).
  //
  // Instance responsibilities
  // -------------------------
  // Maintain a queue of Frames. Manage a consumer task, letting it
  // put itself to sleep if the queue is empty and waking it up if
  // needed when new Frames are queued.
public:

  // Return the port number local to the Channel owning this Port.
  unsigned localPortNum() const;

  // Return the Channel that owns this Port.
  Channel *channel() const;

  // Put Frames on the queue maintained by this Port. Wake up
  // any consumer task if needed. The argument is the first
  // member of a list of Frames to enqueue.
  void enqueueFrames(Frame*);

  // Return a pointer to a Frame which is the first in a list
  // of Frames removed from the queue. If no Frames are available
  // then set the consumer task ID and block until at least one
  // Frame is ready.
  // TBD: Needs a way to indicate that the Port has been
  // destroyed (a special Frame instance? Frame flags?)
  Frame *dequeueFrames();

  // Destroy this Port by calling upon the Channel that owns it.
  void destroy();

  // Return the Port with the given global ID, or return 0.
  // Static because the ID-to-Port mapping is a class responsibility.
  static Port *findPort(unsigned);

private:
  friend class Channel;

  // Create a Port for the given Channel and local port number. The next
  // available global Port ID number is automatically assigned.
  Port(Channel *owner, unsigned localPortNum);

  ~Port();

};
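The list-based enqueue and dequeue operations of Port can be sketched with an intrusive singly linked list. FrameStub and FrameQueue are hypothetical stand-ins, and the blocking and wake-up behavior of the real Port is deliberately left out.

```cpp
// Hypothetical minimal frame; real Frames also carry header and payload.
struct FrameStub {
    FrameStub *next;  // Next frame in the list, or null.
    int tag;          // Illustrative identifier only.
};

// Hypothetical queue mirroring Port::enqueueFrames() and
// Port::dequeueFrames(): both operate on whole lists of frames.
class FrameQueue {
public:
    FrameQueue() : head_(nullptr), tail_(nullptr) {}

    // Append the list starting at first to the back of the queue.
    void enqueueFrames(FrameStub *first) {
        if (first == nullptr) return;
        if (tail_ != nullptr) {
            tail_->next = first;
        } else {
            head_ = first;
        }
        FrameStub *last = first;
        while (last->next != nullptr) last = last->next;
        tail_ = last;
    }

    // Detach and return everything queued so far, or null if empty.
    FrameStub *dequeueFrames() {
        FrameStub *first = head_;
        head_ = tail_ = nullptr;
        return first;
    }

private:
    FrameStub *head_;
    FrameStub *tail_;
};
```

Handing over the whole list at once lets a consumer process several frames per wake-up, in line with the "at least one of the enqueued frames" step of the import use case.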

Frames
