Introduction

Modern High Energy Physics and Photon Science experiments now require high-speed serial data links of 10 Gbps and higher for their data acquisition systems. These links must be lightweight, with low protocol overhead and small FPGA resource utilization. Often these links multiplex data from several sources, so the link protocol should also support multiple “Virtual Channels” per physical link. Based on our experience developing the PGP2 protocol, which supported link rates of up to 6 Gbps using 8b/10b encoding, we have now developed the PGP3 protocol to support link rates in excess of 10 Gbps using 64b/66b encoding.

Link Layer

PGP3 uses 64b/66b encoding to achieve DC balance of the serial data stream. Each 64-bit word is scrambled with a source synchronous scrambler with polynomial G(x) = x^{58} + x^{29} + 1. Two header bits are then appended to each word: 0b01 to mark regular data, or 0b10 to mark control characters (K-Codes). Because the two header bits always differ, there is a transition between 0 and 1 at least once every 66 bits. The header is also used for word alignment. These 66-bit words can then be serialized and deserialized using the high-speed transceivers found in modern FPGAs. The protocol does not specify any link rate; any link speed may be targeted, provided that the link medium and the FPGA on each side can support it.
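The scrambling and header step can be sketched in Python as follows. This is only an illustration under stated assumptions: the scrambler is modeled as a multiplicative (self-synchronizing) scrambler, as is common with 64b/66b, bits are processed least-significant first, and the tap positions are taken from the polynomial as printed above. The function names and the bit ordering are hypothetical and do not come from the PGP3 firmware.

```python
MASK58 = (1 << 58) - 1
TAPS = (29, 58)          # exponents of G(x) as printed in the text (assumption: used as delay taps)

DATA_HEADER = 0b01       # regular data word
KCODE_HEADER = 0b10      # control word (K-Code)


def scramble(word, state):
    """Scramble one 64-bit word, LSB first. Returns (scrambled_word, new_state)."""
    out = 0
    for i in range(64):
        d = (word >> i) & 1
        # state bit k-1 holds the scrambled bit sent k bit-times ago
        s = d ^ ((state >> (TAPS[0] - 1)) & 1) ^ ((state >> (TAPS[1] - 1)) & 1)
        out |= s << i
        state = ((state << 1) | s) & MASK58
    return out, state


def descramble(word, state):
    """Inverse of scramble(); the state is fed with received (scrambled) bits."""
    out = 0
    for i in range(64):
        s = (word >> i) & 1
        d = s ^ ((state >> (TAPS[0] - 1)) & 1) ^ ((state >> (TAPS[1] - 1)) & 1)
        out |= d << i
        state = ((state << 1) | s) & MASK58
    return out, state


def encode(word, is_kcode, state):
    """Return (header, scrambled_payload, new_state); the header itself is not scrambled."""
    payload, state = scramble(word, state)
    return (KCODE_HEADER if is_kcode else DATA_HEADER), payload, state
```

In this model the descrambler state is filled only with received bits, so it matches the scrambler state after 58 bits regardless of its starting value.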

K-Codes

The protocol defines several K-Codes to indicate data framing, flow-control, opcodes, and other metadata. For all K-Codes, the most significant 8 bits of the 64-bit word indicate which code it is. This is known as the Block Type Field (BTF). The lower 56 bits are then specified differently depending on the K-Code.

K-Code name            BTF
IDLE                   0x99
SOF (Start of Frame)   0xAA
EOF (End of Frame)     0x55
SOC (Start of Cell)    0xCC
EOC (End of Cell)      0x33
SKIP                   0x66
USER0                  0x78
USER1                  0x87
USER2                  0x2D
USER3                  0xD2
USER4                  0x1E
USER5                  0xE1
USER6                  0xB4
USER7                  0x4B
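As a small illustration of how a receiver might classify a control word, the sketch below splits a 64-bit K-Code into its BTF and its lower 56 bits using the values from the table above. The names `BTF_NAMES` and `split_kcode` are hypothetical helpers, not part of the PGP3 firmware.

```python
BTF_NAMES = {
    0x99: "IDLE", 0xAA: "SOF", 0x55: "EOF", 0xCC: "SOC", 0x33: "EOC", 0x66: "SKIP",
    0x78: "USER0", 0x87: "USER1", 0x2D: "USER2", 0xD2: "USER3",
    0x1E: "USER4", 0xE1: "USER5", 0xB4: "USER6", 0x4B: "USER7",
}


def split_kcode(word):
    """Return (name, payload56); name is None if the BTF is not a defined K-Code."""
    btf = (word >> 56) & 0xFF            # most significant 8 bits
    payload = word & ((1 << 56) - 1)     # lower 56 bits, meaning depends on the K-Code
    return BTF_NAMES.get(btf), payload
```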

LINKINFO Structure

Flow control is performed on a per-Virtual-Channel basis. Each received Virtual Channel is expected to be separately buffered in external logic, with PAUSE and OVERFLOW signals from the buffer fed back to the PGP3 block. The PAUSE signal indicates that the buffer has less than one Cell (128 words) of space remaining. The receive buffer status of all Virtual Channels is grouped into a 40-bit LINKINFO structure, which is included in each IDLE, SOF and SOC code that is transmitted. The maximum cell size of 128 words guarantees that any change in buffer fill status will be transmitted back upstream within at most 128 word-clock cycles.

Bit(s)   Name
0-15     VC 0-15 PAUSE
16-31    VC 0-15 OVERFLOW
32       RXREADY
33-35    PGP Version (Always 0x3)
36-39    Reserved (zeros)
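The LINKINFO layout can be expressed as a pack/unpack pair. The sketch below follows the bit positions in the table above. The `LinkInfo` class and its field names are hypothetical, and it assumes that bit n of the PAUSE and OVERFLOW fields corresponds to Virtual Channel n.

```python
from dataclasses import dataclass


@dataclass
class LinkInfo:
    pause: int = 0          # 16-bit mask, assumed one bit per Virtual Channel
    overflow: int = 0       # 16-bit mask, assumed one bit per Virtual Channel
    rx_ready: bool = False
    version: int = 0x3      # always 0x3 for PGP3

    def pack(self):
        """Return the 40-bit LINKINFO value (reserved bits 36-39 left at zero)."""
        return ((self.pause & 0xFFFF)
                | ((self.overflow & 0xFFFF) << 16)
                | (int(self.rx_ready) << 32)
                | ((self.version & 0x7) << 33))

    @classmethod
    def unpack(cls, value):
        return cls(pause=value & 0xFFFF,
                   overflow=(value >> 16) & 0xFFFF,
                   rx_ready=bool((value >> 32) & 1),
                   version=(value >> 33) & 0x7)
```

For example, `LinkInfo(pause=0b101, rx_ready=True).pack()` sets PAUSE for channels 0 and 2 and marks the receiver as ready.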

K-Code: SKIP

SKIP codes are sent once every 5000 words. They are used to mitigate clock drift between the oscillators on either side of the link. SKIP characters are not written into the elastic buffer in the receive logic, allowing the buffer to avoid overflows when the transmit clock on one side of the link is slightly faster than the receive clock on the other. The lower 56-bit data field is called "RemoteLinkData" and is used to publish status data between the two endpoints, with high latency and no guarantee of delivery. RemoteLinkData is intended for high-level, slowly changing status bits (for example, a board ID number).

Bit(s)   Name
0-55     RemoteLinkData
56-63    BTF = 0x66

K-Code: IDLE

IDLE codes are sent when the transmit logic has nothing else to send, or when flow control indicates that the downstream side is unable to receive data.

Bit(s)   Name
0-39     LINKINFO
40-55    Reserved (zeros)
56-63    BTF = 0x99

K-Code: SOF/SOC

SOF (Start of Frame) or SOC (Start of Cell) codes are sent at the start of data payload transmission.

Bit(s)   Name
0-39     LINKINFO
40-43    Virtual Channel
44-55    Packet number
56-63    BTF: SOF=0xAA or SOC=0xCC
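A minimal sketch of assembling and parsing an SOF/SOC word from the fields listed above is given here. The function names are hypothetical, and LINKINFO is passed in as an already packed 40-bit integer.

```python
BTF_SOF = 0xAA
BTF_SOC = 0xCC


def pack_start(link_info, virtual_channel, packet_number, first_cell):
    """Build a 64-bit SOF word (first cell of a frame) or SOC word (later cells)."""
    btf = BTF_SOF if first_cell else BTF_SOC
    return ((link_info & ((1 << 40) - 1))
            | ((virtual_channel & 0xF) << 40)
            | ((packet_number & 0xFFF) << 44)
            | (btf << 56))


def unpack_start(word):
    """Return (first_cell, link_info, virtual_channel, packet_number)."""
    btf = (word >> 56) & 0xFF
    if btf not in (BTF_SOF, BTF_SOC):
        raise ValueError("not an SOF/SOC K-Code")
    return (btf == BTF_SOF,
            word & ((1 << 40) - 1),
            (word >> 40) & 0xF,
            (word >> 44) & 0xFFF)
```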

K-Code: EOF/EOC

EOF (End of Frame) or EOC (End of Cell) codes are sent at the end of data payload transmission.

Bit(s)   Name
0-7      TLAST USER
8-15     Reserved (zeros)
16-19    Last byte count
20-23    Reserved (zeros)
24-55    32-bit CRC
56-63    BTF: EOF=0x55 or EOC=0x33
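On the receive side, the end-of-cell word could be decoded along the following lines. The `EndCode` tuple and `decode_end` function are hypothetical names. The field positions for TLAST USER, last byte count, CRC and BTF follow the table above.

```python
from typing import NamedTuple


class EndCode(NamedTuple):
    end_of_frame: bool      # True for EOF, False for EOC
    tlast_user: int         # bits 0-7
    last_byte_count: int    # bits 16-19
    crc: int                # bits 24-55


def decode_end(word):
    """Decode a 64-bit EOF/EOC K-Code into its fields."""
    btf = (word >> 56) & 0xFF
    if btf not in (0x55, 0x33):
        raise ValueError("not an EOF/EOC K-Code")
    return EndCode(end_of_frame=(btf == 0x55),
                   tlast_user=word & 0xFF,
                   last_byte_count=(word >> 16) & 0xF,
                   crc=(word >> 24) & 0xFFFFFFFF)
```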

Data Cells

Data frames received on the user AXI-Stream interface are broken into Cells of at most 128 words each. The first cell of a frame is indicated by the SOF (Start of Frame) character. Subsequent cells belonging to the same frame begin with the SOC character to indicate that they are a continuation of frame data. SOF/SOC characters also contain the Virtual Channel number, a sequence field to check whether Cells have been dropped, and the LINKINFO data to indicate flow control status. Cells are terminated with an EOF character to indicate that a cell is the last of a frame, or an EOC character to indicate that more data is expected from the current frame. EOF/EOC characters also contain a 32-bit CRC that is computed over all the data in a cell (excluding the SOF/SOC), with the CRC from the previous cell of the frame used as the starting value for the new CRC calculation. The CRC polynomial is identical to that used for Ethernet and Aurora 64b/66b:

G(x) = x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^{8} + x^{7} + x^{5} + x^{4} + x^{2} + x + 1
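The segmentation into cells and the chained CRC can be illustrated with Python's `zlib.crc32`, which uses the same generator polynomial and accepts a starting value. This is a sketch only. The bit ordering, initial value and final inversion that `zlib` applies are assumptions here, since the text above specifies only the polynomial, and the function name `split_into_cells` is hypothetical.

```python
import zlib

CELL_WORDS = 128                    # maximum cell size in 64-bit words
CELL_BYTES = CELL_WORDS * 8


def split_into_cells(frame):
    """Split a frame (bytes) into cells.

    Returns a list of (is_first, is_last, payload, crc) tuples, where crc is
    seeded with the CRC of the previous cell of the same frame.
    """
    cells = []
    crc = 0
    for offset in range(0, len(frame), CELL_BYTES):
        payload = frame[offset:offset + CELL_BYTES]
        crc = zlib.crc32(payload, crc)          # previous CRC is the starting value
        is_first = offset == 0                  # would be introduced by SOF, else SOC
        is_last = offset + CELL_BYTES >= len(frame)   # would end with EOF, else EOC
        cells.append((is_first, is_last, payload, crc))
    return cells
```

With this chaining, the CRC carried by the final cell covers the whole frame, so a dropped or corrupted cell in the middle of a frame shows up in every later CRC of that frame.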

User Opcodes

The Opcode interface allows 48-bit user opcodes to be transmitted sideband to any Virtual Channel data frames. Opcode transmission takes priority over frame data transmission. Opcodes are contained in a single K-Code, and may therefore be placed in the middle of a cell sequence. Each opcode can also be assigned to one of 8 opcode channels, so that opcodes directed toward different logic units may be multiplexed together. Opcode K-Codes also contain an 8-bit checksum, calculated as the inverted sum of each of the 6 bytes of the opcode being transmitted.
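The opcode checksum can be sketched as below, assuming that "inverted sum" means the bitwise complement of the byte sum truncated to 8 bits. The function names are hypothetical, and the position of the checksum within the K-Code is not shown because it is not given here.

```python
def opcode_checksum(opcode):
    """8-bit checksum: inverted (one's complement) sum of the 6 opcode bytes."""
    data = (opcode & ((1 << 48) - 1)).to_bytes(6, "little")
    return ~sum(data) & 0xFF


def opcode_is_valid(opcode, checksum):
    """Receiver-side check of a 48-bit opcode against its transmitted checksum."""
    return opcode_checksum(opcode) == (checksum & 0xFF)
```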

Startup Sequence

The transmit logic begins by sending at least 1000 IDLE characters. It then continues to send only IDLE and SKIP codes until flow control indicates that the receive logic on the other end has locked on and aligned to the data stream (RXREADY). This ensures that the receive logic will see a “10” sequence exactly every 66 bits for as long as necessary until it can establish proper word alignment. A word alignment state machine in the receive logic allows bits to slip in the deserializer until it sees 128 words in a row that begin with the “10” control sequence, at which point it is locked. The RXREADY signal is then asserted and sent in every LINKINFO message to the other side, allowing the other side to begin transmitting user data.
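The bit-slip alignment procedure can be modeled in software as follows. This is a simplified sketch, not the firmware state machine. It assumes the “10” header arrives with the 1 first, models the scrambled payload as random bits, and slips one bit each time a word does not start with the control header. All names are hypothetical.

```python
import random

WORD_BITS = 66
LOCK_WORDS = 128          # consecutive good headers required for lock
CTRL_HEADER = (1, 0)      # "10" control sequence (assumed bit order)


def make_startup_stream(n_words, offset, seed=1):
    """Bit list: `offset` junk bits, then n_words control words with random payload."""
    rng = random.Random(seed)
    bits = [rng.getrandbits(1) for _ in range(offset)]
    for _ in range(n_words):
        bits.extend(CTRL_HEADER)
        bits.extend(rng.getrandbits(1) for _ in range(64))
    return bits


def align(bits):
    """Return the number of bit slips needed to lock, or None if lock is never reached."""
    slips = 0
    good = 0
    pos = 0
    while pos + 2 <= len(bits):
        if (bits[pos], bits[pos + 1]) == CTRL_HEADER:
            good += 1
            if good == LOCK_WORDS:
                return slips          # locked: RXREADY would be asserted here
            pos += WORD_BITS          # move on to the next word boundary
        else:
            good = 0                  # bad header: restart the count
            slips += 1
            pos += 1                  # slip by one bit
    return None
```

Calling `align(make_startup_stream(600, 17))` models a receiver that starts 17 bits away from the true word boundary.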

Unidirectional Mode

The PGP3 protocol normally relies on a full-duplex link in order for flow control information to be passed from receiver back to sender. Some system architectures however simply cannot support bidirectional link lines, and so PGP3 includes a unidirectional mode. In this mode, the system essentially operates without flow control, with the transmit side assuming that the receive side always has buffer space available for new data. Care must be taken then that the receive side will always process incoming data fast enough to keep up with the maximum data transmission rate.

User Interface

PGP3 supports up to 16 Virtual Channels, each with its own discrete 64-bit wide bidirectional AXI-Stream interface. Through these channels, frame-based data is multiplexed through the link. Virtual Channels are selected for transmission in round-robin priority, so that each channel has equal access to the available link bandwidth. Interleaving is enabled by default, so that once a Cell of 128 words from a frame on one channel has been accepted, the next Cell will be taken from a different channel that is requesting transmission. This setting is highly recommended so that long frames on one Virtual Channel do not starve out all of the other channels. Data frames sent on a Virtual Channel may be any number of bytes. The interface supports the AXI-Stream TKEEP signal on the final transaction of a frame to indicate a number of bytes between 0 and 8.
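The effect of round-robin selection with interleaving can be sketched with a simple scheduler model. It assumes one pending frame per Virtual Channel and that every channel is ready at the start. The actual arbitration logic may differ in detail, and the function name `interleave` is hypothetical.

```python
from collections import deque

CELL_WORDS = 128


def interleave(frame_words_by_vc):
    """Model cell-level round-robin between Virtual Channels.

    frame_words_by_vc maps a channel number to the length (in 64-bit words)
    of the frame pending on that channel. Returns the transmit order as a
    list of (vc, start_code, n_words, end_code) tuples.
    """
    pending = deque((vc, n) for vc, n in sorted(frame_words_by_vc.items()) if n > 0)
    started = set()
    schedule = []
    while pending:
        vc, remaining = pending.popleft()
        n_words = min(remaining, CELL_WORDS)
        remaining -= n_words
        start = "SOC" if vc in started else "SOF"   # SOF only on the first cell of a frame
        started.add(vc)
        end = "EOC" if remaining else "EOF"         # EOF only on the last cell of a frame
        schedule.append((vc, start, n_words, end))
        if remaining:
            pending.append((vc, remaining))         # go to the back of the queue
    return schedule
```

With a 300-word frame on channel 0 and a 100-word frame on channel 1, the short frame on channel 1 is sent after the first cell of channel 0 rather than waiting for the whole long frame.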

Implementation

The PGP3 protocol is implemented in synthesizable VHDL. It is open source, released under a permissive modified BSD license. It is included in the SURF firmware library from SLAC, available on GitHub at github.com/slaclab/surf.

The number of Virtual Channels supported can be configured via VHDL generics, allowing for better resource utilization when fewer channels are needed. Total resource utilization depends on the number of Virtual Channels synthesized and the amount of buffering required per channel. The “core” of the PGP3 protocol, with all 16 channels synthesized, takes up the following resources on Xilinx FPGAs:

Contact

Ben Reese

bareese@slac.stanford.edu
