The setup for BLD testing is as follows:


The crate contains 3 application boards, in slots 3, 4, and 6. All of them are configured with the AmcCarrierTpr firmware. The linuxRT server contains 6 processors and runs three instantiations of the AmcCarrierTprTst IOC (bsssBld branch), one for each application board. This IOC has BSSS/BLD/BSAS/BSA integrated and ready to go.

Calculating the worst-case scenario of BLD traffic from the ATCA crate to the linuxRT server:

At 1 MHz, BLD transmits 28 events in a single packet:

  • Payload: 3936 bytes
  • Header: 42 bytes
  • For 1,000,000 events/second, 1,000,000 / 28 ≈ 35714.28 packets are required each second


BLD bandwidth = 35714.28 packets/s x (3936 + 42) bytes = 135.48 MBytes/s = 1.058 Gbits/s

With three application boards running simultaneously at 1 MHz, this adds up to 3.18 Gbps.
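
The arithmetic above can be reproduced with a short script. The following is a minimal sketch in Python, using only the numbers quoted above; it assumes the MBytes/s and Gbits/s figures use binary (1024-based) prefixes, which is what reproduces the 135.48 MBytes/s and 1.058 Gbits/s values.

# Worst-case BLD traffic from one application board to the linuxRT server.
EVENT_RATE     = 1_000_000     # events per second (1 MHz)
EVENTS_PER_PKT = 28            # BLD packs 28 events per packet at 1 MHz
PAYLOAD        = 3936          # bytes
HEADER         = 42            # bytes

packets_per_s = EVENT_RATE / EVENTS_PER_PKT            # ~35714.28 packets/s
bytes_per_s   = packets_per_s * (PAYLOAD + HEADER)

print(f"packets/s per board: {packets_per_s:.2f}")
print(f"BLD bandwidth per board: {bytes_per_s / 2**20:.2f} MBytes/s "
      f"= {bytes_per_s * 8 / 2**30:.3f} Gbits/s")
print(f"three boards: {3 * bytes_per_s * 8 / 2**30:.2f} Gbits/s")   # ~3.18 Gbps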

Outgoing BLD multicast packets, on the other hand, have a different format with a smaller payload. The packet size is as follows:

  • Payload: 3824 bytes
  • Header: 42 bytes
  • If no destination is chosen, each BLD EDEF generates its own multicast packet (4 multicast packets for every 28 events). This may not be the intended use, but it is the worst case.
    • 35714.28 packets x 4 = 142857.12 multicast packets per second

Multicast bandwidth = 142857.12 packets/s x (3824 + 42) bytes = 526.7 MB/s = 4.114 Gb/s

With three application boards running simultaneously at 1 MHz, we get 12.34 Gbps.

Note that the link between the linuxRT server and the switch is limited to 1 Gbps. To run at the rates above, we would need an upstream link with a bandwidth larger than 12.34 Gbps.
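
The same check for the worst-case multicast traffic and its comparison with the 1 Gbps server link, again as a sketch with the figures listed above:

# Worst-case outgoing BLD multicast traffic (all 4 EDEFs, no destination chosen).
PAYLOAD        = 3824                  # bytes per multicast packet
HEADER         = 42                    # bytes
PKTS_PER_EDEF  = 1_000_000 / 28        # ~35714.28 packets/s per EDEF
N_EDEFS        = 4
N_BOARDS       = 3

mcast_bytes_per_s = N_EDEFS * PKTS_PER_EDEF * (PAYLOAD + HEADER)
per_board_gbps    = mcast_bytes_per_s * 8 / 2**30      # ~4.11 Gbits/s
total_gbps        = N_BOARDS * per_board_gbps          # ~12.34 Gbits/s

print(f"multicast bandwidth per board: {per_board_gbps:.3f} Gbits/s")
print(f"three boards: {total_gbps:.2f} Gbits/s")
print(f"exceeds the 1 Gbps server uplink: {total_gbps > 1.0}")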

Setup

In this setup, all BLD EDEFs were activated at the maximum frequency (1 MHz). BSSS was enabled at a rate of 10 Hz on all data channels. BSAS was verified to be working, and BSA was operated manually.

In these tests we worked with the tools we had available, examining:

  • Server load
  • Ethernet interface counters
  • Sniffer bandwidth estimates
  • Sanity tests of the different functionalities
  • BLD invalid status detection

Server load

Keeping in mind that the BLD multicast packet transmission is failing, the processor consumption of the three IOC applications seems reasonable:

  • IOC sioc-b084-ts03 (slot 3):
    • CPU usage: 68.9%
    • Memory usage: 8.4%
  • IOC sioc-b084-ts04 (slot 4):
    • CPU usage: 68.2%
    • Memory usage: 8.4%
  • IOC sioc-b084-ts05 (slot 6):
    • CPU usage: 43.0%
    • Memory usage: 8.4%

Total server memory usage:

  • Before running the IOCs: 10 GB
  • After running the IOCs: 42 GB

Ethernet interface counters

Examination of the Ethernet interface counters showed no packet drops.
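
For reference, this kind of check can be scripted; below is a minimal sketch that reads the Linux interface statistics from /proc/net/dev. The interface name "eth0" is a placeholder, not necessarily the interface used on the linuxRT server.

# Report RX packet/error/drop counters for one interface from /proc/net/dev.
IFACE = "eth0"   # placeholder; substitute the interface facing the ATCA crate

with open("/proc/net/dev") as f:
    for line in f:
        if line.strip().startswith(IFACE + ":"):
            fields = line.split(":", 1)[1].split()
            rx_packets, rx_errs, rx_drop = int(fields[1]), int(fields[2]), int(fields[3])
            print(f"{IFACE}: rx_packets={rx_packets} rx_errs={rx_errs} rx_drop={rx_drop}")
            break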

Sniffer bandwidth estimates

The sniffer estimates the following bandwidths:

  • 10.0.1.103 (slot 3): 1.07Gbps 
  • 10.0.1.104 (slot 4): 1.07Gbps 
  • 10.0.1.106 (slot 6): 1.15Gbps

This seems consistent with the calculations above.

Sanity tests of the different functionalities

BSSS, BSA, and BSAS PVs were examined, and they appear to be coherent. The BLD multicast packet was also examined and appears consistent with the reverse-engineered “specifications”.

BLD invalid status detection

A camonitor was run on the status of all 3 IOCs for 8 hours. Several Invalid statuses were observed, as follows:

  • Slot 3: 6 occurrences
  • Slot 4: 14 occurrences
  • Slot 6: 9 occurrences

These results were obtained with camonitor, which may not capture every update, so the true number of occurrences is equal to or larger than these counts.
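
For future runs, a count like this could also be automated instead of tallying camonitor output by hand. The sketch below uses pyepics and counts updates carrying INVALID severity; the PV name is hypothetical and stands in for the BLD status PV that was actually monitored.

# Count INVALID-severity updates on a status PV using pyepics.
import time
from epics import PV

INVALID = 3        # EPICS alarm severity: 0=NO_ALARM, 1=MINOR, 2=MAJOR, 3=INVALID
invalid_count = 0

def on_update(pvname=None, severity=None, timestamp=None, **kw):
    global invalid_count
    if severity == INVALID:
        invalid_count += 1
        print(f"{time.ctime(timestamp)} {pvname} went INVALID (total {invalid_count})")

# "TST:SYS2:3:BLD_STATUS" is a hypothetical PV name used for illustration.
pv = PV("TST:SYS2:3:BLD_STATUS", callback=on_update, auto_monitor=True)
time.sleep(8 * 3600)   # monitor for 8 hours, as in the test above
print(f"INVALID occurrences observed: {invalid_count}")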

BLD test with different numbers of channels

BLD EDEF configurations were as follows:

caput TST:SYS2:3:EDEF4:MULT_PORT 50000
caput TST:SYS2:3:EDEF4:MULT_ADDR 239.255.4.3
caput TST:SYS2:3:EDEF3:MULT_ADDR 239.255.4.3
caput TST:SYS2:3:EDEF3:MULT_PORT 50000
caput TST:SYS2:3:SCSBR:MULT_ADDR 239.255.4.3
caput TST:SYS2:3:SCSBR:MULT_PORT 50000
caput TST:SYS2:3:SCHBR:MULT_PORT 50000
caput TST:SYS2:3:SCHBR:MULT_ADDR 134.79.216.240
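
To confirm that the multicast stream configured above actually arrives, a small listener can be used. The sketch below joins the 239.255.4.3 group on port 50000 (the MULT_ADDR/MULT_PORT set above) and prints the size of the first few packets; it assumes the host running it can receive that multicast group on its default interface.

# Join the BLD multicast group configured above and print packet sizes.
import socket
import struct

MCAST_GRP  = "239.255.4.3"   # MULT_ADDR used for EDEF3/EDEF4/SCSBR above
MCAST_PORT = 50000           # MULT_PORT used above

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", MCAST_PORT))

# Join the multicast group on the default interface.
mreq = struct.pack("4s4s", socket.inet_aton(MCAST_GRP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

for _ in range(10):                      # inspect the first 10 packets
    data, addr = sock.recvfrom(65535)
    print(f"{len(data)} bytes from {addr[0]}")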

When running all four EDEFs at the maximum possible rate (71.5 kHz), the BLD thread CPU usage is as low as 4%, while the total IOC CPU usage is 40% with BLD activated and 20% with BLD disabled, so 20% of the CPU can be attributed to BLD. While we are not sure, Kukhee and I presume that the remaining 16% (20% - 4%) is probably spent in the kernel transmitting the data over UDP.

Of course, testing at a 1 MHz rate generates a lot of overruns, and we catch the BLD thread sleeping using ps (wchan is set to sock_alloc_send_pskb). We suspect that, while the link is the bottleneck, the kernel spends a lot of time sending the UDP packets.

In the following table, all 4 EDEFs were enabled at 1 MHz.

# of channels   Estimated BLD required upstream bandwidth   Pass/Fail
3               736 Mbps                                    Pass
4               858 Mbps                                    Pass
5               981 Mbps                                    Fail

There is sufficient reason to believe that the network interface, not the CPU, is the bottleneck, since the failure occurs as the required bandwidth approaches the 1 Gbps link limit.
