...

  • use "systemctl list-unit-files" to see if tdetsim.service or kcu.service is enabled
  • associated .service files are in /usr/lib/systemd/system/
  • to see if events are flowing from the hardware to the software: "cat /proc/datadev_0" and watch "Tot Buffer Use" counter under "Read Buffers"
  • if you see the error "rogue.GeneralError: AxiStreamDma::AxiStreamDma: General Error: Failed to map dma buffers. Increase vm map limit: sysctl -w vm.max_map_count=262144": this can be caused by that parameter being too low, but it can also be caused by loading the driver with cfgMode=1 in tdetsim.service (we use cfgMode=2).  The cfgMode parameter has to do with the way memory gets cached.
    • make sure the tdetsim.service is the same as another working node
    • make sure that the appropriate service is enabled (see first point)
    • check that the driver in /usr/local/sbin/datadev.ko is the same as a working node
    • check that /etc/sysctl.conf has the correct value of vm.max_map_count
    • vm.max_map_count should be at least 4K larger than the datadev service's cfgRxCount value
  • we have also seen this error when the datadev.ko buffer sizes and counts were configured incorrectly in tdetsim.service.
  • For high rate running, many DMA buffers (cfgRxCount parameter) are needed to avoid deadtime. The number of DMA buffers is also used to size the DRPs' pebble, so slow but large DRPs like PvaDetector and EpicsArch need a comparatively small cfgRxCount value.  Neglecting to take this into account can result in attempting to allocate more memory than the machine has, resulting in the DRP throwing std::bad_alloc.
    • The number of buffers returned by dmaMapDma() is 4 greater than the cfgRxCount value.  The DRP code rounds this up to the next power of 2 (if it's not already a power of 2) and uses the result to allocate the number of pebble buffers.  To avoid wasting half the allocation, set cfgRxCount to 4 less than a power of 2.
  • DMA buffers can be small (cfgSize parameter) for most (all?) tdetsim applications.  16 KB is usually sufficient.
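The buffer arithmetic in the notes above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the DRP code itself: the function names are made up for this example, and "4K" is assumed to mean 4096.

```python
def pebble_buffer_count(cfg_rx_count: int) -> int:
    """Number of pebble buffers implied by a cfgRxCount value (illustrative)."""
    # dmaMapDma() reports 4 more buffers than cfgRxCount.
    dma_buffers = cfg_rx_count + 4
    # Round up to the next power of 2 (unchanged if already a power of 2).
    return 1 << (dma_buffers - 1).bit_length()


def min_max_map_count(cfg_rx_count: int) -> int:
    """Smallest vm.max_map_count meeting the 'at least 4K larger' guideline.

    Assumption: '4K' is taken to mean 4096.
    """
    return cfg_rx_count + 4096


if __name__ == "__main__":
    for count in (4096, 4092, 1048572):
        print(count, pebble_buffer_count(count), min_max_map_count(count))
```

With cfgRxCount=4096 the rounded-up pebble count is 8192, so nearly half of the allocation goes unused; 4092 and 1048572 land exactly on a power of 2.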

An (updated) example kcu.service for HSDs:

Code Block
drp-srcf-cmp005:~$ cat /usr/lib/systemd/system/kcu.service
[Unit]
Description=KCU1500 Device Manager
Requires=multi-user.target
After=multi-user.target

[Service]
Type=forking
ExecStartPre=-/sbin/rmmod datadev.ko
ExecStart=/sbin/insmod /usr/local/sbin/datadev.ko cfgTxCount=4 cfgRxCount=1048572 cfgSize=8192 cfgMode=0x2
ExecStop=
KillMode=process
ExecStartPost=/usr/local/sbin/kcuStatus -I
#ExecStartPost=/usr/bin/sh -c "/usr/bin/echo 4 > /proc/irq/368/smp_affinity_list"
ExecStartPost=/usr/bin/sh -c "/usr/bin/echo 4 > /proc/irq/369/smp_affinity_list"
ExecStartPost=/usr/bin/sh -c "/usr/bin/echo 5 > /proc/irq/370/smp_affinity_list"
KillMode=none
IgnoreSIGPIPE=no
StandardOutput=syslog
StandardError=inherit

[Install]
WantedBy=multi-user.target
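
When comparing a node against a working one, it can help to pull the driver parameters out of the insmod command line and diff them. The following is a minimal sketch, not an existing tool: the function names are hypothetical, and the two command lines are illustrative values typed in by hand rather than read from real nodes.

```python
import re


def parse_driver_params(exec_start: str) -> dict:
    """Extract cfg* key=value pairs from an insmod command line."""
    params = {}
    for key, value in re.findall(r"\b(cfg\w+)=(\S+)", exec_start):
        # Base 0 accepts both decimal ("8192") and hex ("0x2") notation.
        params[key] = int(value, 0)
    return params


def diff_params(working: dict, suspect: dict) -> dict:
    """Return {name: (working_value, suspect_value)} for differing parameters."""
    keys = sorted(set(working) | set(suspect))
    return {k: (working.get(k), suspect.get(k))
            for k in keys if working.get(k) != suspect.get(k)}


if __name__ == "__main__":
    # Illustrative command lines (assumed values, not taken from real nodes).
    good = ("/sbin/insmod /usr/local/sbin/datadev.ko "
            "cfgTxCount=4 cfgRxCount=1048572 cfgSize=8192 cfgMode=0x2")
    bad = ("/sbin/insmod /usr/local/sbin/datadev.ko "
           "cfgTxCount=4 cfgRxCount=4096 cfgSize=8192 cfgMode=0x1")
    print(diff_params(parse_driver_params(good), parse_driver_params(bad)))
```

Any difference in cfgMode, cfgRxCount or cfgSize is worth checking first, since those are the parameters the notes above associate with the "Failed to map dma buffers" error.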

Since 6/8/22, the DAQ code has been updated to eliminate a separate buffer allocation mechanism (including the associated wait state when the pool is empty) for the small input and result data for/from the TEB.  These buffers are now allocated using the same index with which the pebble is allocated.  Since this index is now shared with the TEB, this change has put constraints on its range across DRPs.

Interrupt Coalescing

We think this can help with errors like:

...