Miscellaneous Parameters

rmem_max

# echo 268435456 > /proc/sys/net/core/rmem_max

vm.max_map_count

For the pgp driver, this parameter needs to be increased in /etc/sysctl.conf:

# grep vm /etc/sysctl.conf 
vm.max_map_count=1000000
#
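
A value written through /proc takes effect immediately but does not survive a reboot, while /etc/sysctl.conf is applied at boot, so it can be useful to confirm what a node is actually running with. The following is a minimal sketch of such a check. The helper name check_min is hypothetical (it is not part of any DAQ tooling), and the thresholds in the usage note are simply the values quoted above.

```shell
# check_min FILE MINIMUM   (hypothetical helper, for illustration only)
# Prints OK/LOW/MISSING for the integer stored in FILE.
# Returns 0 if the value is >= MINIMUM, 1 if it is lower,
# and 2 if FILE cannot be read.
check_min() {
    cm_file=$1
    cm_minimum=$2
    if [ ! -r "$cm_file" ]; then
        echo "MISSING $cm_file"
        return 2
    fi
    cm_current=$(cat "$cm_file")
    if [ "$cm_current" -ge "$cm_minimum" ]; then
        echo "OK $cm_file=$cm_current"
    else
        echo "LOW $cm_file=$cm_current (want >= $cm_minimum)"
        return 1
    fi
}
```

On a node, the two parameters above could then be checked with check_min /proc/sys/net/core/rmem_max 268435456 and check_min /proc/sys/vm/max_map_count 1000000.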

DAQ Setup of DSS/FFB Nodes (LCLS-I)

Link here

CPU Frequency Governor

All DAQ nodes should run the CPU frequency governor in either "ondemand" or "performance" mode.

As of this writing (May 2, 2022) the daq node daq-xpp-cam02 is not running the cpu frequency governor in "performance" mode.
It appears to be running in "ondemand" mode, which "tries to use the slowest speed as much as possible, but quickly switches up or down when needed."

The epix10ka2m requires "performance" mode for correct operation.

May 23, 2022 email from Dan Damiani:

Hi,
For example in /cds/group/pcds/dist/pds/boot/daq-det-pgp01 we have the following:
for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  echo "performance" > $f
done
This is set for all the machines that are running an epix10ka2m:
daq-det-pgp01
daq-mec-pgp02
daq-mfx-pgp01
daq-xcs-pgp02
daq-xpp-pgp02
Dan
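
The loop quoted in the email above can be wrapped in a small function so that write failures are reported instead of passing silently. This is only a sketch, not the contents of the actual boot script: the function name set_governor and its optional CPU_ROOT argument are hypothetical, and CPU_ROOT exists only so the function can be tried against a scratch directory. Writing to the real sysfs files requires root.

```shell
# set_governor GOVERNOR [CPU_ROOT]   (hypothetical helper, for illustration only)
# Writes GOVERNOR to every cpu*/cpufreq/scaling_governor under CPU_ROOT
# (default: /sys/devices/system/cpu). Returns 1 if any write fails.
set_governor() {
    sg_governor=$1
    sg_root=${2:-/sys/devices/system/cpu}
    sg_status=0
    for sg_f in "$sg_root"/cpu*/cpufreq/scaling_governor; do
        # Skip the unexpanded pattern when the glob matches nothing.
        [ -e "$sg_f" ] || continue
        if ! echo "$sg_governor" > "$sg_f"; then
            echo "failed to write $sg_f" >&2
            sg_status=1
        fi
    done
    return $sg_status
}
```

Run as root, set_governor performance would have the same effect as the loop in the email.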

XPP: WRONG scaling_governor setting on daq-xpp-dss03
$ hostname
daq-xpp-cam02
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
$

===========

$ hostname
daq-xpp-dss03
$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
powersave
powersave
powersave
powersave
powersave
$

As of this writing (May 4, 2022) two MEC pgp nodes differ in their scaling_governor settings.

MEC: INCONSISTENT scaling_governor settings
-bash-4.2$ hostname
daq-mec-pgp01
-bash-4.2$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
-bash-4.2$

==============================

-bash-4.2$ hostname
daq-mec-pgp02
-bash-4.2$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
-bash-4.2$

LCLS-I dss nodes are running in 'powersave' mode
-bash-4.2$ hostname
daq-xcs-dss03
-bash-4.2$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
powersave
-bash-4.2$ 
============
-bash-4.2$ hostname
daq-mec-dss01
-bash-4.2$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
powersave
-bash-4.2$
===========
daq-xpp-dss03
-bash-4.2$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
powersave
-bash-4.2$
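
The listings above print one line per CPU, which is hard to compare across nodes with different core counts. A possible way to condense them is to count CPUs per governor, as in the sketch below. The function name governor_summary and its optional CPU_ROOT argument are hypothetical and exist only for illustration.

```shell
# governor_summary [CPU_ROOT]   (hypothetical helper, for illustration only)
# Prints one line per governor in use, followed by the number of CPUs
# using it. CPU_ROOT defaults to /sys/devices/system/cpu.
governor_summary() {
    gs_root=${1:-/sys/devices/system/cpu}
    cat "$gs_root"/cpu*/cpufreq/scaling_governor 2>/dev/null |
        sort | uniq -c | awk '{ print $2 " " $1 }'
}
```

A node with a uniform setting produces a single line; more than one line means the CPUs on that node disagree.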

See earlier notes on "CPU Freq Scaling" here.

Hyperthreading

As of this writing (May 6, 2022), hyperthreading is enabled on drp-neh-cmp001 and drp-neh-cmp007 (and perhaps elsewhere).
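
One way to check a node, sketched below, is to look at the per-CPU topology files in sysfs: when a CPU's thread_siblings_list names more than one logical CPU (for example "0,16" or "0-1"), that core is running with hyperthreading/SMT active. This is not necessarily how the nodes above were checked. The function name smt_enabled and its optional CPU_ROOT argument are hypothetical.

```shell
# smt_enabled [CPU_ROOT]   (hypothetical helper, for illustration only)
# Succeeds (returns 0) if any CPU lists more than one thread sibling,
# i.e. hyperthreading/SMT appears to be active; returns 1 otherwise.
# CPU_ROOT defaults to /sys/devices/system/cpu.
smt_enabled() {
    smt_root=${1:-/sys/devices/system/cpu}
    for smt_f in "$smt_root"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$smt_f" ] || continue
        case $(cat "$smt_f") in
            # A comma or a hyphen means the list holds several CPUs.
            *[,-]*) return 0 ;;
        esac
    done
    return 1
}
```

For example: if smt_enabled; then echo "hyperthreading on"; else echo "hyperthreading off"; fi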

Arguments for hyperthreading

Ric Claus writes:

I can say that the state of the hyperthreading flag is not important for running with trigger rates of 71 kHz and below, but I’m not ready to say that we can dispense with it for 1 MHz running.

Arguments against hyperthreading

On May 6, 2022, at 5:52 PM, Perazzo, Amedeo <perazzo@slac.stanford.edu> wrote:

I had the rhino thought when we installed drp-srcf so we decided to enable hyperthreading (the AMD one, which has a different name) on the new system and we asked Chris (O'Grady) to give it another try. After Chris' tests we decided there was no value in hyperthreading, even on AMD, and we disabled it entirely on drp-srcf. Chris, do you remember?


On May 6, 2022, Chris O'Grady wrote:

I had forgotten, but you are correct Amedeo. My results (for mpi-psana analysis) are here.

It looks like an early test suggested HT helped, but when I tried to reproduce it later on I couldn’t, so we disabled it. Many years ago I also benchmarked some quantum-chemistry code with/without HT and it didn’t help there either.


On March 26, 2015, Chris Ford wrote:

To: pcds-daq-l
Subject: SXR: hyperthreading enabled on daq-sxr-cam02 and '03?
 
Folks,
 
While testing the Andor camera I learned that some DAQ nodes were known to work better than others for this USB-based device.
Tomy writes, "One example I remember is that daq-sxr-cam01 was okay to run for hours, but daq-sxr-cam{02,03} would hang after some minutes."
Today I took a closer look, and I noticed that hyperthreading seems to be enabled on daq-sxr-cam02 and '03, where Andor fails.
Hyperthreading seems to be *disabled* on daq-sxr-cam01, where Andor runs well.
