...

  • data are available at NERSC in this directory (the equivalent of /reg/d/psdm): /global/project/projectdirs/lcls/d/psdm/.  Set the environment variable SIT_PSDM_DATA to this location so that psana can locate the data.
  • ssh to cori.nersc.gov (the equivalent of a pslogin node)
  • information on the cori batch system ("slurm") is here: https://docs.nersc.gov/jobs/
  • there are 32 cores on each cori node
  • to get to the equivalent of a "psana" node you should run an "interactive job" as described here: https://docs.nersc.gov/jobs/interactive/
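
As a minimal sketch of the first bullet above, the snippet below sets SIT_PSDM_DATA and checks that the directory is visible from the current host. The variable name PSDM_DEFAULT is made up for this illustration; the path is the one quoted in the list, and the check is only a convenience, not something psana requires.

```shell
# PSDM_DEFAULT is a hypothetical name used only in this sketch;
# the path is the NERSC data directory quoted in the list above.
PSDM_DEFAULT=/global/project/projectdirs/lcls/d/psdm

# Keep any value that is already set, otherwise fall back to the default.
export SIT_PSDM_DATA="${SIT_PSDM_DATA:-$PSDM_DEFAULT}"

# Warn (but do not fail) if the directory is not mounted on this host.
if [ -d "$SIT_PSDM_DATA" ]; then
    echo "psana data directory: $SIT_PSDM_DATA"
else
    echo "warning: $SIT_PSDM_DATA is not visible from this host" >&2
fi
```

Running this on a cori login node before starting psana should show whether the data directory is reachable; on any other machine it will most likely only print the warning.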

Example slurm batch-job script, submitted with "sbatch <scriptname>" ("srun" is the Cray equivalent of "mpirun").  These examples can be found at https://github.com/monarin/psana-nersc.git in "psana1/submit.sh".

Code Block
#!/bin/bash -l
#SBATCH --account=lcls
#SBATCH --job-name=lcls-py2-root
#SBATCH --nodes=1
#SBATCH --constraint=knl
#SBATCH --time=00:15:00
#SBATCH --image=docker:slaclcls/lcls-py2-root:latest
#SBATCH --exclusive
#SBATCH --qos=regular

# record the start time so the total elapsed time can be reported
t_start=`date +%s`
export PMI_MMAP_SYNC_WAIT_TIME=600
# run 68 MPI tasks with 4 logical CPUs each inside the shifter image
srun -n 68 -c 4 shifter ./run_nersc.sh
t_end=`date +%s`
echo PSJobCompleted TotalElapsed $((t_end-t_start)) $t_start $t_end

Where run_nersc.sh sets up the psana environment and then runs the usual psana-python command:

Code Block
#!/bin/bash
# activate psana environment
source /img/conda.local/env.sh
source activate psana_base

# set location for experiment db and calib dir
export SIT_DATA=$CONDA_PREFIX/data
export SIT_PSDM_DATA=/global/cscratch1/sd/psdatmgr/data/psdm

# prevent crash when running on one core
export HDF5_USE_FILE_LOCKING=FALSE

python mpiDatasource.py
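
The batch script above finishes by echoing a "PSJobCompleted TotalElapsed" line with the elapsed seconds and the start and end timestamps. As a rough sketch of how that line could be pulled back out of the job output, the helper below filters it with awk. The function name elapsed_from_log is hypothetical (it is not part of psana or slurm), and the sample line is made up to match the echo format.

```shell
# elapsed_from_log is a hypothetical helper, not part of psana or slurm.
# It reads job output on stdin and prints the elapsed seconds from every
# "PSJobCompleted TotalElapsed <seconds> <t_start> <t_end>" line.
elapsed_from_log() {
    awk '$1 == "PSJobCompleted" && $2 == "TotalElapsed" { print $3 }'
}

# Made-up sample line in the same format as the echo in the batch script.
echo "PSJobCompleted TotalElapsed 125 1600000000 1600000125" | elapsed_from_log
```

By default sbatch writes job output to a file named slurm-<jobid>.out in the submission directory, so the same function could be fed that file with "elapsed_from_log < slurm-<jobid>.out".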

To run a shorter "interactive" session (very useful for debugging, since you don't have to wait for a batch job to start after fixing each typo):

Code Block
(log in to a cori login node, then execute this command, which allocates a node for you to use for 1 hour)
salloc -C knl -N 1 -t 1:00:00 -q interactive -A m2859 --image=docker:slaclcls/lcls-py2-root:latest

(once that command completes)
shifter /bin/bash
source /img/conda.local/env.sh
source activate psana_base
export SIT_PSDM_DATA=/global/cscratch1/sd/psdatmgr/data/psdm

(psana_base) cpo@nid02387:~$ more ~/junk.py
from psana import *
dsource = MPIDataSource('exp=mfx11116:run=602:dir=/global/cscratch1/sd/psdatmgr/data/psdm/MFX/mfx11116/xtc:smd')
det = Detector('Jungfrau1M')
for nevt,evt in enumerate(dsource.events()):
   calib = det.calib(evt)
   if calib is None:
      print 'none'
   else:
      print calib.shape
   if nevt>5: break
(psana_base) cpo@nid02387:~$ python junk.py
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(psana_base) cpo@nid02387:~$ 
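
The "dir=" argument in the session above points at an xtc directory under SIT_PSDM_DATA. The sketch below builds that path from an experiment name so it can be checked before starting python. The helper name xtc_dir is hypothetical, and it assumes the instrument directory is the first three letters of the experiment name in upper case (mfx11116 gives MFX), which matches the path shown above but is not confirmed here as a general rule.

```shell
# xtc_dir is a hypothetical helper, not part of psana.
# Assumption: the instrument directory is the first three letters of the
# experiment name in upper case (mfx11116 -> MFX), as in the path above.
xtc_dir() {
    local exp="$1"
    local base="${SIT_PSDM_DATA:-/global/cscratch1/sd/psdatmgr/data/psdm}"
    local instr
    instr=$(printf '%s' "$exp" | cut -c1-3 | tr '[:lower:]' '[:upper:]')
    # Strip any trailing slash from the base before joining the parts.
    printf '%s/%s/%s/xtc\n' "${base%/}" "$instr" "$exp"
}

export SIT_PSDM_DATA=/global/cscratch1/sd/psdatmgr/data/psdm
xtc_dir mfx11116
```

Inside the interactive session, "ls $(xtc_dir mfx11116)" would then show whether the xtc files for that experiment are actually present.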

Getting An Account

Note: this should only be necessary for expert developers.  Send mail to pcds-ana-l if you have questions.

...