Note: this should only be necessary for expert developers. Send mail to pcds-ana-l if you have questions.
Apply for an account at this link:
http://www.nersc.gov/users/accounts/user-accounts/get-a-nersc-account/
Use the following information for the various fields:
(courtesy of Anton Barty)
Currently this is only possible for "early access users" who have accounts at NERSC.
Example Slurm batch-job script, submitted with "sbatch <scriptname>" ("srun" is the Cray equivalent of "mpirun"). These examples can be found at https://github.com/monarin/psana-nersc.git in "psana1/submit.sh" and "psana1/run_nersc.sh":
Code Block |
---|
#!/bin/bash -l
#SBATCH --account=lcls
#SBATCH --job-name=lcls-py2-root
#SBATCH --nodes=1
#SBATCH --constraint=knl
#SBATCH --time=00:15:00
#SBATCH --exclusive
#SBATCH --qos=regular
#SBATCH --image=docker:slaclcls/lcls-py2-root:latest
t_start=`date +%s`
export PMI_MMAP_SYNC_WAIT_TIME=600
srun -n 68 -c 4 shifter ./run_nersc.sh
t_end=`date +%s`
echo PSJobCompleted TotalElapsed $((t_end-t_start)) $t_start $t_end |
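As a sanity check on the srun geometry: a Cori KNL node has 68 physical cores with 4 hardware threads each (272 logical CPUs), and Slurm's -c counts logical CPUs per task, so -n tasks times -c cpus-per-task should not exceed 272 per node. A small sketch of that arithmetic (the node figures are standard Cori KNL specs, not taken from the script itself):

```python
# Sanity-check a Slurm task geometry against a Cori KNL node.
# A KNL node has 68 physical cores x 4 hyperthreads = 272 logical CPUs.
CORES_PER_NODE = 68
THREADS_PER_CORE = 4
logical_cpus = CORES_PER_NODE * THREADS_PER_CORE  # 272

def fits_on_node(ntasks, cpus_per_task):
    """True if 'srun -n ntasks -c cpus_per_task' fits on one KNL node."""
    return ntasks * cpus_per_task <= logical_cpus

print(fits_on_node(68, 4))   # one task per core, using all hyperthreads
print(fits_on_node(68, 32))  # oversubscribed: does not fit on one node
```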
Where run_nersc.sh looks like the usual psana-python command:
Code Block |
---|
#!/bin/bash
# activate psana environment
source /img/conda.local/env.sh
source activate psana_base
# set location for experiment db and calib dir
export SIT_DATA=$CONDA_PREFIX/data
export SIT_PSDM_DATA=/global/cscratch1/sd/psdatmgr/data/psdm
python psana_io_benchmark.py exp=cxig3614:run=81 |
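The argument exp=cxig3614:run=81 is psana's datasource specification: colon-separated key=value tokens, plus bare option flags such as smd (small-data mode). A small stdlib sketch of that format (parse_dsource is a hypothetical helper for illustration, not part of psana):

```python
def parse_dsource(spec):
    """Split a psana-style datasource string like 'exp=cxig3614:run=81:smd'
    into a dict; bare tokens become flags set to True.
    Hypothetical helper for illustration only, not part of psana."""
    out = {}
    for token in spec.split(':'):
        if '=' in token:
            key, _, value = token.partition('=')
            out[key] = value
        elif token:
            out[token] = True  # bare flag, e.g. 'smd' for small-data mode
    return out

print(parse_dsource('exp=cxig3614:run=81'))
# {'exp': 'cxig3614', 'run': '81'}
```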
Code Block |
---|
# prevent crash when running on one core
export HDF5_USE_FILE_LOCKING=FALSE
python mpiDatasource.py |
To run a shorter "interactive" session (very useful for debugging, since you don't have to wait for a batch job to start after fixing each typo):
Code Block |
---|
monarin@cori02: salloc -C knl -N 1 -t 1:00:00 -q interactive -A lcls --image=docker:slaclcls/lcls-py2-root:latest
salloc: Pending job allocation 32421205
salloc: job 32421205 queued and waiting for resources
salloc: job 32421205 has been allocated resources
salloc: Granted job allocation 32421205
salloc: Waiting for resource configuration
salloc: Nodes nid02346 are ready for job
monarin@nid02346: srun -n 3 shifter ./run.sh
2
0
1
monarin@nid02346: cat run.sh
#!/bin/bash
source /img/conda.local/env.sh
source activate psana_base
python test_mpi.py |
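test_mpi.py itself is not shown above; judging from the output (each of the 3 tasks printing its rank, in arbitrary order), a minimal stand-in could look like the following. This reconstruction is an assumption, not the file from the repository; it reads the rank from Slurm's SLURM_PROCID variable instead of using mpi4py, so it also runs outside an MPI launch:

```python
import os

# Each srun task is given its rank in the SLURM_PROCID environment
# variable; default to 0 when running outside Slurm.
# (Stand-in for an mpi4py rank lookup; assumption, not the repo's file.)
rank = int(os.environ.get('SLURM_PROCID', '0'))
print(rank)
```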
And another approach that gets you a prompt "inside" the shifter container's conda environment:
Code Block |
---|
(log in to a Cori login node, then execute this command, which allocates a node for you to use for 1 hour)
salloc -C knl -N 1 -t 1:00:00 -q interactive -A lcls --image=docker:slaclcls/lcls-py2-root:latest
(once that command completes)
shifter /bin/bash (get a shell in the shifter image)
source /img/conda.local/env.sh (setup conda)
source activate psana_base (activate the appropriate conda environment)
export SIT_PSDM_DATA=/global/cscratch1/sd/psdatmgr/data/psdm
(psana_base) cpo@nid02387:~$ more ~/junk.py
from psana import *
dsource = MPIDataSource('exp=mfx11116:run=602:dir=/global/cscratch1/sd/psdatmgr/data/psdm/MFX/mfx11116/xtc:smd')
det = Detector('Jungfrau1M')
for nevt,evt in enumerate(dsource.events()):
    calib = det.calib(evt)
    if calib is None:
        print 'none'
    else:
        print calib.shape
    if nevt>5: break
(psana_base) cpo@nid02387:~$ python junk.py
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(2, 512, 1024)
(psana_base) cpo@nid02387:~$
|
...